{"id":46736,"date":"2022-11-30T00:00:00","date_gmt":"2022-11-30T08:00:00","guid":{"rendered":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/"},"modified":"2025-11-13T12:56:21","modified_gmt":"2025-11-13T20:56:21","slug":"studying-and-forecasting-web-traffic-using-python-and-griddb","status":"publish","type":"post","link":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/","title":{"rendered":"Studying and Forecasting Web Traffic using Python and GridDB"},"content":{"rendered":"<p>In the last few years, time-series databases have been the fastest-growing database category. Both established and emerging technology sectors have been producing an increasing amount of time-series data.<\/p>\n<p>The number of sessions in a given period of time is known as web traffic, and it varies greatly depending on the time of day, day of the week, and other factors. The amount of web traffic a platform can handle is determined by the size of the servers that host the platform.<\/p>\n<p>Based on historical web traffic data, you can dynamically allocate additional servers. That brings us to the data science challenge: analysing and forecasting the volume of sessions, or web traffic, based on past data.<\/p>\n<p>The outline of the tutorial is as follows:<\/p>\n<ol>\n<li>Dataset overview<\/li>\n<li>Importing required libraries<\/li>\n<li>Loading the dataset<\/li>\n<li>Analysing with data visualization<\/li>\n<li>Forecasting<\/li>\n<li>Conclusion<\/li>\n<\/ol>\n<h2>Prerequisites and Environment setup<\/h2>\n<p>The full Jupyter notebook can be found in our GitHub repo:<\/p>\n<p><code>$ git clone --branch web-forecasting https:\/\/github.com\/griddbnet\/Blogs.git<\/code><\/p>\n<p>This tutorial is carried out in Anaconda Navigator (Python version \u2013 3.8.5) on Windows Operating System. 
The following packages need to be installed before you continue with the tutorial \u2013<\/p>\n<ol>\n<li>\n<p>Pandas<\/p>\n<\/li>\n<li>\n<p>NumPy<\/p>\n<\/li>\n<li>\n<p>re<\/p>\n<\/li>\n<li>\n<p>Matplotlib<\/p>\n<\/li>\n<li>\n<p>Seaborn<\/p>\n<\/li>\n<li>\n<p>griddb_python<\/p>\n<\/li>\n<li>\n<p>fbprophet<\/p>\n<\/li>\n<\/ol>\n<p>You can install these packages in Conda\u2019s virtual environment using <code>conda install package-name<\/code>. In case you are using Python directly via the terminal\/command prompt, <code>pip install package-name<\/code> will do the job.<\/p>\n<h3>GridDB Installation<\/h3>\n<p>While loading the dataset, this tutorial will cover two methods: using GridDB as well as using pandas. To access GridDB using Python, the following packages also need to be installed beforehand:<\/p>\n<ol>\n<li><a href=\"https:\/\/github.com\/griddb\/c_client\">GridDB C-client<\/a><\/li>\n<li>SWIG (Simplified Wrapper and Interface Generator)<\/li>\n<li><a href=\"https:\/\/github.com\/griddb\/python_client\">GridDB Python Client<\/a><\/li>\n<\/ol>\n<h2>1&#46; Dataset Overview<\/h2>\n<p>The dataset consists of approximately 145k time series. 
Each of these time series represents the number of daily views of a different Wikipedia article, from July 1st, 2015 up until December 31st, 2016.<\/p>\n<p>https:\/\/www.kaggle.com\/competitions\/web-traffic-time-series-forecasting\/data<\/p>\n<h2>2&#46; Importing Required Libraries<\/h2>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">%matplotlib inline\nimport pandas as pd\nimport numpy as np\nimport re\nimport seaborn as sns\nfrom fbprophet import Prophet\nimport griddb_python as griddb\n\nimport matplotlib.pyplot as plt\nplt.style.use('fivethirtyeight')\n\nimport warnings\nwarnings.filterwarnings(\"ignore\")<\/code><\/pre>\n<\/div>\n<h2>3&#46; Loading the Dataset<\/h2>\n<p>You can download the dataset for yourself here: https:\/\/www.kaggle.com\/competitions\/web-traffic-time-series-forecasting\/data?select=train_1.csv.zip<\/p>\n<p>Let\u2019s proceed and load the dataset into our notebook.<\/p>\n<h3>3&#46;a Using GridDB<\/h3>\n<p>Toshiba GridDB\u2122 is a highly scalable NoSQL database best suited for IoT and Big Data. GridDB is built around offering a versatile data store that is optimized for IoT, provides high scalability, is tuned for high performance, and ensures high reliability.<\/p>\n<p>For large amounts of data, a CSV file can be cumbersome. GridDB serves as a perfect alternative, as it is an open-source, highly scalable, in-memory NoSQL database that makes it easy to store large volumes of data. 
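Before any query can run, a connection to the cluster is needed. The sketch below is not part of the original tutorial; the host, port, cluster name, and credentials are placeholders you must replace with your own cluster's settings, and it assumes a running GridDB server with the Python client installed:

```python
import griddb_python as griddb

# Placeholder connection settings -- replace with your cluster's values.
factory = griddb.StoreFactory.get_instance()
gridstore = factory.get_store(
    host="239.0.0.1",            # notification address (placeholder)
    port=31999,                  # notification port (placeholder)
    cluster_name="defaultCluster",
    username="admin",
    password="admin"
)

# Fetch the container holding the web-traffic rows by name.
container = gridstore.get_container("train_1")
```

The `cont` variable passed to `read_sql_query` below refers to a connection handle of this kind.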
If you are new to GridDB, a tutorial on <a href=\"https:\/\/griddb.net\/en\/blog\/using-pandas-dataframes-with-griddb\/\">reading and writing to GridDB<\/a> can be useful.<\/p>\n<p>Assuming that you have already set up your database, we will now write the SQL query in Python to load our dataset.<\/p>\n<p>The <code>read_sql_query<\/code> function offered by the pandas library converts the fetched data into a pandas dataframe, making it easy to work with.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">sql_statement = ('SELECT * FROM train_1')\ndf1 = pd.read_sql_query(sql_statement, cont)<\/code><\/pre>\n<\/div>\n<p>Note that the <code>cont<\/code> variable has the container information where our data is stored. Replace <code>train_1<\/code> with the name of your container. More info can be found in the tutorial on <a href=\"https:\/\/griddb.net\/en\/blog\/using-pandas-dataframes-with-griddb\/\">reading and writing to GridDB<\/a>.<\/p>\n<p>When it comes to IoT and Big Data use cases, GridDB clearly stands out among other databases in the Relational and NoSQL space. Overall, GridDB offers multiple reliability features for mission-critical applications that require high availability and data retention.<\/p>\n<h3>3&#46;b Using pandas read_csv<\/h3>\n<p>We can also use Pandas&#8217; <code>read_csv<\/code> function to load our data. 
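To make the wide layout of this dataset concrete (one row per page, one column per date), here is a toy illustration of what `read_csv` returns, built from an in-memory CSV string; the page names and counts are made up, not rows from the real dataset:

```python
import io
import pandas as pd

# A tiny stand-in for train_1.csv: a "Page" column plus one column per date.
csv_text = (
    "Page,2015-07-01,2015-07-02,2015-07-03\n"
    "Foo_en.wikipedia.org_all-access_spider,18,11,5\n"
    "Bar_ja.wikipedia.org_all-access_spider,7,,9\n"   # missing value -> NaN
)
toy = pd.read_csv(io.StringIO(csv_text))

print(toy.shape)                       # (2, 4): two pages, Page column + three dates
print(toy["2015-07-02"].isna().sum())  # 1 missing value, read in as NaN
```

Days with no recorded views arrive as `NaN` rather than zero, which is why the real dataframe above is full of `NaN` entries for articles created after July 2015.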
Both of the above methods will lead to the same output as the data is loaded in the form of a pandas dataframe using either of the methods.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df = pd.read_csv('train_1.csv', parse_dates=True) <\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df.head()<\/code><\/pre>\n<\/div>\n<div  style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Page\n        <\/th>\n<th>\n          2015-07-01\n        <\/th>\n<th>\n          2015-07-02\n        <\/th>\n<th>\n          2015-07-03\n        <\/th>\n<th>\n          2015-07-04\n        <\/th>\n<th>\n          2015-07-05\n        <\/th>\n<th>\n          2015-07-06\n        <\/th>\n<th>\n          2015-07-07\n        <\/th>\n<th>\n          2015-07-08\n        <\/th>\n<th>\n          2015-07-09\n        <\/th>\n<th>\n          &#8230;\n        <\/th>\n<th>\n          2016-12-22\n        <\/th>\n<th>\n          2016-12-23\n        <\/th>\n<th>\n          2016-12-24\n        <\/th>\n<th>\n          2016-12-25\n        <\/th>\n<th>\n          2016-12-26\n        <\/th>\n<th>\n          2016-12-27\n        <\/th>\n<th>\n          2016-12-28\n        <\/th>\n<th>\n          2016-12-29\n        <\/th>\n<th>\n          2016-12-30\n        <\/th>\n<th>\n          2016-12-31\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          2NE1_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          18.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          5.0\n        <\/td>\n<td>\n          13.0\n 
       <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          22.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          32.0\n        <\/td>\n<td>\n          63.0\n        <\/td>\n<td>\n          15.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          20.0\n        <\/td>\n<td>\n          22.0\n        <\/td>\n<td>\n          19.0\n        <\/td>\n<td>\n          18.0\n        <\/td>\n<td>\n          20.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          2PM_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          15.0\n        <\/td>\n<td>\n          18.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          22.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          17.0\n        <\/td>\n<td>\n          42.0\n        <\/td>\n<td>\n          28.0\n        <\/td>\n<td>\n          15.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          30.0\n        <\/td>\n<td>\n          52.0\n        <\/td>\n<td>\n          45.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          20.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          3C_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          0.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          0.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          0.0\n        <\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          &#8230;\n        
<\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          7.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          6.0\n        <\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          17.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          4minute_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          35.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          94.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          32.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          27.0\n        <\/td>\n<td>\n          16.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          17.0\n        <\/td>\n<td>\n          19.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          52_Hz_I_Love_You_zh.wikipedia.org_all-access_s&#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          48.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          25.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          11.0\n        
<\/td>\n<td>\n          27.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          36.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\n    5 rows \u00d7 551 columns\n  <\/p>\n<\/div>\n<h2>4&#46; Analysing with Data Visualization<\/h2>\n<p>Is Traffic Influenced by Page Language?<\/p>\n<p>One thing that might be interesting to examine is how the various languages used in Wikipedia affect the dataset. I&#8217;ll search for the language code in the Wikipedia URL using a straightforward regular expression.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">train_1 = df<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">def get_language(page):\n    res = re.search('[a-z][a-z].wikipedia.org',page)\n    if res:\n        return res[0][0:2]\n    return 'na'\n\ntrain_1['lang'] = train_1.Page.map(get_language)\n\nfrom collections import Counter\n\nprint(Counter(train_1.lang))<\/code><\/pre>\n<\/div>\n<pre><code>Counter({'en': 24108, 'ja': 20431, 'de': 18547, 'na': 17855, 'fr': 17802, 'zh': 17229, 'ru': 15022, 'es': 14069})\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">lang_sets = {}\nfor key in ['en', 'ja', 'de', 'na', 'fr', 'zh', 'ru', 'es']:\n    lang_sets[key] = train_1[train_1.lang==key].iloc[:,0:-1]\n\nsums = {}\nfor key in lang_sets:\n    sums[key] = lang_sets[key].iloc[:,1:].sum(axis=0) \/ lang_sets[key].shape[0]<\/code><\/pre>\n<\/div>\n<p>So then how does the total number of views change over time? 
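As a quick sanity check before plotting, the `get_language` helper defined above behaves as expected on a few hand-made page names (these examples are made up, not rows from the dataset):

```python
import re

# Same helper as above: pull the two-letter language code out of the page name,
# falling back to 'na' for non-Wikipedia projects such as commons.wikimedia.org.
def get_language(page):
    res = re.search('[a-z][a-z].wikipedia.org', page)
    if res:
        return res[0][0:2]
    return 'na'

print(get_language('2NE1_zh.wikipedia.org_all-access_spider'))        # zh
print(get_language('Some_Page_en.wikipedia.org_desktop_all-agents'))  # en
print(get_language('Foo_commons.wikimedia.org_all-access_spider'))    # na
```

This explains the 'na' bucket in the Counter output: it collects commons.wikimedia.org and www.mediawiki.org pages, which have no two-letter Wikipedia language code.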
I&#8217;ll plot all the different sets on the same plot.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">days = [r for r in range(sums['en'].shape[0])]\n\nfig = plt.figure(1,figsize=[13,8])\nplt.ylabel('Views per Page')\nplt.xlabel('Day')\nplt.title('Pages in Different Languages')\nlabels={'en':'English','ja':'Japanese','de':'German',\n        'na':'Media','fr':'French','zh':'Chinese',\n        'ru':'Russian','es':'Spanish'\n       }\n\nfor key in sums:\n    plt.plot(days,sums[key],label = labels[key] )\n    \nplt.legend()\nplt.show()<\/code><\/pre>\n<\/div>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_27_0.png\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_27_0.png\" alt=\"\" width=\"708\" height=\"420\" class=\"aligncenter size-full wp-image-28931\" \/><\/a><\/p>\n<p>English shows a much higher number of views per page, as might be expected since Wikipedia is a US-based site.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df1 = df.T\ndf1 = df1.reset_index()<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df1.head()<\/code><\/pre>\n<\/div>\n<div style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Date\n        <\/th>\n<th>\n          2NE1_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          2PM_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          3C_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          4minute_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n  
        52_Hz_I_Love_You_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          5566_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          91Days_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          A&#8217;N&#8217;D_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          AKB48_zh.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          &#8230;\n        <\/th>\n<th>\n          Drake_(m\u00fasico)_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Skam_(serie_de_televisi\u00f3n)_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Legi\u00f3n_(serie_de_televisi\u00f3n)_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Doble_tentaci\u00f3n_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Mi_adorable_maldici\u00f3n_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Underworld_(serie_de_pel\u00edculas)_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Resident_Evil:_Cap\u00edtulo_Final_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Enamor\u00e1ndome_de_Ram\u00f3n_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Hasta_el_\u00faltimo_hombre_es.wikipedia.org_all-access_spider\n        <\/th>\n<th>\n          Francisco_el_matem\u00e1tico_(serie_de_televisi\u00f3n_de_2017)_es.wikipedia.org_all-access_spider\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          2015-07-01\n        <\/td>\n<td>\n          18.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          35.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          12.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          118.0\n        <\/td>\n<td>\n          5.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        
<\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          2015-07-02\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          0.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          7.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          23.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          2015-07-03\n        <\/td>\n<td>\n          5.0\n        <\/td>\n<td>\n          15.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          30.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        
<\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          2015-07-04\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          18.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          94.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          5.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          24.0\n        <\/td>\n<td>\n          12.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          5\n        <\/th>\n<td>\n          2015-07-05\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          0.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          20.0\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          29.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\n    5 rows \u00d7 145064 columns\n  <\/p>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df1=df1[:550]<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">column_header = df1.iloc[0,:].values\n\ndf1.columns = 
column_header\n    \ndf1 = df1.drop(0, axis = 0)\ndf1 = df1.rename(columns = {\"Page\" : \"Date\"})<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df1[\"Date\"] = pd.to_datetime(df1[\"Date\"], format='%Y-%m-%d')\ndf1 = df1.set_index(\"Date\")<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\"># Finding number of access types and agents\naccess_types = []\nagents = []\nfor column in df1.columns:\n    access_type = column.split(\"_\")[-2]\n    agent = column.split(\"_\")[-1]\n    access_types.append(access_type)\n    agents.append(agent)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\"># Counting access types\nfrom collections import Counter\naccess_dict = Counter(access_types)\naccess_dict<\/code><\/pre>\n<\/div>\n<pre><code>Counter({'all-access': 74315, 'desktop': 34809, 'mobile-web': 35939})\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">access_df = pd.DataFrame({\"Access type\" : access_dict.keys(),\n                          \"Number of columns\" : access_dict.values()})\naccess_df<\/code><\/pre>\n<\/div>\n<div  style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Access type\n        <\/th>\n<th>\n          Number of columns\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          all-access\n        <\/td>\n<td>\n          74315\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          desktop\n        <\/td>\n<td>\n          34809\n        <\/td>\n<\/tr>\n<tr>\n<th>\n 
         2\n        <\/th>\n<td>\n          mobile-web\n        <\/td>\n<td>\n          35939\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">agents_dict = Counter(agents)\nagents_dict<\/code><\/pre>\n<\/div>\n<pre><code>Counter({'spider': 34913, 'all-agents': 110150})\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">agents_df = pd.DataFrame({\"Agent\" : agents_dict.keys(),\n                          \"Number of columns\" : agents_dict.values()})\nagents_df<\/code><\/pre>\n<\/div>\n<div  style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Agent\n        <\/th>\n<th>\n          Number of columns\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          spider\n        <\/td>\n<td>\n          34913\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          all-agents\n        <\/td>\n<td>\n          110150\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">df1.columns[86543].split(\"_\")[-3:]\n\"_\".join(df1.columns[86543].split(\"_\")[-3:])\n\nprojects = []\nfor column in df1.columns:\n    project = column.split(\"_\")[-3] \n    projects.append(project)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">project_dict = Counter(projects)\nproject_df = pd.DataFrame({\"Project\" : project_dict.keys(),\n                           \"Number of columns\" : project_dict.values()})\n\nproject_df<\/code><\/pre>\n<\/div>\n<div  
style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Project\n        <\/th>\n<th>\n          Number of columns\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          zh.wikipedia.org\n        <\/td>\n<td>\n          17229\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          fr.wikipedia.org\n        <\/td>\n<td>\n          17802\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          en.wikipedia.org\n        <\/td>\n<td>\n          24108\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          commons.wikimedia.org\n        <\/td>\n<td>\n          10555\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          ru.wikipedia.org\n        <\/td>\n<td>\n          15022\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          5\n        <\/th>\n<td>\n          www.mediawiki.org\n        <\/td>\n<td>\n          7300\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          6\n        <\/th>\n<td>\n          de.wikipedia.org\n        <\/td>\n<td>\n          18547\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          7\n        <\/th>\n<td>\n          ja.wikipedia.org\n        <\/td>\n<td>\n          20431\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          8\n        <\/th>\n<td>\n          es.wikipedia.org\n        <\/td>\n<td>\n          14069\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">def extract_average_views(project):\n    required_column_names = [column for column in df1.columns if project in 
column]\n    average_views = df1[required_column_names].sum().mean()\n    return average_views\n\naverage_views = []\nfor project in project_df[\"Project\"]:\n    average_views.append(extract_average_views(project))<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">project_df[\"Average views\"] = average_views\nproject_df['Average views'] = project_df['Average views'].astype('int64')\n<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">project_df<\/code><\/pre>\n<\/div>\n<div  style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Project\n        <\/th>\n<th>\n          Number of columns\n        <\/th>\n<th>\n          Average views\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          zh.wikipedia.org\n        <\/td>\n<td>\n          17229\n        <\/td>\n<td>\n          184107\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          fr.wikipedia.org\n        <\/td>\n<td>\n          17802\n        <\/td>\n<td>\n          358264\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          en.wikipedia.org\n        <\/td>\n<td>\n          24108\n        <\/td>\n<td>\n          2436898\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          commons.wikimedia.org\n        <\/td>\n<td>\n          10555\n        <\/td>\n<td>\n          99429\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          ru.wikipedia.org\n        <\/td>\n<td>\n          15022\n        <\/td>\n<td>\n     
     532443\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          5\n        <\/th>\n<td>\n          www.mediawiki.org\n        <\/td>\n<td>\n          7300\n        <\/td>\n<td>\n          31411\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          6\n        <\/th>\n<td>\n          de.wikipedia.org\n        <\/td>\n<td>\n          18547\n        <\/td>\n<td>\n          477813\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          7\n        <\/th>\n<td>\n          ja.wikipedia.org\n        <\/td>\n<td>\n          20431\n        <\/td>\n<td>\n          419523\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          8\n        <\/th>\n<td>\n          es.wikipedia.org\n        <\/td>\n<td>\n          14069\n        <\/td>\n<td>\n          674546\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">project_df_sorted = project_df.sort_values(by = \"Average views\", ascending = False)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">plt.figure(figsize = (10,6))\nsns.barplot(x = project_df_sorted[\"Project\"], y = project_df_sorted[\"Average views\"])\nplt.xticks(rotation = \"vertical\")\nplt.title(\"Average views per each project\")\nplt.show()<\/code><\/pre>\n<\/div>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_45_0.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_45_0.png\" alt=\"\" width=\"708\" height=\"420\" class=\"aligncenter size-full wp-image-28931\" \/><\/a><\/p>\n<p>Popular pages in &#8220;en.wikipedia.org&#8221;<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">en_wikipedia_org_columns = [column for column in df1.columns if \"en.wikipedia.org\" in column]\n\ntop_pages_en = df1[en_wikipedia_org_columns].mean().sort_values(ascending = False)[0:5]\ndf1[top_pages_en.index].plot(figsize = (16,9))<\/code><\/pre>\n<\/div>\n<pre><code>&lt;AxesSubplot:xlabel='Date'&gt;\n<\/code><\/pre>\n<p><a 
href=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_47_1.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_47_1.png\" alt=\"\" width=\"708\" height=\"420\" class=\"aligncenter size-full wp-image-28931\" \/><\/a><\/p>\n<h2>5&#46; Forecasting<\/h2>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">train = df<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">train<\/code><\/pre>\n<\/div>\n<div  style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Page\n        <\/th>\n<th>\n          2015-07-01\n        <\/th>\n<th>\n          2015-07-02\n        <\/th>\n<th>\n          2015-07-03\n        <\/th>\n<th>\n          2015-07-04\n        <\/th>\n<th>\n          2015-07-05\n        <\/th>\n<th>\n          2015-07-06\n        <\/th>\n<th>\n          2015-07-07\n        <\/th>\n<th>\n          2015-07-08\n        <\/th>\n<th>\n          2015-07-09\n        <\/th>\n<th>\n          &#8230;\n        <\/th>\n<th>\n          2016-12-23\n        <\/th>\n<th>\n          2016-12-24\n        <\/th>\n<th>\n          2016-12-25\n        <\/th>\n<th>\n          2016-12-26\n        <\/th>\n<th>\n          2016-12-27\n        <\/th>\n<th>\n          2016-12-28\n        <\/th>\n<th>\n          2016-12-29\n        <\/th>\n<th>\n          2016-12-30\n        <\/th>\n<th>\n          2016-12-31\n        <\/th>\n<th>\n          lang\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          2NE1_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          18.0\n 
       <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          5.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          22.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          63.0\n        <\/td>\n<td>\n          15.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          20.0\n        <\/td>\n<td>\n          22.0\n        <\/td>\n<td>\n          19.0\n        <\/td>\n<td>\n          18.0\n        <\/td>\n<td>\n          20.0\n        <\/td>\n<td>\n          zh\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          2PM_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          15.0\n        <\/td>\n<td>\n          18.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          22.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          42.0\n        <\/td>\n<td>\n          28.0\n        <\/td>\n<td>\n          15.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          30.0\n        <\/td>\n<td>\n          52.0\n        <\/td>\n<td>\n          45.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          20.0\n        <\/td>\n<td>\n          zh\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          3C_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          0.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          0.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          0.0\n        
<\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          1.0\n        <\/td>\n<td>\n          7.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          6.0\n        <\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          17.0\n        <\/td>\n<td>\n          zh\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          4minute_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          35.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          94.0\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          14.0\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          26.0\n        <\/td>\n<td>\n          27.0\n        <\/td>\n<td>\n          16.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          17.0\n        <\/td>\n<td>\n          19.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          zh\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          52_Hz_I_Love_You_zh.wikipedia.org_all-access_s&#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          9.0\n        <\/td>\n<td>\n          25.0\n        <\/td>\n<td>\n          13.0\n        
<\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<td>\n          27.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          36.0\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          zh\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          &#8230;\n        <\/th>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          145058\n        <\/th>\n<td>\n          Underworld_(serie_de_pel\u00edculas)_es.wikipedia.o&#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          12.0\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<td>\n          3.0\n        <\/td>\n<td>\n          5.0\n        
<\/td>\n<td>\n          10.0\n        <\/td>\n<td>\n          es\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          145059\n        <\/th>\n<td>\n          Resident_Evil:_Cap\u00edtulo_Final_es.wikipedia.org&#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          es\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          145060\n        <\/th>\n<td>\n          Enamor\u00e1ndome_de_Ram\u00f3n_es.wikipedia.org_all-acc&#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          es\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          145061\n        <\/th>\n<td>\n          Hasta_el_\u00faltimo_hombre_es.wikipedia.org_all-ac&#8230;\n     
   <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          es\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          145062\n        <\/th>\n<td>\n          Francisco_el_matem\u00e1tico_(serie_de_televisi\u00f3n_d&#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<td>\n          es\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\n    145063 rows \u00d7 552 columns\n  <\/p>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">train=pd.melt(df[list(df.columns[-50:])+['Page']], id_vars='Page', var_name='date', value_name='Visits')<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code 
class=\"language-python\">list1 = ['lang']\ntrain = train[train.date.isin(list1) == False]<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">train<\/code><\/pre>\n<\/div>\n<div  style=\"overflow-x: scroll;overflow-y: hidden;\">\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Page\n        <\/th>\n<th>\n          date\n        <\/th>\n<th>\n          Visits\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          2NE1_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          2016-11-13\n        <\/td>\n<td>\n          8.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          2PM_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          2016-11-13\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          3C_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          2016-11-13\n        <\/td>\n<td>\n          4.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          4minute_zh.wikipedia.org_all-access_spider\n        <\/td>\n<td>\n          2016-11-13\n        <\/td>\n<td>\n          13.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          52_Hz_I_Love_You_zh.wikipedia.org_all-access_s&#8230;\n        <\/td>\n<td>\n          2016-11-13\n        <\/td>\n<td>\n          11.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          &#8230;\n        <\/th>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<\/tr>\n<tr>\n<th>\n 
         7108082\n        <\/th>\n<td>\n          Underworld_(serie_de_pel\u00edculas)_es.wikipedia.o&#8230;\n        <\/td>\n<td>\n          2016-12-31\n        <\/td>\n<td>\n          10.0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          7108083\n        <\/th>\n<td>\n          Resident_Evil:_Cap\u00edtulo_Final_es.wikipedia.org&#8230;\n        <\/td>\n<td>\n          2016-12-31\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          7108084\n        <\/th>\n<td>\n          Enamor\u00e1ndome_de_Ram\u00f3n_es.wikipedia.org_all-acc&#8230;\n        <\/td>\n<td>\n          2016-12-31\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          7108085\n        <\/th>\n<td>\n          Hasta_el_\u00faltimo_hombre_es.wikipedia.org_all-ac&#8230;\n        <\/td>\n<td>\n          2016-12-31\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          7108086\n        <\/th>\n<td>\n          Francisco_el_matem\u00e1tico_(serie_de_televisi\u00f3n_d&#8230;\n        <\/td>\n<td>\n          2016-12-31\n        <\/td>\n<td>\n          NaN\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\n    7108087 rows \u00d7 3 columns\n  <\/p>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">train['date'] = train['date'].astype('datetime64[ns]')\ntrain['weekend'] = ((train.date.dt.dayofweek) \/\/ 5 == 1).astype(float)\nmedian = pd.DataFrame(train.groupby(['Page'])['Visits'].median())\nmedian.columns = ['median']\nmean = pd.DataFrame(train.groupby(['Page'])['Visits'].mean())\nmean.columns = ['mean']\n\ntrain = train.set_index('Page').join(mean).join(median)\ntrain.reset_index(drop=False,inplace=True)\ntrain['weekday'] = train['date'].apply(lambda x: x.weekday())\n\ntrain['year']=train.date.dt.year \ntrain['month']=train.date.dt.month \ntrain['day']=train.date.dt.day<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">mean_g = 
train[['Page','date','Visits']].groupby(['date'])['Visits'].mean()\n\nmeans = pd.DataFrame(mean_g).reset_index(drop=False)\nmeans['weekday'] = means['date'].apply(lambda x: x.weekday())\n\nmeans['Date_str'] = means['date'].apply(lambda x: str(x))\n\n# Split the date string on the hyphen and store the parts in new year, month and day columns\nmeans[['year','month','day']] = pd.DataFrame(means['Date_str'].str.split('-',2).tolist(), columns = ['year','month','day'])\n\n# The day column still carries a time component (e.g. '31 00:00:00'), so split on the space and keep only the day part\ndate = pd.DataFrame(means['day'].str.split(' ',2).tolist(), columns = ['day','other'])\nmeans['day'] = date['day']<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">means.drop('Date_str', axis=1, inplace=True)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-python\">import seaborn as sns\nfrom fbprophet import Prophet\n\nsns.set(font_scale=1)\n\ndate_index = means[['date','Visits']]\ndate_index = date_index.set_index('date')\n\n# Prophet expects a two-column dataframe named ds (datestamp) and y (value)\nprophet = date_index.copy()\nprophet.reset_index(drop=False,inplace=True)\nprophet.columns = ['ds','y']\n\nm = Prophet()\nm.fit(prophet)\n\n# Forecast 30 days beyond the last observed date\nfuture = m.make_future_dataframe(periods=30,freq='D')\nforecast = m.predict(future)\n\nfig = m.plot(forecast)<\/code><\/pre>\n<\/div>\n<pre><code>INFO:fbprophet:Disabling yearly seasonality. Run prophet with yearly_seasonality=True to override this.\nINFO:fbprophet:Disabling daily seasonality. 
Run prophet with daily_seasonality=True to override this.\n<\/code><\/pre>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_57_1.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2022\/11\/output_57_1.png\" alt=\"\" width=\"708\" height=\"420\" class=\"aligncenter size-full wp-image-28931\" srcset=\"\/wp-content\/uploads\/2022\/11\/output_57_1.png 708w, \/wp-content\/uploads\/2022\/11\/output_57_1-300x178.png 300w, \/wp-content\/uploads\/2022\/11\/output_57_1-600x356.png 600w\" sizes=\"(max-width: 708px) 100vw, 708px\" \/><\/a><\/p>\n<h2>6&#46; Conclusion<\/h2>\n<p>In this tutorial, we analysed and forecasted web traffic using Python and GridDB, examining two ways to import the data: (1) GridDB and (2) Pandas. For large datasets, GridDB is an excellent option for loading data into your notebook, as it is open source and highly scalable.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the last few years, the time-series database category has experienced the fastest growth. Both established and emerging technology sectors have been producing an increasing amount of time-series data. 
The quantity of sessions in a given period of time is known as web traffic, and it varies greatly depending on the time of day, day [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":27138,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[121],"tags":[],"class_list":["post-46736","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Studying and Forecasting Web Traffic using Python and GridDB | GridDB: Open Source Time Series Database for IoT<\/title>\n<meta name=\"description\" content=\"In the last few years, the time-series database category has experienced the fastest growth. Both established and emerging technology sectors have been\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Studying and Forecasting Web Traffic using Python and GridDB | GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"og:description\" content=\"In the last few years, the time-series database category has experienced the fastest growth. 
Both established and emerging technology sectors have been\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/\" \/>\n<meta property=\"og:site_name\" content=\"GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/griddbcommunity\/\" \/>\n<meta property=\"article:published_time\" content=\"2022-11-30T08:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T20:56:21+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.griddb.net\/wp-content\/uploads\/2020\/12\/output_47_1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1176\" \/>\n\t<meta property=\"og:image:height\" content=\"464\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"griddb-admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:site\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"griddb-admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/\"},\"author\":{\"name\":\"griddb-admin\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\"},\"headline\":\"Studying and Forecasting Web Traffic using Python and GridDB\",\"datePublished\":\"2022-11-30T08:00:00+00:00\",\"dateModified\":\"2025-11-13T20:56:21+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/\"},\"wordCount\":1496,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2020\/12\/output_47_1.png\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/\",\"url\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/\",\"name\":\"Studying and Forecasting Web Traffic using Python and GridDB | GridDB: Open Source Time Series Database for 
IoT\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2020\/12\/output_47_1.png\",\"datePublished\":\"2022-11-30T08:00:00+00:00\",\"dateModified\":\"2025-11-13T20:56:21+00:00\",\"description\":\"In the last few years, the time-series database category has experienced the fastest growth. Both established and emerging technology sectors have been\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage\",\"url\":\"\/wp-content\/uploads\/2020\/12\/output_47_1.png\",\"contentUrl\":\"\/wp-content\/uploads\/2020\/12\/output_47_1.png\",\"width\":1176,\"height\":464},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/griddb.net\/en\/#website\",\"url\":\"https:\/\/griddb.net\/en\/\",\"name\":\"GridDB: Open Source Time Series Database for IoT\",\"description\":\"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL\",\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/griddb.net\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/griddb.net\/en\/#organization\",\"name\":\"Fixstars\",\"url\":\"https:\/\/griddb.net\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"contentUrl\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"width\":200,\"height\":83,\"caption\":\"Fixstars\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/griddbcommunity\/\",\"https:\/\/x.com\/GridDBCommunity\",\"https:\/\/www.linkedin.com\/company\/griddb-by-toshiba\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\",\"name\":\"griddb-admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"caption\":\"griddb-admin\"},\"url\":\"https:\/\/www.griddb.net\/en\/author\/griddb-admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Studying and Forecasting Web Traffic using Python and GridDB | GridDB: Open Source Time Series Database for IoT","description":"In the last few years, the time-series database category has experienced the fastest growth. Both established and emerging technology sectors have been","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/","og_locale":"en_US","og_type":"article","og_title":"Studying and Forecasting Web Traffic using Python and GridDB | GridDB: Open Source Time Series Database for IoT","og_description":"In the last few years, the time-series database category has experienced the fastest growth. Both established and emerging technology sectors have been","og_url":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/","og_site_name":"GridDB: Open Source Time Series Database for IoT","article_publisher":"https:\/\/www.facebook.com\/griddbcommunity\/","article_published_time":"2022-11-30T08:00:00+00:00","article_modified_time":"2025-11-13T20:56:21+00:00","og_image":[{"width":1176,"height":464,"url":"https:\/\/www.griddb.net\/wp-content\/uploads\/2020\/12\/output_47_1.png","type":"image\/png"}],"author":"griddb-admin","twitter_card":"summary_large_image","twitter_creator":"@GridDBCommunity","twitter_site":"@GridDBCommunity","twitter_misc":{"Written by":"griddb-admin","Est. 
reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#article","isPartOf":{"@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/"},"author":{"name":"griddb-admin","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233"},"headline":"Studying and Forecasting Web Traffic using Python and GridDB","datePublished":"2022-11-30T08:00:00+00:00","dateModified":"2025-11-13T20:56:21+00:00","mainEntityOfPage":{"@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/"},"wordCount":1496,"commentCount":0,"publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"image":{"@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2020\/12\/output_47_1.png","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/","url":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/","name":"Studying and Forecasting Web Traffic using Python and GridDB | GridDB: Open Source Time Series Database for 
IoT","isPartOf":{"@id":"https:\/\/griddb.net\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage"},"image":{"@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2020\/12\/output_47_1.png","datePublished":"2022-11-30T08:00:00+00:00","dateModified":"2025-11-13T20:56:21+00:00","description":"In the last few years, the time-series database category has experienced the fastest growth. Both established and emerging technology sectors have been","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.griddb.net\/en\/blog\/studying-and-forecasting-web-traffic-using-python-and-griddb\/#primaryimage","url":"\/wp-content\/uploads\/2020\/12\/output_47_1.png","contentUrl":"\/wp-content\/uploads\/2020\/12\/output_47_1.png","width":1176,"height":464},{"@type":"WebSite","@id":"https:\/\/griddb.net\/en\/#website","url":"https:\/\/griddb.net\/en\/","name":"GridDB: Open Source Time Series Database for IoT","description":"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL","publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/griddb.net\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/griddb.net\/en\/#organization","name":"Fixstars","url":"https:\/\/griddb.net\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/","url":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","contentUrl":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","width":200,"height":83,"caption":"Fixstars"},"image":{"@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/griddbcommunity\/","https:\/\/x.com\/GridDBCommunity","https:\/\/www.linkedin.com\/company\/griddb-by-toshiba"]},{"@type":"Person","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233","name":"griddb-admin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","caption":"griddb-admin"},"url":"https:\/\/www.griddb.net\/en\/author\/griddb-admin\/"}]}},"_links":{"self":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts\/46736","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/users\/41"}],"re
plies":[{"embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/comments?post=46736"}],"version-history":[{"count":1,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts\/46736\/revisions"}],"predecessor-version":[{"id":51407,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts\/46736\/revisions\/51407"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/media\/27138"}],"wp:attachment":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/media?parent=46736"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/categories?post=46736"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/tags?post=46736"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}