{"id":46652,"date":"2021-06-24T00:00:00","date_gmt":"2021-06-24T07:00:00","guid":{"rendered":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/blog\/time-series-analysis-with-griddb-and-python\/"},"modified":"2025-11-13T12:55:25","modified_gmt":"2025-11-13T20:55:25","slug":"time-series-analysis-with-griddb-and-python","status":"publish","type":"post","link":"https:\/\/www.griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/","title":{"rendered":"Time Series Analysis with GridDB and Python"},"content":{"rendered":"<p>In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. The outline of the tutorial is as follows &#8211;<\/p>\n<ol>\n<li>Load the dataset using SQL and Pandas<\/li>\n<li>Preprocess the data to handle null and missing values<\/li>\n<li>Build a regression model for our data<\/li>\n<\/ol>\n<h2>Prerequisites<\/h2>\n<p>This tutorial assumes prior installation of GridDB, Python 3, and the associated libraries. If you have not installed any of the packages below, go ahead and do so before continuing with the tutorial.<\/p>\n<ol>\n<li><a href=\"https:\/\/griddb.net\/en\/\">GridDB<\/a><\/li>\n<li><a href=\"https:\/\/www.python.org\/downloads\/\">Python 3<\/a><\/li>\n<li><a href=\"https:\/\/github.com\/griddb\/python_client\">GridDB Python Client<\/a><\/li>\n<li><a href=\"https:\/\/numpy.org\/\">NumPy<\/a><\/li>\n<li><a href=\"https:\/\/pandas.pydata.org\/\">Pandas<\/a><\/li>\n<li><a href=\"https:\/\/matplotlib.org\/\">Matplotlib<\/a><\/li>\n<li><a href=\"https:\/\/scikit-learn.org\/stable\/\">Scikit-learn<\/a><\/li>\n<li><a href=\"https:\/\/pypi.org\/project\/lightgbm\/\">Lightgbm<\/a><\/li>\n<li><a href=\"https:\/\/seaborn.pydata.org\/#\">Seaborn<\/a><\/li>\n<\/ol>\n<p>The following tutorial is carried out in <a href=\"https:\/\/www.anaconda.com\/\">Jupyter notebooks (Anaconda Navigator)<\/a>. You can install these packages directly in your environment using <code>conda install package_name<\/code>. 
Alternatively, type <code>pip install package_name<\/code> in the command prompt\/terminal.<\/p>\n<h2>Importing the necessary libraries<\/h2>\n<p>Once you are done installing the required packages, let&#8217;s import these libraries.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">import numpy as np\nimport pandas as pd\nimport matplotlib.pyplot as plt\nfrom sklearn.preprocessing import *\nfrom sklearn.model_selection import *\nfrom sklearn.metrics import *\nimport os\nfrom datetime import datetime\nimport time\nfrom lightgbm import LGBMRegressor\nimport seaborn as sns\nfrom sklearn import metrics<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">APP_PATH = os.getcwd()\nAPP_PATH<\/code><\/pre>\n<\/div>\n<pre><code>'C:\\Users\\SHRIPRIYA\\Desktop\\AW Group\\GridDB'\n<\/code><\/pre>\n<h2>Loading the dataset<\/h2>\n<p>The time-series dataset we&#8217;re using for this tutorial has been open-sourced on <a href=\"https:\/\/www.kaggle.com\/vetrirah\/ml-iot\">Kaggle<\/a>. The zip folder contains two separate files for training and testing. However, since the test dataset does not contain the labels, we will not be able to verify our model&#8217;s performance. Therefore, we will use the training file as the whole dataset, and later on, we will split it into Train and Test sets.<\/p>\n<p>The training file has 48,120 rows (or instances) with 4 columns (or attributes) &#8211; <code>ID, DateTime, Junction, and Vehicles<\/code>. The column <code>Vehicles<\/code> is the dependent (or response) variable, while <code>DateTime and Junction<\/code> are independent (or explanatory) variables.<\/p>\n<h2>Using SQL<\/h2>\n<p>You can type the following statement in your Python script or console to retrieve the data from <a href=\"https:\/\/griddb.net\/\">GridDB<\/a>. The advantage of using GridDB&#8217;s <a href=\"https:\/\/github.com\/griddb\/python_client\">python-client<\/a> is that the resulting data type is a pandas dataframe. 
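Because <code>pd.read_sql_query<\/code> accepts any DB-API compatible connection, the retrieval pattern can be sanity-checked without a running GridDB server. The sketch below is purely illustrative: an in-memory SQLite database stands in for the GridDB connection object, the table name mirrors the tutorial's container, and the row values are made up.

```python
import sqlite3

import pandas as pd

# Stand-in for the GridDB connection object used in this tutorial:
# pd.read_sql_query() works with any DB-API compatible connection.
cont = sqlite3.connect(":memory:")
cont.execute(
    "CREATE TABLE train_ml_iot (DateTime TEXT, Junction INTEGER, "
    "Vehicles INTEGER, ID INTEGER)"
)
cont.executemany(
    "INSERT INTO train_ml_iot VALUES (?, ?, ?, ?)",
    [("2015-11-01 00:00:00", 1, 15, 20151101001),
     ("2015-11-01 01:00:00", 1, 13, 20151101011)],
)

# Same call shape as in the tutorial: the result is already a pandas DataFrame
statement = "SELECT * FROM train_ml_iot"
dataset = pd.read_sql_query(statement, cont)
print(dataset.shape)  # (2, 4)
```

With GridDB, only the connection object changes; the pandas side of the call stays identical.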
This makes data manipulation much easier.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">statement = ('SELECT * FROM train_ml_iot')\ndataset = pd.read_sql_query(statement, cont)<\/code><\/pre>\n<\/div>\n<p>Here, <code>cont<\/code> is the database connection object passed to pandas.<\/p>\n<p>The output will look like &#8211;<\/p>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2021\/06\/dataset_sql_query.png\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2021\/06\/dataset_sql_query.png\" alt=\"\" width=\"433\" height=\"324\" class=\"aligncenter size-full wp-image-27594\" srcset=\"\/wp-content\/uploads\/2021\/06\/dataset_sql_query.png 433w, \/wp-content\/uploads\/2021\/06\/dataset_sql_query-300x224.png 300w\" sizes=\"(max-width: 433px) 100vw, 433px\" \/><\/a><\/p>\n<h2>Getting to know the dataset<\/h2>\n<p>Now that we have loaded our dataset, it is time to take a peek at how it looks. We can print out the first 5 rows using the <code>head<\/code> command. If you want to print more rows, simply pass a number to the function as an argument. For instance, <code>dataset.head(15)<\/code> will print out the first 15 rows. You can also use the <code>tail<\/code> command to get a gist of the dataset. 
The only difference is, as the name suggests, it prints the last 5 rows.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          DateTime\n        <\/th>\n<th>\n          Junction\n        <\/th>\n<th>\n          Vehicles\n        <\/th>\n<th>\n          ID\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          2015-11-01 00:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          15\n        <\/td>\n<td>\n          20151101001\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          2015-11-01 01:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          13\n        <\/td>\n<td>\n          20151101011\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          2015-11-01 02:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          10\n        <\/td>\n<td>\n          20151101021\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          2015-11-01 03:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          7\n        <\/td>\n<td>\n          20151101031\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          2015-11-01 04:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          9\n        <\/td>\n<td>\n          20151101041\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code 
class=\"language-py\">len(dataset)<\/code><\/pre>\n<\/div>\n<pre><code>48120\n<\/code><\/pre>\n<p>The <code>describe()<\/code> command is useful when dealing with numerical data. It prints a statistical summary of your data &#8211; <code>min, max, average<\/code>, etc. We can use this information to learn the range and scale of each attribute. Nothing looks anomalous at this level. Also, the attributes we will keep are on comparable scales, which means we can skip the feature scaling step for this dataset.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.describe()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          Junction\n        <\/th>\n<th>\n          Vehicles\n        <\/th>\n<th>\n          ID\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          count\n        <\/th>\n<td>\n          48120.000000\n        <\/td>\n<td>\n          48120.000000\n        <\/td>\n<td>\n          4.812000e+04\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          mean\n        <\/th>\n<td>\n          2.180549\n        <\/td>\n<td>\n          22.791334\n        <\/td>\n<td>\n          2.016330e+10\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          std\n        <\/th>\n<td>\n          0.966955\n        <\/td>\n<td>\n          20.750063\n        <\/td>\n<td>\n          5.944854e+06\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          min\n        <\/th>\n<td>\n          1.000000\n        <\/td>\n<td>\n          1.000000\n        <\/td>\n<td>\n          2.015110e+10\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          25%\n        <\/th>\n<td>\n          1.000000\n        <\/td>\n<td>\n          
9.000000\n        <\/td>\n<td>\n          2.016042e+10\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          50%\n        <\/th>\n<td>\n          2.000000\n        <\/td>\n<td>\n          15.000000\n        <\/td>\n<td>\n          2.016093e+10\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          75%\n        <\/th>\n<td>\n          3.000000\n        <\/td>\n<td>\n          29.000000\n        <\/td>\n<td>\n          2.017023e+10\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          max\n        <\/th>\n<td>\n          4.000000\n        <\/td>\n<td>\n          180.000000\n        <\/td>\n<td>\n          2.017063e+10\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h2>Data Preprocessing<\/h2>\n<p>As mentioned above, the two attributes &#8211; <code>DateTime and Junction<\/code> &#8211; are the independent variables that determine the outcome variable, i.e. <code>Vehicles<\/code>. Keeping the <code>ID<\/code> attribute therefore seems unnecessary. Let&#8217;s go ahead and drop it.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.drop([\"ID\"], axis=1, inplace=True)<\/code><\/pre>\n<\/div>\n<p>Nobody likes redundant data. Let&#8217;s drop that too!<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.drop_duplicates(keep=\"first\", inplace=True)\nlen(dataset)<\/code><\/pre>\n<\/div>\n<pre><code>48120\n<\/code><\/pre>\n<p>Fortunately, the dataset did not have any duplicates, but it&#8217;s always a good practice to check for redundancy. Handling null values is especially important for numerical data. Null values make it difficult to perform mathematical operations and can also result in errors. So, you either replace null values with dummy data or drop those rows. 
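Both strategies can be sketched on a toy frame before touching the real data. The values below are illustrative, not from the Kaggle set: we either fill nulls with a placeholder such as the column mean, or drop the offending rows.

```python
import numpy as np
import pandas as pd

# Toy frame with one null value (illustrative, not the traffic data)
toy = pd.DataFrame({"Junction": [1, 1, 2, 2],
                    "Vehicles": [15.0, np.nan, 10.0, 7.0]})

# Option 1: replace nulls with dummy data (here, the column mean)
filled = toy.fillna({"Vehicles": toy["Vehicles"].mean()})

# Option 2: drop the rows containing nulls
dropped = toy.dropna()

print(filled["Vehicles"].isnull().sum(), len(dropped))  # 0 3
```

Filling preserves the row count at the cost of made-up values; dropping keeps only genuine observations at the cost of data.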
Let&#8217;s first check if our data contains any null values.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.isnull().sum()<\/code><\/pre>\n<\/div>\n<pre><code>DateTime    0\nJunction    0\nVehicles    0\ndtype: int64\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.dtypes<\/code><\/pre>\n<\/div>\n<pre><code>DateTime    object\nJunction     int64\nVehicles     int64\ndtype: object\n<\/code><\/pre>\n<p>The <code>DateTime<\/code> attribute has the datatype <code>object<\/code>. We will first call the pandas function <code>to_datetime<\/code> to convert this attribute to its actual format. This will allow us to extract the information about the <code>year, month, day<\/code>, etc., directly.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset['DateTime'] = pd.to_datetime(dataset['DateTime'])<\/code><\/pre>\n<\/div>\n<p>Great! Now that our time is converted to a suitable format, let&#8217;s extract the following attributes &#8211; <code>Weekday, Year, Month, Day, Time, Week, and Quarter<\/code>.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset['Weekday'] = [date.weekday() for date in dataset.DateTime]\ndataset['Year'] = [date.year for date in dataset.DateTime]\ndataset['Month'] = [date.month for date in dataset.DateTime]\ndataset['Day'] = [date.day for date in dataset.DateTime]\ndataset['Time'] = [((date.hour*60+(date.minute))*60)+date.second for date in dataset.DateTime]\ndataset['Week'] = [date.week for date in dataset.DateTime]\ndataset['Quarter'] = [date.quarter for date in dataset.DateTime]<\/code><\/pre>\n<\/div>\n<p>The updated dataset looks like &#8211;<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    
.dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          DateTime\n        <\/th>\n<th>\n          Junction\n        <\/th>\n<th>\n          Vehicles\n        <\/th>\n<th>\n          Weekday\n        <\/th>\n<th>\n          Year\n        <\/th>\n<th>\n          Month\n        <\/th>\n<th>\n          Day\n        <\/th>\n<th>\n          Time\n        <\/th>\n<th>\n          Week\n        <\/th>\n<th>\n          Quarter\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          2015-11-01 00:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          15\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          2015-11-01 01:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          13\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          3600\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          2015-11-01 02:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          10\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          7200\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          2015-11-01 03:00:00\n        <\/td>\n<td>\n          1\n       
 <\/td>\n<td>\n          7\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          10800\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          2015-11-01 04:00:00\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          9\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          14400\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset.keys()<\/code><\/pre>\n<\/div>\n<pre><code>Index(['DateTime', 'Junction', 'Vehicles', 'Weekday', 'Year', 'Month', 'Day',\n       'Time', 'Week', 'Quarter'],\n      dtype='object')\n<\/code><\/pre>\n<h2>Visualizing the trend<\/h2>\n<p>Let&#8217;s see if there are any patterns our data is following.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">data = dataset.Vehicles\nbinwidth = 1\nplt.hist(data, bins=range(min(data), max(data) + binwidth, binwidth), log=False)\nplt.title(\"Gaussian Histogram\")\nplt.xlabel(\"Traffic\")\nplt.ylabel(\"Number of times\")\nplt.show()<\/code><\/pre>\n<\/div>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2021\/06\/output1.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2021\/06\/output1.png\" alt=\"\" width=\"375\" height=\"262\" class=\"aligncenter size-full wp-image-27592\" srcset=\"\/wp-content\/uploads\/2021\/06\/output1.png 375w, \/wp-content\/uploads\/2021\/06\/output1-300x210.png 300w\" sizes=\"(max-width: 375px) 100vw, 375px\" \/><\/a><\/p>\n<p>We can see that, more often than not, the traffic lies between <code>(20,30)<\/code> given a certain timestamp. 
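The dominant range can also be read off numerically rather than eyeballed from the plot. A short sketch with <code>np.histogram<\/code> on illustrative counts (made-up values, not the actual Kaggle data):

```python
import numpy as np

# Illustrative hourly vehicle counts (not the Kaggle values)
vehicles = np.array([22, 25, 9, 27, 24, 55, 21, 30, 26, 23])

# Bin the counts in steps of 10 and find the most populated bin
counts, edges = np.histogram(vehicles, bins=range(0, 70, 10))
busiest = np.argmax(counts)
print(f"most frequent range: [{edges[busiest]}, {edges[busiest + 1]})")
# most frequent range: [20, 30)
```

On the real dataset the same call confirms what the histogram shows visually.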
That&#8217;s not too bad.<\/p>\n<h2>Preparing the dataset for Model Building<\/h2>\n<p>The <code>datetounix<\/code> function converts the <code>DateTime<\/code> attribute to <code>unixtime<\/code>. A <code>unix timestamp<\/code> is simply a number denoting the total time elapsed (in seconds) since the Unix Epoch. As its definition suggests, a <code>unix timestamp<\/code> is timezone independent which is why it is frequently used during Model Building.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">def datetounix(df):\n    unixtime = []\n    \n    # Running a loop for converting Date to seconds\n    for date in df['DateTime']:\n        unixtime.append(time.mktime(date.timetuple()))\n    \n    # Replacing Date with unixtime list\n    df['DateTime'] = unixtime\n    return(df)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset_features = datetounix(dataset)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">dataset_features<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          DateTime\n        <\/th>\n<th>\n          Junction\n        <\/th>\n<th>\n          Vehicles\n        <\/th>\n<th>\n          Weekday\n        <\/th>\n<th>\n          Year\n        <\/th>\n<th>\n          Month\n        <\/th>\n<th>\n          Day\n        <\/th>\n<th>\n          Time\n        <\/th>\n<th>\n          Week\n        <\/th>\n<th>\n          Quarter\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          1.446316e+09\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          15\n      
  <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          1.446320e+09\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          13\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          3600\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          1.446323e+09\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          10\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          7200\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        <\/th>\n<td>\n          1.446327e+09\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          7\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          10800\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          1.446331e+09\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          9\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          14400\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          &#8230;\n        <\/th>\n<td>\n   
       &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          48115\n        <\/th>\n<td>\n          1.498829e+09\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          2017\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          30\n        <\/td>\n<td>\n          68400\n        <\/td>\n<td>\n          26\n        <\/td>\n<td>\n          2\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          48116\n        <\/th>\n<td>\n          1.498833e+09\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          30\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          2017\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          30\n        <\/td>\n<td>\n          72000\n        <\/td>\n<td>\n          26\n        <\/td>\n<td>\n          2\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          48117\n        <\/th>\n<td>\n          1.498837e+09\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          16\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          2017\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          30\n        <\/td>\n<td>\n          75600\n        <\/td>\n<td>\n          26\n        <\/td>\n<td>\n          2\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          48118\n        <\/th>\n<td>\n          1.498840e+09\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          22\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          2017\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          30\n        <\/td>\n<td>\n       
   79200\n        <\/td>\n<td>\n          26\n        <\/td>\n<td>\n          2\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          48119\n        <\/th>\n<td>\n          1.498844e+09\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          12\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          2017\n        <\/td>\n<td>\n          6\n        <\/td>\n<td>\n          30\n        <\/td>\n<td>\n          82800\n        <\/td>\n<td>\n          26\n        <\/td>\n<td>\n          2\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\n    48120 rows \u00d7 10 columns\n  <\/p>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">X = dataset_features<\/code><\/pre>\n<\/div>\n<p><code>Junction, Weekday, and Day<\/code> are discrete attributes &#8211; they represent categories rather than continuous values. Therefore, we need to encode them before passing them to the model. To do that, we first convert these columns to <code>str<\/code> and then call the <code>get_dummies<\/code> function to obtain the one-hot encoded data.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">X['Junction'] = X['Junction'].astype('str')\nX['Weekday'] = X['Weekday'].astype('str')\nX['Day'] = X['Day'].astype('str')<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">X = pd.get_dummies(X)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">print(\"X.shape : \", X.shape)\ndisplay(X.columns)<\/code><\/pre>\n<\/div>\n<pre><code>X.shape :  (48120, 49)\n\n\n\nIndex(['DateTime', 'Vehicles', 'Year', 'Month', 'Time', 'Week', 'Quarter',\n       'Junction_1', 'Junction_2', 'Junction_3', 'Junction_4', 'Weekday_0',\n       'Weekday_1', 'Weekday_2', 'Weekday_3', 'Weekday_4', 'Weekday_5',\n       'Weekday_6', 'Day_1', 'Day_10', 'Day_11', 'Day_12', 'Day_13', 'Day_14',\n       'Day_15', 'Day_16', 'Day_17', 'Day_18', 'Day_19', 'Day_2', 'Day_20',\n       'Day_21', 'Day_22', 'Day_23', 'Day_24', 
'Day_25', 'Day_26', 'Day_27',\n       'Day_28', 'Day_29', 'Day_3', 'Day_30', 'Day_31', 'Day_4', 'Day_5',\n       'Day_6', 'Day_7', 'Day_8', 'Day_9'],\n      dtype='object')\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">X.head()<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n        vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n        text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>\n        <\/th>\n<th>\n          DateTime\n        <\/th>\n<th>\n          Vehicles\n        <\/th>\n<th>\n          Year\n        <\/th>\n<th>\n          Month\n        <\/th>\n<th>\n          Time\n        <\/th>\n<th>\n          Week\n        <\/th>\n<th>\n          Quarter\n        <\/th>\n<th>\n          Junction_1\n        <\/th>\n<th>\n          Junction_2\n        <\/th>\n<th>\n          Junction_3\n        <\/th>\n<th>\n          &#8230;\n        <\/th>\n<th>\n          Day_29\n        <\/th>\n<th>\n          Day_3\n        <\/th>\n<th>\n          Day_30\n        <\/th>\n<th>\n          Day_31\n        <\/th>\n<th>\n          Day_4\n        <\/th>\n<th>\n          Day_5\n        <\/th>\n<th>\n          Day_6\n        <\/th>\n<th>\n          Day_7\n        <\/th>\n<th>\n          Day_8\n        <\/th>\n<th>\n          Day_9\n        <\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>\n          0\n        <\/th>\n<td>\n          1.446316e+09\n        <\/td>\n<td>\n          15\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          0\n   
     <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          1\n        <\/th>\n<td>\n          1.446320e+09\n        <\/td>\n<td>\n          13\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          3600\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          2\n        <\/th>\n<td>\n          1.446323e+09\n        <\/td>\n<td>\n          10\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          7200\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          3\n        
<\/th>\n<td>\n          1.446327e+09\n        <\/td>\n<td>\n          7\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          10800\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<tr>\n<th>\n          4\n        <\/th>\n<td>\n          1.446331e+09\n        <\/td>\n<td>\n          9\n        <\/td>\n<td>\n          2015\n        <\/td>\n<td>\n          11\n        <\/td>\n<td>\n          14400\n        <\/td>\n<td>\n          44\n        <\/td>\n<td>\n          4\n        <\/td>\n<td>\n          1\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          &#8230;\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<td>\n          0\n        <\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>\n    5 rows \u00d7 49 columns\n  <\/p>\n<\/div>\n<h2>Defining the regressor<\/h2>\n<p>We will use a gradient boosting model &#8211; <code>LGBMRegressor<\/code>. 
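If <code>lightgbm<\/code> is unavailable in your environment, scikit-learn&#8217;s <code>GradientBoostingRegressor<\/code> is a comparable gradient-boosted substitute. The sketch below fits it on synthetic data only to show the interchangeable fit\/predict API; the hyperparameter names and defaults differ from <code>LGBMRegressor<\/code>, and the data here is made up.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in features/target (illustrative, not the traffic data)
rng = np.random.RandomState(16)
X_demo = rng.rand(200, 3)
y_demo = 10 * X_demo[:, 0] + 5 * X_demo[:, 1] + rng.normal(0, 0.1, 200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=101)

# Same gbdt idea; only a subset of LGBMRegressor's knobs has a counterpart here
model = GradientBoostingRegressor(max_depth=6, learning_rate=0.25,
                                  n_estimators=80, random_state=16)
model.fit(X_tr, y_tr)
rmse = np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2))
print(round(rmse, 3))
```

The fit\/predict calls are identical to the LightGBM version, so the rest of the tutorial works unchanged with either model.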
More information on the model architecture and parameters can be found <a href=\"https:\/\/lightgbm.readthedocs.io\/en\/latest\/pythonapi\/lightgbm.LGBMRegressor.html\">here<\/a>.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">clf = LGBMRegressor(boosting_type='gbdt',\n                    max_depth=6,\n                    learning_rate=0.25,\n                    n_estimators=80,\n                    min_split_gain=0.7,\n                    reg_alpha=0.00001,\n                    random_state=16\n                   )\n<\/code><\/pre>\n<\/div>\n<h2>Splitting the dataset<\/h2>\n<p>We split the dataset into train and test sets with a 70-30 ratio. You can adjust this ratio to suit your needs; we have used the conventional one.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">Y = X['Vehicles'].to_frame()\nX = X.drop(['Vehicles'], axis=1)  # drop the target from the features to avoid leakage\nX_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=101)<\/code><\/pre>\n<\/div>\n<h2>Model Evaluation<\/h2>\n<p>Let&#8217;s see how our model performs on the test data.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">clf = clf.fit(X_train, y_train)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">predictions = clf.predict(X_test)<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">print(\"RMSE\", np.sqrt(metrics.mean_squared_error(y_test, predictions)))<\/code><\/pre>\n<\/div>\n<pre><code>RMSE 0.309624242642493\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"language-py\">sns.regplot(y_test,predictions)<\/code><\/pre>\n<\/div>\n<pre><code>&lt;matplotlib.axes._subplots.AxesSubplot at 0x138722cdcd0&gt;\n<\/code><\/pre>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2021\/06\/output2.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2021\/06\/output2.png\" alt=\"\" width=\"395\" height=\"278\" 
class=\"aligncenter size-full wp-image-27593\" srcset=\"\/wp-content\/uploads\/2021\/06\/output2.png 395w, \/wp-content\/uploads\/2021\/06\/output2-300x211.png 300w\" sizes=\"(max-width: 395px) 100vw, 395px\" \/><\/a><\/p>\n<h2>Conclusion<\/h2>\n<p>The model resulted in an RMSE of <code>0.309<\/code> which is pretty decent. You could try experimenting with different evaluation metrics. The resulting line in the plot seems to fit the data instances accurately. Thus, we could be assured that the model is performing well.<\/p>\n<p>More information on Metrics and Scoring is available on the <a href=\"https:\/\/scikit-learn.org\/stable\/modules\/model_evaluation.html\">official website of scikit-learn<\/a>. Happy Coding!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. The outline of the tutorial is as follows &#8211; Loading the dataset using SQL and Pandas Preprocess the data to deal with null, missing values, etc. Build a classifier for our data Prerequisites This tutorial assumes prior installation of [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":26395,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[121],"tags":[],"class_list":["post-46652","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Time Series Analysis with GridDB and Python | GridDB: Open Source Time Series Database for IoT<\/title>\n<meta name=\"description\" content=\"In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. 
The outline of the tutorial is as follows - Loading the\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Time Series Analysis with GridDB and Python | GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"og:description\" content=\"In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. The outline of the tutorial is as follows - Loading the\" \/>\n<meta property=\"og:url\" content=\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/\" \/>\n<meta property=\"og:site_name\" content=\"GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/griddbcommunity\/\" \/>\n<meta property=\"article:published_time\" content=\"2021-06-24T07:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T20:55:25+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.griddb.net\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"2560\" \/>\n\t<meta property=\"og:image:height\" content=\"1707\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"griddb-admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:site\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"griddb-admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/\"},\"author\":{\"name\":\"griddb-admin\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\"},\"headline\":\"Time Series Analysis with GridDB and Python\",\"datePublished\":\"2021-06-24T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:25+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/\"},\"wordCount\":966,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/\",\"url\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/\",\"name\":\"Time Series Analysis with GridDB and Python | GridDB: Open Source Time Series Database for 
IoT\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg\",\"datePublished\":\"2021-06-24T07:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:25+00:00\",\"description\":\"In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. The outline of the tutorial is as follows - Loading the\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage\",\"url\":\"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg\",\"contentUrl\":\"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg\",\"width\":2560,\"height\":1707},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/griddb.net\/en\/#website\",\"url\":\"https:\/\/griddb.net\/en\/\",\"name\":\"GridDB: Open Source Time Series Database for IoT\",\"description\":\"GridDB is an open source time-series database with the performance of NoSQL and convenience of 
SQL\",\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/griddb.net\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/griddb.net\/en\/#organization\",\"name\":\"Fixstars\",\"url\":\"https:\/\/griddb.net\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"contentUrl\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"width\":200,\"height\":83,\"caption\":\"Fixstars\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/griddbcommunity\/\",\"https:\/\/x.com\/GridDBCommunity\",\"https:\/\/www.linkedin.com\/company\/griddb-by-toshiba\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\",\"name\":\"griddb-admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"caption\":\"griddb-admin\"},\"url\":\"https:\/\/www.griddb.net\/en\/author\/griddb-admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Time Series Analysis with GridDB and Python | GridDB: Open Source Time Series Database for IoT","description":"In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. The outline of the tutorial is as follows - Loading the","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/","og_locale":"en_US","og_type":"article","og_title":"Time Series Analysis with GridDB and Python | GridDB: Open Source Time Series Database for IoT","og_description":"In this tutorial, we will see how to analyze time-series data stored in GridDB using Python. The outline of the tutorial is as follows - Loading the","og_url":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/","og_site_name":"GridDB: Open Source Time Series Database for IoT","article_publisher":"https:\/\/www.facebook.com\/griddbcommunity\/","article_published_time":"2021-06-24T07:00:00+00:00","article_modified_time":"2025-11-13T20:55:25+00:00","og_image":[{"width":2560,"height":1707,"url":"https:\/\/www.griddb.net\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg","type":"image\/jpeg"}],"author":"griddb-admin","twitter_card":"summary_large_image","twitter_creator":"@GridDBCommunity","twitter_site":"@GridDBCommunity","twitter_misc":{"Written by":"griddb-admin","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#article","isPartOf":{"@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/"},"author":{"name":"griddb-admin","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233"},"headline":"Time Series Analysis with GridDB and Python","datePublished":"2021-06-24T07:00:00+00:00","dateModified":"2025-11-13T20:55:25+00:00","mainEntityOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/"},"wordCount":966,"commentCount":0,"publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/","url":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/","name":"Time Series Analysis with GridDB and Python | GridDB: Open Source Time Series Database for IoT","isPartOf":{"@id":"https:\/\/griddb.net\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage"},"image":{"@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg","datePublished":"2021-06-24T07:00:00+00:00","dateModified":"2025-11-13T20:55:25+00:00","description":"In this tutorial, we will see how to analyze time-series data 
stored in GridDB using Python. The outline of the tutorial is as follows - Loading the","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/blog\/time-series-analysis-with-griddb-and-python\/#primaryimage","url":"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg","contentUrl":"\/wp-content\/uploads\/2020\/03\/macbook-graphs-charts_2560x1707.jpg","width":2560,"height":1707},{"@type":"WebSite","@id":"https:\/\/griddb.net\/en\/#website","url":"https:\/\/griddb.net\/en\/","name":"GridDB: Open Source Time Series Database for IoT","description":"GridDB is an open source time-series database with the performance of NoSQL and convenience of SQL","publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/griddb.net\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/griddb.net\/en\/#organization","name":"Fixstars","url":"https:\/\/griddb.net\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/","url":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","contentUrl":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","width":200,"height":83,"caption":"Fixstars"},"image":{"@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/griddbcommunity\/","https:\/\/x.com\/GridDBCommunity","https:\/\/www.linkedin.com\/company\/griddb-by-toshiba"]},{"@type":"Person","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233","name":"griddb-admin","im
age":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","caption":"griddb-admin"},"url":"https:\/\/www.griddb.net\/en\/author\/griddb-admin\/"}]}},"_links":{"self":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts\/46652","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/comments?post=46652"}],"version-history":[{"count":1,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts\/46652\/revisions"}],"predecessor-version":[{"id":51327,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/posts\/46652\/revisions\/51327"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/media\/26395"}],"wp:attachment":[{"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/media?parent=46652"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/categories?post=46652"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.griddb.net\/en\/wp-json\/wp\/v2\/tags?post=46652"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}