{"id":46625,"date":"2020-12-18T00:00:00","date_gmt":"2020-12-18T08:00:00","guid":{"rendered":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/"},"modified":"2025-11-13T12:55:07","modified_gmt":"2025-11-13T20:55:07","slug":"making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook","status":"publish","type":"post","link":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/","title":{"rendered":"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook"},"content":{"rendered":"<h2 id=\"introduction-and-purpose\">Introduction and Purpose<\/h2>\n<h3 id=\"the-use-case\">The Use Case<\/h3>\n<p>Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor power usage. Each of the devices creates a timestamp two times in a minute. It saves power usage data in kW and timestamps in the form of epoch seconds. The IoT provider sends you a .CSV file every month. <\/p>\n<h3 id=\"objectives\">Objectives<\/h3>\n<p>You want to propose an energy-saving plan. To start, you&#39;d like to identify <strong>patterns in power usage<\/strong>: how the power is consumed, when less of it is consumed, and when power usage is particularly extensive. For this purpose, you will build different data visualizations that help to catch patterns in time series data without applying such advanced methods as regression analysis.<\/p>\n<h3 id=\"methods\">Methods<\/h3>\n<p>The article will explain how to save the extracted raw data into <strong>GridDB<\/strong>, and then visualize it in Jupyter Notebook. The latter is very convenient for exploring trends and patterns in the data and for writing reports. In Jupyter, you have cells for code chunks, and you have cells for simple text where you can write down your thoughts, notes, etc. Jupyter Notebook is a powerful tool for data scientists since it allows loading huge amounts of data into it and analyzing them easily.<\/p>\n<p>Jupyter Notebook works with all Python packages. <\/p>\n<h3 id=\"prerequisites\">Prerequisites<\/h3>\n<p>The present tutorial is based upon a previous GridDB blog post that shows how to <a\n    href=\"https:\/\/griddb.net\/en\/blog\/using-python-to-interface-with-griddb-via-jdbc-with-jaydebeapi\/\">install Jupyter and access GridDB from it<\/a>.<\/p>\n<p>To start with the tutorial, you need to finish both aforementioned installations first.<\/p>\n<h2 id=\"saving-data-into-the-database\">Saving Data Into the Database<\/h2>\n<p>We will build the database from a publicly available IoT dataset. It contains <a\n    href=\"https:\/\/combed.github.io\">smart metering data on power usage in a university campus<\/a>. A one-month data with a total of 86192 is sufficient to illustrate the tutorial steps.<br \/>\nLet&#39;s get our hands dirty.<\/p>\n<h3 id=\"access-the-database\">Access the Database<\/h3>\n<p>First of all, we access this database using JayDeBeApi Python package. <\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-keyword\">import<\/span> jaydebeapi\n\nconn = jaydebeapi.<span class=\"hljs-built_in\">connect<\/span>(<span class=\"hljs-string\">\"com.toshiba.mwcloud.gs.sql.Driver\"<\/span>,\n                           <span class=\"hljs-string\">\"jdbc:gs:\/\/griddb:20001\/defaultCluster\/public?notificationMember:127.0.0.1:20001\"<\/span>,\n                           [<span class=\"hljs-string\">\"admin\"<\/span>, <span class=\"hljs-string\">\"admin\"<\/span>],\n                          <span class=\"hljs-string\">\"\/usr\/share\/java\/gridstore-jdbc-4.5.0.jar\"<\/span>,)\ncurs = conn.<span class=\"hljs-built_in\">cursor<\/span>()\n<\/code><\/pre>\n<\/div>\n<h2 id=\"create-a-table\">Create a Table<\/h2>\n<p>We have established a connection to the database and can insert the data. The queries can be written in SQL. We send them to the database using <strong>curs.execute()<\/strong>. The first query to start is the one that creates the data table.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">curs.execute('<span class=\"hljs-keyword\">CREATE<\/span> <span class=\"hljs-keyword\">TABLE<\/span> power2(<span class=\"hljs-keyword\">timestamp<\/span> <span class=\"hljs-built_in\">INTEGER<\/span>, <span class=\"hljs-keyword\">power<\/span> <span class=\"hljs-built_in\">FLOAT<\/span>)<span class=\"hljs-string\">')<\/span>\n<\/code><\/pre>\n<\/div>\n<h2 id=\"inserting-the-data\">Inserting the Data<\/h2>\n<p>We load the data from the .CSV into the database. We can do it directly via Jupyter, by saving the data into a data frame, and then sending it to GriDB.  If you get new data regularly, you can repeat this step to add new data to the database, or automate its execution.<\/p>\n<p>Making short checks by calling the <strong>info() or head()<\/strong> function is a good habit and helps to spare a lot of headaches. If the data has gaps or looks weird, you notice it in the very beginning.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-keyword\">import<\/span> pandas as pd\n\n<span class=\"hljs-keyword\">power<\/span> = pd.read_csv(<span class=\"hljs-string\">'Power.csv'<\/span>, names = [<span class=\"hljs-string\">'timestamp'<\/span>, <span class=\"hljs-string\">'power'<\/span>])\n<span class=\"hljs-keyword\">power<\/span>.info()\n<\/code><\/pre>\n<\/div>\n<pre><code>&lt;class 'pandas.core.frame.DataFrame'&gt;\nRangeIndex:<span class=\"hljs-number\"> 86192 <\/span>entries,<span class=\"hljs-number\"> 0 <\/span>to 86191\nData columns (total<span class=\"hljs-number\"> 2 <\/span>columns):\n <span class=\"hljs-comment\">#   Column     Non-Null Count  Dtype  <\/span>\n---  ------     --------------  -----  \n<span class=\"hljs-number\"> 0 <\/span>  timestamp <span class=\"hljs-number\"> 86192 <\/span>non-null  float64\n<span class=\"hljs-number\"> 1 <\/span>  power     <span class=\"hljs-number\"> 86192 <\/span>non-null  float64\ndtypes: float64(2)\nmemory usage: 1.3 MB\n<\/code><\/pre>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">pd<span class=\"hljs-selector-class\">.options<\/span><span class=\"hljs-selector-class\">.display<\/span><span class=\"hljs-selector-class\">.float_format<\/span> = <span class=\"hljs-string\">'{:.2f}'<\/span><span class=\"hljs-selector-class\">.format<\/span>\npower.head()\n<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th><\/th>\n<th>timestamp<\/th>\n<th>power<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>0<\/th>\n<td>1401595223<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>1<\/th>\n<td>1401595253<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>2<\/th>\n<td>1401595283<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>3<\/th>\n<td>1401595313<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>4<\/th>\n<td>1401595343<\/td>\n<td>0.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>The timestamp column does not make sense so far. It has three zeros at the end: maybe, it was supposed to track milliseconds in addition to seconds, but the smart meter failed to. The timestamp is too long to be written to the database now. Besides, as a float number, it has little sense: epoch seconds can only be integers.<\/p>\n<p>We will get rid of the first two problems. <\/p>\n<p>If you will try this tutorial on a different dataset and won&#39;t meet the same problem, just skip this step. The point is, you need <strong>the timestamps to be in the integer format before writing it to the database<\/strong>. <\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-built-in\">power<\/span>[<span class=\"hljs-string\">'timestamp'<\/span>] = (<span class=\"hljs-built-in\">power<\/span>[<span class=\"hljs-string\">'timestamp'<\/span>]\/<span class=\"hljs-number\">1000<\/span>).astype(int)\n<span class=\"hljs-built-in\">power<\/span>.info()\n<\/code><\/pre>\n<\/div>\n<pre><code>&lt;class 'pandas.core.frame.DataFrame'&gt;\nRangeIndex:<span class=\"hljs-number\"> 86192 <\/span>entries,<span class=\"hljs-number\"> 0 <\/span>to 86191\nData columns (total<span class=\"hljs-number\"> 2 <\/span>columns):\n <span class=\"hljs-comment\">#   Column     Non-Null Count  Dtype  <\/span>\n---  ------     --------------  -----  \n<span class=\"hljs-number\"> 0 <\/span>  timestamp <span class=\"hljs-number\"> 86192 <\/span>non-null  int64  \n<span class=\"hljs-number\"> 1 <\/span>  power     <span class=\"hljs-number\"> 86192 <\/span>non-null  float64\ndtypes: float64(1), int64(1)\nmemory usage: 1.3 MB\n<\/code><\/pre>\n<p>The data looks good, and both types coincide with the data types in the columns that we created in our table. We are<br \/>\n  ready to feed the data from the data frame into the database.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-keyword\">for<\/span> row <span class=\"hljs-keyword\">in<\/span> power.itertuples():\n    curs.execute(<span class=\"hljs-string\">'''\n                INSERT INTO power2 (timestamp, power)\n                VALUES (?,?)\n                '''<\/span>,\n                (row.timestamp, \n                row.power)\n                )\nconn.commit()\n<\/code><\/pre>\n<\/div>\n<p>After we have waited patiently for the data to be inserted, we can perform another small check and look at what we get back from the database.<\/p>\n<p>Starting from now, we communicate with the database table using <strong>read_sql_query()<\/strong> function from the pandas package. This function fetches the data and converts it into a pandas data frame.<\/p>\n<h2 id=\"check-the-success\">Check the Success<\/h2>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">sql = ('<span class=\"hljs-keyword\">SELECT<\/span> <span class=\"hljs-keyword\">power<\/span>, <span class=\"hljs-keyword\">timestamp<\/span> <span class=\"hljs-keyword\">FROM<\/span> power2 <span class=\"hljs-keyword\">LiMIT<\/span> <span class=\"hljs-number\">1<\/span><span class=\"hljs-string\">')\nsql_query = pd.read_sql_query(sql, conn)\nsql_query<\/span>\n<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th><\/th>\n<th>power<\/th>\n<th>timestamp<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>0<\/th>\n<td>0.00<\/td>\n<td>1401595223<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>The data looks like expected, so we proceed to load all rows.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">sql = ('<span class=\"hljs-keyword\">SELECT<\/span> <span class=\"hljs-keyword\">power<\/span>, <span class=\"hljs-keyword\">timestamp<\/span> <span class=\"hljs-keyword\">FROM<\/span> power2<span class=\"hljs-string\">')\nsql_query = pd.read_sql_query(sql, conn)\nsql_query.info()<\/span>\n<\/code><\/pre>\n<\/div>\n<pre><code>&lt;class 'pandas.core.frame.DataFrame'&gt;\nRangeIndex:<span class=\"hljs-number\"> 86192 <\/span>entries,<span class=\"hljs-number\"> 0 <\/span>to 86191\nData columns (total<span class=\"hljs-number\"> 2 <\/span>columns):\n <span class=\"hljs-comment\">#   Column     Non-Null Count  Dtype  <\/span>\n---  ------     --------------  -----  \n<span class=\"hljs-number\"> 0 <\/span>  power     <span class=\"hljs-number\"> 86192 <\/span>non-null  float64\n<span class=\"hljs-number\"> 1 <\/span>  timestamp <span class=\"hljs-number\"> 86192 <\/span>non-null  int64  \ndtypes: float64(1), int64(1)\nmemory usage: 1.3 MB\n<\/code><\/pre>\n<p>Both columns have data in float format. We will create three different data frames out of that one to catch patterns in power usage. Next stop &#8211; data preparation!<\/p>\n<h2 id=\"getting-first-insights\">Getting First Insights<\/h2>\n<p>We will begin with the simplest data visualization possible: a line chart. Before making anything complicated, it isuseful to have a broader look over the data. This will guide us to choosing the further steps, for instance, if we want to zoom into the data, or focus on some specific dimension (although we have none so far), or we are going to search for some patterns.<\/p>\n<h3 id=\"small-preparations\">Small Preparations<\/h3>\n<p>To be able to plot the data, we have to convert the epoch seconds into a datetime format. We make a string out of it and then apply the to_datetime() function.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">sql_query[<span class=\"hljs-string\">'timestamp2'<\/span>] = sql_query[<span class=\"hljs-string\">'timestamp'<\/span>].apply<span class=\"hljs-comment\">(str)<\/span>\nsql_query.head<span class=\"hljs-comment\">()<\/span>\n<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th><\/th>\n<th>power<\/th>\n<th>timestamp<\/th>\n<th>timestamp2<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>0<\/th>\n<td>0.00<\/td>\n<td>1401595223<\/td>\n<td>1401595223<\/td>\n<\/tr>\n<tr>\n<th>1<\/th>\n<td>0.00<\/td>\n<td>1401595253<\/td>\n<td>1401595253<\/td>\n<\/tr>\n<tr>\n<th>2<\/th>\n<td>0.00<\/td>\n<td>1401595283<\/td>\n<td>1401595283<\/td>\n<\/tr>\n<tr>\n<th>3<\/th>\n<td>0.00<\/td>\n<td>1401595313<\/td>\n<td>1401595313<\/td>\n<\/tr>\n<tr>\n<th>4<\/th>\n<td>0.00<\/td>\n<td>1401595343<\/td>\n<td>1401595343<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">sql_query[<span class=\"hljs-string\">'datetime'<\/span>] = sql_query[<span class=\"hljs-string\">'timestamp2'<\/span>].apply(lambda x: pd.to_datetime(x, unit=<span class=\"hljs-string\">'s'<\/span>))\n<\/code><\/pre>\n<pre><code class=\"lang-python\"><span class=\"hljs-selector-tag\">sql_query<\/span><span class=\"hljs-selector-class\">.head<\/span>()\n<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th><\/th>\n<th>power<\/th>\n<th>timestamp<\/th>\n<th>timestamp2<\/th>\n<th>datetime<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>0<\/th>\n<td>0.00<\/td>\n<td>1401595223<\/td>\n<td>1401595223<\/td>\n<td>2014-06-01 04:00:23<\/td>\n<\/tr>\n<tr>\n<th>1<\/th>\n<td>0.00<\/td>\n<td>1401595253<\/td>\n<td>1401595253<\/td>\n<td>2014-06-01 04:00:53<\/td>\n<\/tr>\n<tr>\n<th>2<\/th>\n<td>0.00<\/td>\n<td>1401595283<\/td>\n<td>1401595283<\/td>\n<td>2014-06-01 04:01:23<\/td>\n<\/tr>\n<tr>\n<th>3<\/th>\n<td>0.00<\/td>\n<td>1401595313<\/td>\n<td>1401595313<\/td>\n<td>2014-06-01 04:01:53<\/td>\n<\/tr>\n<tr>\n<th>4<\/th>\n<td>0.00<\/td>\n<td>1401595343<\/td>\n<td>1401595343<\/td>\n<td>2014-06-01 04:02:23<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>Now we are set up to produce the first chart.<\/p>\n<h2 id=\"line-chart-and-first-conclusions\">Line Chart and First Conclusions<\/h2>\n<pre><code class=\"lang-python\">%matplotlib <span class=\"hljs-keyword\">inline<\/span>\n\n<span class=\"hljs-keyword\">import<\/span> matplotlib.pyplot <span class=\"hljs-keyword\">as<\/span> plt\n\nplt.rcParams[<span class=\"hljs-string\">'figure.figsize'<\/span>] = (<span class=\"hljs-number\">20<\/span>,<span class=\"hljs-number\">8<\/span>) # Change the plot size\n\nsql_query.plot(y = <span class=\"hljs-string\">'power'<\/span>, x = <span class=\"hljs-string\">'datetime'<\/span>, grid=<span class=\"hljs-literal\">True<\/span>)\n<\/code><\/pre>\n<pre><code>&lt;<span class=\"hljs-string\">AxesSubplot:<\/span>xlabel=<span class=\"hljs-string\">'datetime'<\/span>&gt;\n<\/code><\/pre>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_40_1.png\"><img fetchpriority=\"high\" decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_40_1.png\" alt=\"\" width=\"1162\" height=\"464\" class=\"aligncenter size-full wp-image-27139\" srcset=\"\/wp-content\/uploads\/2020\/12\/output_40_1.png 1162w, \/wp-content\/uploads\/2020\/12\/output_40_1-300x120.png 300w, \/wp-content\/uploads\/2020\/12\/output_40_1-768x307.png 768w, \/wp-content\/uploads\/2020\/12\/output_40_1-1024x409.png 1024w, \/wp-content\/uploads\/2020\/12\/output_40_1-600x240.png 600w\" sizes=\"(max-width: 1162px) 100vw, 1162px\" \/><\/a><\/p>\n<p>If we look at this chart &#8230; we will be able to recognize quite a little. This is an outburst in energy consumption on a couple of days, but besides that, the power usage seems to have a flat trend. It demonstrates the same amount each day, with <strong>nightly drops and two-days gaps that are weekends.<\/strong> <\/p>\n<p>We cannot identify any differences between single days and nights. We do see that during the day hours, there is a curve, not a line. Moreover, our intuition says that during the night time, power usage cannot be zero all the time. You have heating and air conditioning systems. Their usage strongly depends on the temperature outside, which cannot be the same every day.<br \/>\nStill, any chance of finding a <strong>pattern<\/strong> here? Sure, but we have to do more than a simple line chart.<\/p>\n<h2 id=\"finding-patterns\">Finding Patterns<\/h2>\n<h3 id=\"rolling-mean\">Rolling Mean<\/h3>\n<p>The nightly gaps make daily consumption during the peaks look &quot;flat.&quot; We want to find a trend to recognize differences in day-to-day numbers. For this, we will flatten the line even more by adding a rolling mean. We take the average data consumption for every 24 hours and assign it to any single time point within one day.<\/p>\n<p>Since the timestamps were made each 30 seconds, we calculate the mean alongside 30 <em> 60 <\/em> 24 = 2880 data rows.<\/p>\n<p>We also have to set and then reset index since the rolling window will destroy other columns, including the datetime column that we still need to build the chart.<\/p>\n<pre><code class=\"lang-python\"><span class=\"hljs-attr\">timeseriesdf<\/span> = sql_query.drop(<span class=\"hljs-attr\">columns<\/span> = ['timestamp', 'timestamp2' ])\n<span class=\"hljs-attr\">timeseriesdf<\/span> = timeseriesdf.set_index('datetime')\n<span class=\"hljs-attr\">timeseriesdf<\/span> = timeseriesdf.rolling(<span class=\"hljs-number\">2880<\/span>).mean()\ntimeseriesdf.reset_index(<span class=\"hljs-attr\">inplace=True)<\/span>\ntimeseriesdf.tail()\n<\/code><\/pre>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th><\/th>\n<th>datetime<\/th>\n<th>power<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>86187<\/th>\n<td>2014-07-01 03:57:47<\/td>\n<td>1210.53<\/td>\n<\/tr>\n<tr>\n<th>86188<\/th>\n<td>2014-07-01 03:58:17<\/td>\n<td>1210.52<\/td>\n<\/tr>\n<tr>\n<th>86189<\/th>\n<td>2014-07-01 03:58:47<\/td>\n<td>1210.54<\/td>\n<\/tr>\n<tr>\n<th>86190<\/th>\n<td>2014-07-01 03:59:17<\/td>\n<td>1210.53<\/td>\n<\/tr>\n<tr>\n<th>86191<\/th>\n<td>2014-07-01 03:59:47<\/td>\n<td>1210.53<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>After the second data frame is ready, we can plot the average data and lay it above the raw data.<\/p>\n<pre><code class=\"lang-python\">ax = sql_query.plot(y = <span class=\"hljs-string\">'power'<\/span>, ylabel = <span class=\"hljs-string\">'power'<\/span>, x = <span class=\"hljs-string\">'datetime'<\/span>, grid=True, <span class=\"hljs-keyword\">label<\/span><span class=\"bash\"> = <span class=\"hljs-string\">\"raw power data\"<\/span>)\n<\/span>timeseriesdf.plot(ax = ax, y = <span class=\"hljs-string\">'power'<\/span>, x = <span class=\"hljs-string\">'datetime'<\/span>, <span class=\"hljs-keyword\">label<\/span><span class=\"bash\"> = <span class=\"hljs-string\">\"rolling mean\"<\/span>)<\/span>\n<\/code><\/pre>\n<pre><code>&lt;<span class=\"hljs-string\">AxesSubplot:<\/span>xlabel=<span class=\"hljs-string\">'datetime'<\/span>, ylabel=<span class=\"hljs-string\">'power'<\/span>&gt;\n<\/code><\/pre>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_47_1.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_47_1.png\" alt=\"\" width=\"1176\" height=\"464\" class=\"aligncenter size-full wp-image-27138\" srcset=\"\/wp-content\/uploads\/2020\/12\/output_47_1.png 1176w, \/wp-content\/uploads\/2020\/12\/output_47_1-300x118.png 300w, \/wp-content\/uploads\/2020\/12\/output_47_1-768x303.png 768w, \/wp-content\/uploads\/2020\/12\/output_47_1-1024x404.png 1024w, \/wp-content\/uploads\/2020\/12\/output_47_1-600x237.png 600w\" sizes=\"(max-width: 1176px) 100vw, 1176px\" \/><\/a><\/p>\n<p>Now we can see that the daily usage is not the same for each day. We see the weekend gaps, as usual, then the usage grows a bit and has its peaks on Tuesdays. And there is one week when we have no Tuesday peak, but then a power usage outburst on the weekend.<\/p>\n<h3 id=\"heat-map\">Heat Map<\/h3>\n<p>Since we have identified a vague pattern in power consumption depending on the hour and weekday, we can now take a deep dive following this direction.<\/p>\n<p>There is nothing left that we can get from a line plot, so we switch to aggregating data and using heat maps. Fits well with the type of data source: power usage! Let&#39;s have a look at how much energy is being burnt in this building.<\/p>\n<h3 id=\"preparing-the-aggregation-dimensions\">Preparing the Aggregation Dimensions<\/h3>\n<p>We need to create two dimensions to aggregate the data:<\/p>\n<ul>\n<li>hour<\/li>\n<li>day of week<\/li>\n<\/ul>\n<p>It is possible to extract the hour directly from the datetime column. We will add &quot;0&quot; to the hours that are less than 10 to be able to sort this column later in a meaningful manner.<\/p>\n<p>To get weekdays, we need to do some preparation: convert datetime into date and then extract the weekday. We will also give readable names to the weekdays. For the last part, we will create a small data frame and then merge it with the main one.<\/p>\n<h4 id=\"hour-of-the-day\">Hour of the day<\/h4>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">sql_query[<span class=\"hljs-string\">'hour'<\/span>] = sql_query[<span class=\"hljs-string\">'datetime'<\/span>].apply(lambda x: <span class=\"hljs-string\">\"0\"<\/span> + <span class=\"hljs-built_in\">str<\/span>(x.<span class=\"hljs-built_in\">hour<\/span>) <span class=\"hljs-keyword\">if<\/span> x.<span class=\"hljs-built_in\">hour<\/span> &lt; <span class=\"hljs-number\">10<\/span> <span class=\"hljs-keyword\">else<\/span> <span class=\"hljs-built_in\">str<\/span>(x.<span class=\"hljs-built_in\">hour<\/span>))\nsql_query.head()\n<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th><\/th>\n<th>power<\/th>\n<th>timestamp<\/th>\n<th>timestamp2<\/th>\n<th>datetime<\/th>\n<th>hour<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>0<\/th>\n<td>0.00<\/td>\n<td>1401595223<\/td>\n<td>1401595223<\/td>\n<td>2014-06-01 04:00:23<\/td>\n<td>04<\/td>\n<\/tr>\n<tr>\n<th>1<\/th>\n<td>0.00<\/td>\n<td>1401595253<\/td>\n<td>1401595253<\/td>\n<td>2014-06-01 04:00:53<\/td>\n<td>04<\/td>\n<\/tr>\n<tr>\n<th>2<\/th>\n<td>0.00<\/td>\n<td>1401595283<\/td>\n<td>1401595283<\/td>\n<td>2014-06-01 04:01:23<\/td>\n<td>04<\/td>\n<\/tr>\n<tr>\n<th>3<\/th>\n<td>0.00<\/td>\n<td>1401595313<\/td>\n<td>1401595313<\/td>\n<td>2014-06-01 04:01:53<\/td>\n<td>04<\/td>\n<\/tr>\n<tr>\n<th>4<\/th>\n<td>0.00<\/td>\n<td>1401595343<\/td>\n<td>1401595343<\/td>\n<td>2014-06-01 04:02:23<\/td>\n<td>04<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<h4 id=\"day-of-week\">Day of week<\/h4>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">import datetime\nsql_query[<span class=\"hljs-string\">'date'<\/span>] = sql_query[<span class=\"hljs-string\">'datetime'<\/span>].apply(lambda x: datetime<span class=\"hljs-selector-class\">.datetime<\/span><span class=\"hljs-selector-class\">.date<\/span>(x))\nsql_query[<span class=\"hljs-string\">'weekday'<\/span>] = sql_query[<span class=\"hljs-string\">'date'<\/span>].apply(lambda x: str(datetime<span class=\"hljs-selector-class\">.datetime<\/span><span class=\"hljs-selector-class\">.weekday<\/span>(x)))\n<\/code><\/pre>\n<pre><code class=\"lang-python\"><span class=\"hljs-attr\">weekdaydf<\/span> = pd.DataFrame({<span class=\"hljs-string\">'day_of_week'<\/span>: [<span class=\"hljs-string\">\"Mon\"<\/span>, <span class=\"hljs-string\">\"Tue\"<\/span>, <span class=\"hljs-string\">\"Wed\"<\/span>, <span class=\"hljs-string\">\"Thu\"<\/span>, <span class=\"hljs-string\">\"Fr\"<\/span>, <span class=\"hljs-string\">\"Sat\"<\/span>, <span class=\"hljs-string\">\"Sun\"<\/span>], <span class=\"hljs-string\">'weekday'<\/span>:[<span class=\"hljs-string\">\"0\"<\/span>, <span class=\"hljs-string\">\"1\"<\/span>, <span class=\"hljs-string\">\"2\"<\/span>, <span class=\"hljs-string\">\"3\"<\/span>, <span class=\"hljs-string\">\"4\"<\/span>, <span class=\"hljs-string\">\"5\"<\/span>, <span class=\"hljs-string\">\"6\"<\/span>]})\n<span class=\"hljs-attr\">power<\/span> = pd.merge(sql_query, weekdaydf, <span class=\"hljs-literal\">on<\/span>=<span class=\"hljs-string\">'weekday'<\/span>)\n<span class=\"hljs-attr\">power<\/span> = power.drop(columns=[<span class=\"hljs-string\">'timestamp'<\/span>, <span class=\"hljs-string\">'timestamp2'<\/span>, <span class=\"hljs-string\">'datetime'<\/span>, <span class=\"hljs-string\">'date'<\/span>, <span class=\"hljs-string\">'weekday'<\/span>])\n<\/code><\/pre>\n<\/div>\n<h4 id=\"group-and-aggregate\">Group and Aggregate<\/h4>\n<p>We need two steps to create an aggregation that can be used to build a heat map:<\/p>\n<ul>\n<li>\n    group the dataframe with groupby() and sum() functions\n  <\/li>\n<li>\n    create a pivot table out of it\n  <\/li>\n<\/ul>\n<p>Apart from this, we need to sort weekdays to set them in the right order as we know it. <\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">powergrouped = power.groupby([<span class=\"hljs-string\">'day_of_week'<\/span>, <span class=\"hljs-string\">'hour'<\/span>], as_index=False).sum()\npowerpivot = powergrouped.pivot(<span class=\"hljs-string\">'day_of_week'<\/span>,<span class=\"hljs-string\">'hour'<\/span>, <span class=\"hljs-string\">'power'<\/span>)\n<span class=\"hljs-keyword\">new<\/span><span class=\"hljs-type\">_order<\/span>= [<span class=\"hljs-string\">\"Sun\"<\/span>, <span class=\"hljs-string\">\"Sat\"<\/span>, <span class=\"hljs-string\">\"Fr\"<\/span>, <span class=\"hljs-string\">\"Thu\"<\/span>, <span class=\"hljs-string\">\"Wed\"<\/span>, <span class=\"hljs-string\">\"Tue\"<\/span>, <span class=\"hljs-string\">\"Mon\"<\/span>]\npowerpivot.sort_index(axis=<span class=\"hljs-number\">1<\/span>, ascending=True, inplace=True) <span class=\"hljs-meta\">#x-axis<\/span>\npowerpivot = powerpivot.reindex(<span class=\"hljs-keyword\">new<\/span><span class=\"hljs-type\">_order<\/span>, axis=<span class=\"hljs-number\">0<\/span>)\n<\/code><\/pre>\n<\/div>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-attribute\">powergrouped<\/span>\n<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th><\/th>\n<th>day_of_week<\/th>\n<th>hour<\/th>\n<th>power<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>0<\/th>\n<td>Fr<\/td>\n<td>00<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>1<\/th>\n<td>Fr<\/td>\n<td>01<\/td>\n<td>224969.23<\/td>\n<\/tr>\n<tr>\n<th>2<\/th>\n<td>Fr<\/td>\n<td>02<\/td>\n<td>864557.15<\/td>\n<\/tr>\n<tr>\n<th>3<\/th>\n<td>Fr<\/td>\n<td>03<\/td>\n<td>1119128.11<\/td>\n<\/tr>\n<tr>\n<th>4<\/th>\n<td>Fr<\/td>\n<td>04<\/td>\n<td>1150885.10<\/td>\n<\/tr>\n<tr>\n<th>&#8230;<\/th>\n<td>&#8230;<\/td>\n<td>&#8230;<\/td>\n<td>&#8230;<\/td>\n<\/tr>\n<tr>\n<th>163<\/th>\n<td>Wed<\/td>\n<td>19<\/td>\n<td>105530.85<\/td>\n<\/tr>\n<tr>\n<th>164<\/th>\n<td>Wed<\/td>\n<td>20<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>165<\/th>\n<td>Wed<\/td>\n<td>21<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>166<\/th>\n<td>Wed<\/td>\n<td>22<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>167<\/th>\n<td>Wed<\/td>\n<td>23<\/td>\n<td>0.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>  168 rows \u00c3\u2014 3 columns\n<\/p><\/div>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-attribute\">powerpivot<\/span>\n<\/code><\/pre>\n<\/div>\n<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n      vertical-align: middle;\n    }<\/p>\n<p>    .dataframe tbody tr th {\n      vertical-align: top;\n    }<\/p>\n<p>    .dataframe thead th {\n      text-align: right;\n    }\n  <\/style>\n<table border=\"1\" class=\"dataframe\">\n<thead>\n<tr style=\"text-align: right;\">\n<th>hour<\/th>\n<th>00<\/th>\n<th>01<\/th>\n<th>02<\/th>\n<th>03<\/th>\n<th>04<\/th>\n<th>05<\/th>\n<th>06<\/th>\n<th>07<\/th>\n<th>08<\/th>\n<th>09<\/th>\n<th>&#8230;<\/th>\n<th>14<\/th>\n<th>15<\/th>\n<th>16<\/th>\n<th>17<\/th>\n<th>18<\/th>\n<th>19<\/th>\n<th>20<\/th>\n<th>21<\/th>\n<th>22<\/th>\n<th>23<\/th>\n<\/tr>\n<tr>\n<th>day_of_week<\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<th><\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr>\n<th>Sun<\/th>\n<td>0.00<\/td>\n<td>51497.60<\/td>\n<td>291117.90<\/td>\n<td>292769.57<\/td>\n<td>291190.81<\/td>\n<td>291278.86<\/td>\n<td>290997.18<\/td>\n<td>491656.62<\/td>\n<td>513788.41<\/td>\n<td>510171.87<\/td>\n<td>&#8230;<\/td>\n<td>213315.17<\/td>\n<td>288440.83<\/td>\n<td>263605.47<\/td>\n<td>208461.53<\/td>\n<td>211521.67<\/td>\n<td>211974.35<\/td>\n<td>212378.75<\/td>\n<td>210879.87<\/td>\n<td>209623.45<\/td>\n<td>206585.15<\/td>\n<\/tr>\n<tr>\n<th>Sat<\/th>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>269834.76<\/td>\n<td>458969.12<\/td>\n<td>915983.93<\/td>\n<td>918758.96<\/td>\n<td>921725.76<\/td>\n<td>999486.05<\/td>\n<td>1206349.12<\/td>\n<td>1210010.29<\/td>\n<td>&#8230;<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>Fr<\/th>\n<td>0.00<\/td>\n<td>224969.23<\/td>\n<td>864557.15<\/td>\n<td>1119128.11<\/td>\n<td>1150885.10<\/td>\n<td>1157386.84<\/td>\n<td>1156024.29<\/td>\n<td>1153747.56<\/td>\n<td>1158855.92<\/td>\n<td>1158471.02<\/td>\n<td>&#8230;<\/td>\n<td>181962.75<\/td>\n<td>120546.90<\/td>\n<td>650737.81<\/td>\n<td>633932.26<\/td>\n<td>132555.43<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>Thu<\/th>\n<td>0.00<\/td>\n<td>309328.00<\/td>\n<td>932204.37<\/td>\n<td>1205611.23<\/td>\n<td>1207128.36<\/td>\n<td>1204462.24<\/td>\n<td>1206721.36<\/td>\n<td>1208640.58<\/td>\n<td>1204511.73<\/td>\n<td>1199805.57<\/td>\n<td>&#8230;<\/td>\n<td>287940.39<\/td>\n<td>287652.30<\/td>\n<td>290639.87<\/td>\n<td>287504.60<\/td>\n<td>65143.31<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>Wed<\/th>\n<td>0.00<\/td>\n<td>281027.87<\/td>\n<td>749927.37<\/td>\n<td>1100732.95<\/td>\n<td>1153720.09<\/td>\n<td>1151129.83<\/td>\n<td>1153740.49<\/td>\n<td>1156341.91<\/td>\n<td>1153000.58<\/td>\n<td>1147049.35<\/td>\n<td>&#8230;<\/td>\n<td>0.00<\/td>\n<td>296933.91<\/td>\n<td>861515.44<\/td>\n<td>618844.71<\/td>\n<td>372767.37<\/td>\n<td>105530.85<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>Tue<\/th>\n<td>26562.56<\/td>\n<td>445447.54<\/td>\n<td>683084.53<\/td>\n<td>1342209.28<\/td>\n<td>1151558.18<\/td>\n<td>1147726.28<\/td>\n<td>1149619.97<\/td>\n<td>1150122.07<\/td>\n<td>1152870.25<\/td>\n<td>1148160.37<\/td>\n<td>&#8230;<\/td>\n<td>282747.19<\/td>\n<td>298448.77<\/td>\n<td>612924.25<\/td>\n<td>572219.97<\/td>\n<td>545555.71<\/td>\n<td>284165.57<\/td>\n<td>286591.83<\/td>\n<td>286551.77<\/td>\n<td>219794.28<\/td>\n<td>0.00<\/td>\n<\/tr>\n<tr>\n<th>Mon<\/th>\n<td>206121.08<\/td>\n<td>419298.17<\/td>\n<td>1068524.79<\/td>\n<td>1550715.82<\/td>\n<td>1630477.85<\/td>\n<td>1630106.16<\/td>\n<td>1680749.36<\/td>\n<td>1412979.20<\/td>\n<td>1332721.77<\/td>\n<td>1338198.34<\/td>\n<td>&#8230;<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>33988.80<\/td>\n<td>290805.45<\/td>\n<td>291424.27<\/td>\n<td>291727.70<\/td>\n<td>242969.68<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<td>0.00<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>  7 rows \u00c3\u2014 24 columns\n<\/p><\/div>\n<h3 id=\"analyzing-the-pattern\">Analyzing the Pattern<\/h3>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-keyword\">import<\/span> seaborn <span class=\"hljs-keyword\">as<\/span> sns\n<span class=\"hljs-title\">ax<\/span> = sns.heatmap(powerpivot)\n<\/code><\/pre>\n<\/div>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_64_0.png\"><img decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_64_0.png\" alt=\"\" width=\"1039\" height=\"491\" class=\"aligncenter size-full wp-image-27137\" srcset=\"\/wp-content\/uploads\/2020\/12\/output_64_0.png 1039w, \/wp-content\/uploads\/2020\/12\/output_64_0-300x142.png 300w, \/wp-content\/uploads\/2020\/12\/output_64_0-768x363.png 768w, \/wp-content\/uploads\/2020\/12\/output_64_0-1024x484.png 1024w, \/wp-content\/uploads\/2020\/12\/output_64_0-600x284.png 600w\" sizes=\"(max-width: 1039px) 100vw, 1039px\" \/><\/a><\/p>\n<p>Since we have smart metering data from a university campus, it seems natural that the usage drops starting from noon. The classes take place in the mornings. <\/p>\n<p>The weekly peak is Wednesday: Our prophecy from before was almost correct. Although it is not quite clear. On Tuesdays, more hours are marked with purple, but the red one lies on Wednesday.<\/p>\n<p>And why the hottest spots are between 3 and 7 a.m.? Who needs so much power in the dawn?<\/p>\n<h2 id=\"bonus-grouped-line-charts\">Bonus: Grouped Line Charts<\/h2>\n<p>We could have built a simple multiple line chart like the one below:<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\"><span class=\"hljs-symbol\">sns.lineplot<\/span>(<span class=\"hljs-meta\">data<\/span><span class=\"hljs-symbol\">=powergrouped<\/span>, x=<span class=\"hljs-string\">\"hour\"<\/span>, y=<span class=\"hljs-string\">\"power\"<\/span>, hue=<span class=\"hljs-string\">\"day_of_week\"<\/span>)\n<\/code><\/pre>\n<\/div>\n<pre><code>&lt;<span class=\"hljs-string\">AxesSubplot:<\/span>xlabel=<span class=\"hljs-string\">'hour'<\/span>, ylabel=<span class=\"hljs-string\">'power'<\/span>&gt;\n<\/code><\/pre>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_68_1.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_68_1.png\" alt=\"\" width=\"1181\" height=\"496\" class=\"aligncenter size-full wp-image-27136\" srcset=\"\/wp-content\/uploads\/2020\/12\/output_68_1.png 1181w, \/wp-content\/uploads\/2020\/12\/output_68_1-300x126.png 300w, \/wp-content\/uploads\/2020\/12\/output_68_1-768x323.png 768w, \/wp-content\/uploads\/2020\/12\/output_68_1-1024x430.png 1024w, \/wp-content\/uploads\/2020\/12\/output_68_1-600x252.png 600w\" sizes=\"(max-width: 1181px) 100vw, 1181px\" \/><\/a><\/p>\n<p>The problem is that you have to consult with the legend each time you&#39;ve noticed something interesting since it is difficult to keep in mind which line is which.<br \/>\nA better supplement to the heat map is a grid with small multiple time series charts. They separate each weekday&#39;s trend and help to focus on peaks and gaps.<\/p>\n<div class=\"clipboard\">\n<pre><code class=\"lang-python\">powergrouped['day_of_week'] = pd.Categorical(powergrouped['day_of_week'], new_order[::-<span class=\"hljs-number\">1<\/span>])\n<span class=\"hljs-attr\">powergrouped<\/span> = powergrouped.sort_values(<span class=\"hljs-attr\">by<\/span> = 'day_of_week')\n\n<span class=\"hljs-attr\">g<\/span> = sns.relplot(\n    <span class=\"hljs-attr\">data=powergrouped,<\/span>\n    <span class=\"hljs-attr\">x=\"hour\",<\/span> <span class=\"hljs-attr\">y=\"power\",<\/span> <span class=\"hljs-attr\">col=\"day_of_week\",<\/span> <span class=\"hljs-attr\">hue=\"day_of_week\",<\/span>\n    <span class=\"hljs-attr\">kind=\"line\",<\/span> <span class=\"hljs-attr\">palette=\"crest\",<\/span> <span class=\"hljs-attr\">linewidth=4,<\/span> <span class=\"hljs-attr\">zorder=5,<\/span>\n    <span class=\"hljs-attr\">col_wrap=3,<\/span> <span class=\"hljs-attr\">height=2,<\/span> <span class=\"hljs-attr\">aspect=1.5,<\/span> <span class=\"hljs-attr\">legend=False,<\/span>\n)\n\nfor day_of_week, ax <span class=\"hljs-keyword\">in<\/span> g.axes_dict.items():\n\n\n    ax.text(.<span class=\"hljs-number\">8<\/span>, .<span class=\"hljs-number\">85<\/span>, day_of_week, <span class=\"hljs-attr\">transform=ax.transAxes,<\/span> <span class=\"hljs-attr\">fontweight=\"bold\")<\/span>\n\n\n    sns.lineplot(\n        <span class=\"hljs-attr\">data=powergrouped,<\/span> <span class=\"hljs-attr\">x=\"hour\",<\/span> <span class=\"hljs-attr\">y=\"power\",<\/span> <span class=\"hljs-attr\">units=\"day_of_week\",<\/span>\n        <span class=\"hljs-attr\">estimator=None,<\/span> <span class=\"hljs-attr\">color=\".7\",<\/span> <span class=\"hljs-attr\">linewidth=1,<\/span> <span class=\"hljs-attr\">ax=ax,<\/span>\n    )\n\nax.set_xticks(ax.get_xticks()[::<span class=\"hljs-number\">4<\/span>])\n\n\ng.set_titles(<span class=\"hljs-string\">\"\"<\/span>)\ng.set_axis_labels(<span class=\"hljs-string\">\"\"<\/span>, <span class=\"hljs-string\">\"power\"<\/span>)\n<\/code><\/pre>\n<\/div>\n<pre><code>&lt;seaborn<span class=\"hljs-selector-class\">.axisgrid<\/span><span class=\"hljs-selector-class\">.FacetGrid<\/span> at <span class=\"hljs-number\">0<\/span>x7f105ecbf8b0&gt;\n<\/code><\/pre>\n<p><a href=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_70_1.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/griddb.net\/wp-content\/uploads\/2020\/12\/output_70_1.png\" alt=\"\" width=\"653\" height=\"415\" class=\"aligncenter size-full wp-image-27135\" srcset=\"\/wp-content\/uploads\/2020\/12\/output_70_1.png 653w, \/wp-content\/uploads\/2020\/12\/output_70_1-300x191.png 300w, \/wp-content\/uploads\/2020\/12\/output_70_1-600x381.png 600w\" sizes=\"(max-width: 653px) 100vw, 653px\" \/><\/a><\/p>\n<p>These last visualizations emphasize the fact that power usage is highly dependent on both day of week and hour of the day.<\/p>\n<h2 id=\"conclusion-energy-saving-potential\">Conclusion: Energy Saving Potential<\/h2>\n<p>Going back to the aim of our analysis, we can propose a few directions for closer investigation:<\/p>\n<ul>\n<li>power usage on Monday mornings: reduce by heating a bit more on Sundays<\/li>\n<li>check for activities on Wednesday, Tuesday, and Friday noons: are there classes?<\/li>\n<\/ul>\n<p>The best strategy is to flatten power usage since each drop is always followed by a strong increase. If you let the heating on a low level for the weekend, you do not need to heat like crazy on Mondays.<\/p>\n<h2 id=\"a-short-afterward\">A Short Afterward<\/h2>\n<p>You can use different methods to catch patterns in your time series data, like rolling windows and aggregations. Heat maps and multiple line charts will help to uncover trends and localize data outliers. <\/p>\n<p>Saving data to a database instead of storing it in .CSV files permanently has a lot of advantages:<\/p>\n<ul>\n<li>You do not have to create numerous folders and flood your file system.<\/li>\n<li>You have enough free space on your hard drive.<\/li>\n<li>\n    Your data is safer in a database than if it is saved locally.<br \/>\n    Happy plotting!\n  <\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Introduction and Purpose The Use Case Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor power usage. Each of the devices creates a timestamp two times in a minute. It saves power usage data in kW and timestamps in the form of epoch [&hellip;]<\/p>\n","protected":false},"author":41,"featured_media":27136,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[121],"tags":[],"class_list":["post-46625","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-blog"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.1.1 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook | GridDB: Open Source Time Series Database for IoT<\/title>\n<meta name=\"description\" content=\"Introduction and Purpose The Use Case Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook | GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"og:description\" content=\"Introduction and Purpose The Use Case Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/\" \/>\n<meta property=\"og:site_name\" content=\"GridDB: Open Source Time Series Database for IoT\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/griddbcommunity\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-12-18T08:00:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-11-13T20:55:07+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/wp-content\/uploads\/2020\/12\/output_68_1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1181\" \/>\n\t<meta property=\"og:image:height\" content=\"496\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"griddb-admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:site\" content=\"@GridDBCommunity\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"griddb-admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"12 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/\"},\"author\":{\"name\":\"griddb-admin\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\"},\"headline\":\"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook\",\"datePublished\":\"2020-12-18T08:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:07+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/\"},\"wordCount\":1823,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2020\/12\/output_68_1.png\",\"articleSection\":[\"Blog\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/\",\"url\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/\",\"name\":\"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook | GridDB: Open Source Time Series Database for IoT\",\"isPartOf\":{\"@id\":\"https:\/\/griddb.net\/en\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage\"},\"thumbnailUrl\":\"\/wp-content\/uploads\/2020\/12\/output_68_1.png\",\"datePublished\":\"2020-12-18T08:00:00+00:00\",\"dateModified\":\"2025-11-13T20:55:07+00:00\",\"description\":\"Introduction and Purpose The Use Case Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor\",\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage\",\"url\":\"\/wp-content\/uploads\/2020\/12\/output_68_1.png\",\"contentUrl\":\"\/wp-content\/uploads\/2020\/12\/output_68_1.png\",\"width\":1181,\"height\":496},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/griddb.net\/en\/#website\",\"url\":\"https:\/\/griddb.net\/en\/\",\"name\":\"GridDB: Open Source Time Series Database for IoT\",\"description\":\"GridDB is an open source time-series database with the performance of NoSQL and convenience of SQL\",\"publisher\":{\"@id\":\"https:\/\/griddb.net\/en\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/griddb.net\/en\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/griddb.net\/en\/#organization\",\"name\":\"Fixstars\",\"url\":\"https:\/\/griddb.net\/en\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"contentUrl\":\"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png\",\"width\":200,\"height\":83,\"caption\":\"Fixstars\"},\"image\":{\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.facebook.com\/griddbcommunity\/\",\"https:\/\/x.com\/GridDBCommunity\",\"https:\/\/www.linkedin.com\/company\/griddb-by-toshiba\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233\",\"name\":\"griddb-admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g\",\"caption\":\"griddb-admin\"},\"url\":\"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/author\/griddb-admin\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook | GridDB: Open Source Time Series Database for IoT","description":"Introduction and Purpose The Use Case Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/","og_locale":"en_US","og_type":"article","og_title":"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook | GridDB: Open Source Time Series Database for IoT","og_description":"Introduction and Purpose The Use Case Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor","og_url":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/","og_site_name":"GridDB: Open Source Time Series Database for IoT","article_publisher":"https:\/\/www.facebook.com\/griddbcommunity\/","article_published_time":"2020-12-18T08:00:00+00:00","article_modified_time":"2025-11-13T20:55:07+00:00","og_image":[{"width":1181,"height":496,"url":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/wp-content\/uploads\/2020\/12\/output_68_1.png","type":"image\/png"}],"author":"griddb-admin","twitter_card":"summary_large_image","twitter_creator":"@GridDBCommunity","twitter_site":"@GridDBCommunity","twitter_misc":{"Written by":"griddb-admin","Est. reading time":"12 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#article","isPartOf":{"@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/"},"author":{"name":"griddb-admin","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233"},"headline":"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook","datePublished":"2020-12-18T08:00:00+00:00","dateModified":"2025-11-13T20:55:07+00:00","mainEntityOfPage":{"@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/"},"wordCount":1823,"commentCount":0,"publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"image":{"@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2020\/12\/output_68_1.png","articleSection":["Blog"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/","url":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/","name":"Making Sense Out of Smart Metering Data With GridDB and Jupyter Notebook | GridDB: Open Source Time Series Database for IoT","isPartOf":{"@id":"https:\/\/griddb.net\/en\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage"},"image":{"@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage"},"thumbnailUrl":"\/wp-content\/uploads\/2020\/12\/output_68_1.png","datePublished":"2020-12-18T08:00:00+00:00","dateModified":"2025-11-13T20:55:07+00:00","description":"Introduction and Purpose The Use Case Imagine that you are the owner of a building complex. You have installed smart meters all over the place to monitor","inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.griddb.net\/en\/blog\/making-sense-out-of-smart-metering-data-with-griddb-and-jupyter-notebook\/#primaryimage","url":"\/wp-content\/uploads\/2020\/12\/output_68_1.png","contentUrl":"\/wp-content\/uploads\/2020\/12\/output_68_1.png","width":1181,"height":496},{"@type":"WebSite","@id":"https:\/\/griddb.net\/en\/#website","url":"https:\/\/griddb.net\/en\/","name":"GridDB: Open Source Time Series Database for IoT","description":"GridDB is an open source time-series database with the performance of NoSQL and convenience of SQL","publisher":{"@id":"https:\/\/griddb.net\/en\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/griddb.net\/en\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/griddb.net\/en\/#organization","name":"Fixstars","url":"https:\/\/griddb.net\/en\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/","url":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","contentUrl":"https:\/\/griddb.net\/wp-content\/uploads\/2019\/04\/fixstars_logo_web_tagline.png","width":200,"height":83,"caption":"Fixstars"},"image":{"@id":"https:\/\/griddb.net\/en\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/griddbcommunity\/","https:\/\/x.com\/GridDBCommunity","https:\/\/www.linkedin.com\/company\/griddb-by-toshiba"]},{"@type":"Person","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/4fe914ca9576878e82f5e8dd3ba52233","name":"griddb-admin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/griddb.net\/en\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/5bceca1cafc06886a7ba873e2f0a28011a1176c4dea59709f735b63ae30d0342?s=96&d=mm&r=g","caption":"griddb-admin"},"url":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/author\/griddb-admin\/"}]}},"_links":{"self":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts\/46625","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/users\/41"}],"replies":[{"embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/comments?post=46625"}],"version-history":[{"count":1,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts\/46625\/revisions"}],"predecessor-version":[{"id":51301,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/posts\/46625\/revisions\/51301"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/media\/27136"}],"wp:attachment":[{"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/media?parent=46625"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/categories?post=46625"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/griddb-linux-hte8hndjf8cka8ht.westus-01.azurewebsites.net\/en\/wp-json\/wp\/v2\/tags?post=46625"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}