Multi-step-Time-series-predicting using RNN LSTM

Published by berryedelson

Household Power Consumption Prediction using RNN-LSTM

Power outages can cause huge losses to the broader economy, so forecasting power consumption is an important problem.

Given the rise of smart electricity meters and the wide adoption of electricity generation technology like solar panels, there is a wealth of electricity usage data available.


Problem Statement:

Given the power consumption data for the previous week, we have to predict the power consumption for the next week.


Download dataset:

https://archive.ics.uci.edu/ml/machine-learning-databases/00235/household_power_consumption.zip

Details:

https://archive.ics.uci.edu/ml/datasets/individual+household+electric+power+consumption

Dataset Description:

The data was collected between December 2006 and November 2010, with observations of household power consumption recorded every minute.

It is a multivariate series comprising seven variables:

  • global_active_power: The total active power consumed by the household (kilowatts).
  • global_reactive_power: The total reactive power consumed by the household (kilowatts).
  • voltage: Average voltage (volts).
  • global_intensity: Average current intensity (amps).
  • sub_metering_1: Active energy for kitchen (watt-hours of active energy).
  • sub_metering_2: Active energy for laundry (watt-hours of active energy).
  • sub_metering_3: Active energy for climate control systems (watt-hours of active energy).

This data represents a multivariate time series of power-related variables that could in turn be used to model and even forecast future electricity consumption.


Time-series prediction plays a major role in machine learning yet is often neglected. There are many algorithms we could use for these problems, ranging from classical statistical and econometric approaches (for example, those in Statsmodels) to modern ones. Today we will look at how to apply deep learning algorithms to time-series data.

Why use a Deep Learning Algorithm?

With data volumes growing enormously day by day, we shouldn't confine ourselves to the standard ML algorithms. Deep learning algorithms help us handle large volumes of data without losing the key insights, and tuning the model in the right way gives us the maximum yield, i.e., in our case maximum accuracy 😊. The model also learns, through its own neural network architecture, which patterns in the past make the prediction better or worse.

For this time-series forecasting task we will use the Long Short-Term Memory (LSTM) unit.

Recurrent Neural Network (RNN)

To understand an LSTM network, we first need to understand a recurrent neural network. This kind of network is used to recognize patterns where past results influence the present result. A typical example is time-series data, in which the order of observations is extremely important. In this architecture, a neuron takes as input not only the regular input (the previous layer's output) but also its own previous state.

[Figure: a recurrent neural network unrolled over time, with states H_0, H_1, ...]

It is important to notice that H represents the neuron state. Therefore, when computing state H_1, the neuron uses as input the parameter X_1 and H_0 (its previous state). The main problem with this model is memory loss: the network's older states are quickly forgotten. In sequences where we need to remember beyond the immediate past, plain RNNs fail.
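To make the recurrence concrete, here is a minimal NumPy sketch of a single vanilla RNN step (the names rnn_step, W_x, W_h and the sizes are illustrative assumptions, not part of any library API):

import numpy as np

def rnn_step(x_t, h_prev, W_x, W_h, b):
    #one vanilla RNN step: the new state depends on the current input and the previous state
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

#illustrative sizes: 1 input feature, 4 hidden units
rng = np.random.default_rng(0)
W_x, W_h, b = rng.normal(size=(4, 1)), rng.normal(size=(4, 4)), np.zeros(4)

h = np.zeros(4)                                #H_0, the initial state
for x_t in np.array([[0.1], [0.3], [0.2]]):    #a toy input sequence X_1, X_2, X_3
    h = rnn_step(x_t, h, W_x, W_h, b)          #H_1, H_2, H_3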

The Long Short-Term Memory unit (LSTM) was created to overcome these limitations of the recurrent neural network (RNN). Long time-series datasets can make training a plain RNN slow; we could restrict the data volume, but that means a loss of information. In any time-series dataset we need to capture the previous trends and the seasonality of the overall series to make the right predictions.

Before going into a brief explanation of the LSTM cell, let us see what an LSTM cell looks like:

[Figure: LSTM cell architecture]

The architecture may look a little complicated at first glance, but it is quite neat and easy to understand if we break it into parts.

Let's first understand the inputs and outputs. On the left-hand side of the diagram, the inputs are Ct-1, the previous cell state; ht-1, the output of the previous cell; and Xt, the input of the present cell.

The outputs of the cell are Ct and ht, the cell state and output of the present cell. The first step of an LSTM is the forget gate layer (f), where we determine what to forget from the previous cell state. It takes the inputs ht-1 and Xt, applies a linear transformation with some weights and a bias term, and passes the result through a sigmoid function. Since the output of a sigmoid is always between 0 and 1, a value near 0 means "forget it" and a value near 1 means "keep it".

Forget gate layer => f = sigmoid( Weights (ht-1, Xt) + bias )

The second step is a two-part process, and this is the step that decides what new information to store. In the first part we take the same inputs as before, ht-1 and Xt, make a linear transformation with some weights and biases, and pass it through a sigmoid function. In the second part we again make a linear transformation of ht-1 and Xt with some weights and biases, but this time pass it through a hyperbolic tangent (tanh) function. At the end of this step we have a vector of candidate values for the present cell.

First part => I = sigmoid( Weights (ht-1,Xt) + bias)

Second part => II = tanh( Weights (ht-1,Xt) + bias)

The third step is the update step, which derives the new cell state Ct from the previous steps. First we multiply the previous cell state by the forget gate layer, then add the product of the two vectors from the second step; the result is the new cell state Ct of the present cell at time t.

Update layer => Ct = f * Ct-1 + I * II

The final step produces the other main output of the cell. We again form a linear transformation of the previous output ht-1 and the present input Xt with some weight and bias terms and pass it through a sigmoid layer; this is the output gate. We then multiply this output gate by the new cell state Ct passed through a hyperbolic tangent function, which gives us the present output ht.

Final layer =>

o = sigmoid( Weights (ht-1, Xt) + bias )

ht = o * tanh(Ct)
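Putting the four steps together, here is a minimal single-step LSTM cell sketch in NumPy (the weight and bias names are illustrative assumptions; real implementations such as Keras pack the gates into larger matrices):

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    #one LSTM step; W and b hold one weight matrix and bias per gate,
    #each applied to the concatenation of h_prev and x_t
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(W['f'] @ z + b['f'])        #forget gate layer
    I = sigmoid(W['i'] @ z + b['i'])        #first part (input gate)
    II = np.tanh(W['c'] @ z + b['c'])       #second part (candidate values)
    c_t = f * c_prev + I * II               #update step: new cell state Ct
    o = sigmoid(W['o'] @ z + b['o'])        #output gate
    h_t = o * np.tanh(c_t)                  #present output ht
    return h_t, c_t

#illustrative sizes: 1 input feature, 3 hidden units
rng = np.random.default_rng(1)
W = {k: rng.normal(size=(3, 4)) for k in 'fico'}
b = {k: np.zeros(3) for k in 'fico'}
h_t, c_t = lstm_step(np.array([0.5]), np.zeros(3), np.zeros(3), W, b)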

Now that we have a clear, step-by-step understanding of the LSTM cell, let's see how to apply it to time-series data.

How to? Let's Begin

Importing Libraries

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from numpy import nan

from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense

from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import MinMaxScaler
#Reading the dataset

data = pd.read_csv('household_power_consumption.txt', sep = ';',
                  parse_dates = True,
                  low_memory = False)
#printing top rows

data.head()
         Date      Time  Global_active_power  Global_reactive_power  Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
0  16/12/2006  17:24:00                4.216                  0.418  234.840            18.400           0.000           1.000            17.0
1  16/12/2006  17:25:00                5.360                  0.436  233.630            23.000           0.000           1.000            16.0
2  16/12/2006  17:26:00                5.374                  0.498  233.290            23.000           0.000           2.000            17.0
3  16/12/2006  17:27:00                5.388                  0.502  233.740            23.000           0.000           1.000            17.0
4  16/12/2006  17:28:00                3.666                  0.528  235.680            15.800           0.000           1.000            17.0
#concatenating the Date and Time columns into a single 'date_time' column

data['date_time'] = data['Date'].str.cat(data['Time'], sep= ' ')
data.drop(['Date', 'Time'], inplace= True, axis = 1)
data.head()
   Global_active_power  Global_reactive_power  Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3            date_time
0                4.216                  0.418  234.840            18.400           0.000           1.000            17.0  16/12/2006 17:24:00
1                5.360                  0.436  233.630            23.000           0.000           1.000            16.0  16/12/2006 17:25:00
2                5.374                  0.498  233.290            23.000           0.000           2.000            17.0  16/12/2006 17:26:00
3                5.388                  0.502  233.740            23.000           0.000           1.000            17.0  16/12/2006 17:27:00
4                3.666                  0.528  235.680            15.800           0.000           1.000            17.0  16/12/2006 17:28:00
data.set_index(['date_time'], inplace=True)
data.head()
                     Global_active_power  Global_reactive_power  Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
16/12/2006 17:24:00                4.216                  0.418  234.840            18.400           0.000           1.000            17.0
16/12/2006 17:25:00                5.360                  0.436  233.630            23.000           0.000           1.000            16.0
16/12/2006 17:26:00                5.374                  0.498  233.290            23.000           0.000           2.000            17.0
16/12/2006 17:27:00                5.388                  0.502  233.740            23.000           0.000           1.000            17.0
16/12/2006 17:28:00                3.666                  0.528  235.680            15.800           0.000           1.000            17.0

Next, we can mark all missing values indicated with a ‘?‘ character with a NaN value, which is a float.

#replacing each '?' character with a NaN value

data.replace('?', nan, inplace=True)
#This will allow us to work with the data as one array of floating-point values rather than mixed types (which are less efficient).

data = data.astype('float')
#information of the dataset

data.info()
<class 'pandas.core.frame.DataFrame'>
Index: 2075259 entries, 16/12/2006 17:24:00 to 26/11/2010 21:02:00
Data columns (total 7 columns):
Global_active_power      float64
Global_reactive_power    float64
Voltage                  float64
Global_intensity         float64
Sub_metering_1           float64
Sub_metering_2           float64
Sub_metering_3           float64
dtypes: float64(7)
memory usage: 126.7+ MB
#checking the null values

np.isnan(data).sum()
Global_active_power      25979
Global_reactive_power    25979
Voltage                  25979
Global_intensity         25979
Sub_metering_1           25979
Sub_metering_2           25979
Sub_metering_3           25979
dtype: int64

We also need to fill in the missing values now that they have been marked.

A very simple approach would be to copy the observation from the same time the day before. We can implement this in a function named fill_missing() that will take the NumPy array of the data and copy values from exactly 24 hours ago.

def fill_missing(data):
    #number of minute-level observations in one day
    one_day = 24*60
    for row in range(data.shape[0]):
        for col in range(data.shape[1]):
            if np.isnan(data[row, col]):
                #copy the value recorded at the same time one day earlier
                data[row, col] = data[row-one_day, col]
#filling the missing values in place on the underlying float array
fill_missing(data.values)
#checking the nan values

np.isnan(data).sum()
Global_active_power      0
Global_reactive_power    0
Voltage                  0
Global_intensity         0
Sub_metering_1           0
Sub_metering_2           0
Sub_metering_3           0
dtype: int64
data.info()
<class 'pandas.core.frame.DataFrame'>
Index: 2075259 entries, 16/12/2006 17:24:00 to 26/11/2010 21:02:00
Data columns (total 7 columns):
Global_active_power      float64
Global_reactive_power    float64
Voltage                  float64
Global_intensity         float64
Sub_metering_1           float64
Sub_metering_2           float64
Sub_metering_3           float64
dtypes: float64(7)
memory usage: 126.7+ MB
#printing the shape of the data

data.shape
(2075259, 7)

Here, we can observe that we have 2,075,259 data points and 7 features.

data.head()
                     Global_active_power  Global_reactive_power  Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
16/12/2006 17:24:00                4.216                  0.418   234.84              18.4             0.0             1.0            17.0
16/12/2006 17:25:00                5.360                  0.436   233.63              23.0             0.0             1.0            16.0
16/12/2006 17:26:00                5.374                  0.498   233.29              23.0             0.0             2.0            17.0
16/12/2006 17:27:00                5.388                  0.502   233.74              23.0             0.0             1.0            17.0
16/12/2006 17:28:00                3.666                  0.528   235.68              15.8             0.0             1.0            17.0

Prepare power consumption for each day

We can now save the cleaned-up version of the dataset to a new file; in this case we will just change the file extension to .csv and save the dataset as ‘cleaned_data.csv‘.

#conversion of dataframe to .csv

data.to_csv('cleaned_data.csv')
#reading the dataset

dataset = pd.read_csv('cleaned_data.csv', parse_dates = True, index_col = 'date_time', low_memory = False)
#printing the top rows

dataset.head()
                     Global_active_power  Global_reactive_power  Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
2006-12-16 17:24:00                4.216                  0.418   234.84              18.4             0.0             1.0            17.0
2006-12-16 17:25:00                5.360                  0.436   233.63              23.0             0.0             1.0            16.0
2006-12-16 17:26:00                5.374                  0.498   233.29              23.0             0.0             2.0            17.0
2006-12-16 17:27:00                5.388                  0.502   233.74              23.0             0.0             1.0            17.0
2006-12-16 17:28:00                3.666                  0.528   235.68              15.8             0.0             1.0            17.0
#printing the bottom rows

dataset.tail()
                     Global_active_power  Global_reactive_power  Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
2010-11-26 20:58:00                0.946                    0.0   240.43               4.0             0.0             0.0             0.0
2010-11-26 20:59:00                0.944                    0.0   240.00               4.0             0.0             0.0             0.0
2010-11-26 21:00:00                0.938                    0.0   239.82               3.8             0.0             0.0             0.0
2010-11-26 21:01:00                0.934                    0.0   239.70               3.8             0.0             0.0             0.0
2010-11-26 21:02:00                0.932                    0.0   239.55               3.8             0.0             0.0             0.0

Exploratory Data Analysis

#Downsampling the data into day-wise bins and summing the values of the timestamps falling into each bin

data = dataset.resample('D').sum()
#data after sampling it into daywise manner

data.head()
            Global_active_power  Global_reactive_power    Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
2006-12-16             1209.176                 34.922   93552.53            5180.8             0.0           546.0          4926.0
2006-12-17             3390.460                226.006  345725.32           14398.6          2033.0          4187.0         13341.0
2006-12-18             2203.826                161.792  347373.64            9247.2          1063.0          2621.0         14018.0
2006-12-19             1666.194                150.942  348479.01            7094.0           839.0          7602.0          6197.0
2006-12-20             2225.748                160.998  348923.61            9313.0             0.0          2648.0         14063.0

Plotting all features over time

fig, ax = plt.subplots(figsize=(18,18))

for i in range(len(data.columns)):
    plt.subplot(len(data.columns), 1, i+1)
    name = data.columns[i]
    plt.plot(data[name])
    plt.title(name, y=0, loc = 'right')
    plt.yticks([])
plt.show()
fig.tight_layout()

Exploring Active power consumption for each year

#we have considered 4 years here (2007-2010)

years = ['2007', '2008', '2009', '2010']

Year-wise plot of the Global_active_power feature

fig, ax = plt.subplots(figsize=(18,18))

for i in range(len(years)):
    plt.subplot(len(years), 1, i+1)
    year = years[i]
    active_power_data = data[str(year)]
    plt.plot(active_power_data['Global_active_power'])
    plt.title(str(year), y = 0, loc = 'left')
plt.show()
fig.tight_layout()
#for year 2006

data['2006']
            Global_active_power  Global_reactive_power    Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
2006-12-16             1209.176                 34.922   93552.53            5180.8             0.0           546.0          4926.0
2006-12-17             3390.460                226.006  345725.32           14398.6          2033.0          4187.0         13341.0
2006-12-18             2203.826                161.792  347373.64            9247.2          1063.0          2621.0         14018.0
2006-12-19             1666.194                150.942  348479.01            7094.0           839.0          7602.0          6197.0
2006-12-20             2225.748                160.998  348923.61            9313.0             0.0          2648.0         14063.0
2006-12-21             1723.288                144.434  347096.41            7266.4          1765.0          2692.0         10456.0
2006-12-22             2341.338                186.906  347305.75            9897.0          3151.0           350.0         11131.0
2006-12-23             4773.386                221.470  345795.95           20200.4          2669.0           425.0         14726.0
2006-12-24             2550.012                149.900  348029.91           11002.2          1703.0          5082.0          6891.0
2006-12-25             2743.120                240.280  350495.90           11450.2          6620.0          1962.0          5795.0
2006-12-26             3934.110                165.102  347940.63           16341.0          1086.0          2533.0         14979.0
2006-12-27             1528.760                178.902  351025.00            6505.2             0.0           314.0          6976.0
2006-12-28             2072.638                208.876  350306.40            8764.2          2207.0          4419.0          9176.0
2006-12-29             3174.392                196.394  346854.68           13350.8          1252.0          5162.0         11329.0
2006-12-30             2796.108                312.142  346377.15           11952.6          3072.0          7893.0         12516.0
2006-12-31             3494.196                150.852  345451.07           14687.4             0.0           347.0          6502.0

Power consumption distribution with histograms

Year-wise histogram of the Global_active_power feature

fig, ax = plt.subplots(figsize=(18,18))

for i in range(len(years)):
    plt.subplot(len(years), 1, i+1)
    year = years[i]
    active_power_data = data[str(year)]
    active_power_data['Global_active_power'].hist(bins = 200)
    plt.title(str(year), y = 0, loc = 'left')
plt.show()
fig.tight_layout()

Histogram plots for all features

fig, ax = plt.subplots(figsize=(18,18))

for i in range(len(data.columns)):
    plt.subplot(len(data.columns), 1, i+1)
    name = data.columns[i]
    data[name].hist(bins=200)
    plt.title(name, y=0, loc = 'right')
    plt.yticks([])
plt.show()
fig.tight_layout()

Power consumption histograms for each month of 2007

months = [i for i in range(1,13)]

fig, ax = plt.subplots(figsize=(18,18))

for i in range(len(months)):
    ax = plt.subplot(len(months), 1, i+1)
    month = '2007-' + str(months[i])
    active_power_data = dataset[month]
    active_power_data['Global_active_power'].hist(bins = 100)
    ax.set_xlim(0,5)
    plt.title(month, y = 0, loc = 'right')
plt.show()
fig.tight_layout()

Observation :

1. From the above plots we can see that power consumption in the months of Nov, Dec, Jan, Feb and Mar is higher, as the distributions have longer tails compared to the other months.

2. This suggests that heating systems are used during the winter season and not in summer.

3. The distributions are highly concentrated around 0.3 kW and 1.3 kW.

Active Power Usage Prediction

What can we predict?
  • Forecast hourly consumption for the next day.
  • Forecast daily consumption for the next week.
  • Forecast daily consumption for the next month.
  • Forecast monthly consumption for the next year.
Modeling Methods

There are many modeling methods; a few of them are as follows (a minimal naive baseline is sketched after this list):

  • Naive Methods -> Naive methods make very simple, but often very effective, assumptions, such as repeating the last observed period.
  • Classical Linear Methods -> Classical linear methods include techniques that are very effective for univariate time series forecasting.
  • Machine Learning Methods -> Machine learning methods require that the problem be framed as a supervised learning problem, for example:
    • K-nearest neighbors
    • SVM
    • Decision trees
    • Random forest
    • Gradient boosting machines
  • Deep Learning Methods -> Deep learning methods, including combinations such as CNN-LSTM and ConvLSTM, have proven effective on time series tasks, for example:
    • CNN
    • LSTM
    • CNN-LSTM
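As a point of reference, the naive baseline mentioned above is trivial to implement: repeat the most recent week as the forecast for the following week. A minimal sketch (the function name and the example variable are illustrative, not part of this notebook):

import numpy as np

def naive_weekly_forecast(history):
    #forecast the next 7 days as a copy of the last 7 observed days
    return np.asarray(history)[-7:]

#e.g., given a 1-D array of daily Global_active_power totals:
#next_week = naive_weekly_forecast(daily_totals)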

Problem Framing:

Given recent power consumption, what is the expected power consumption for the week ahead?

This requires that a predictive model forecast the total active power for each day over the next seven days.

A model of this type could be helpful within the household in planning expenditures. It could also be helpful on the supply side for planning electricity demand for a specific household.

Input -> Predict

[Week1] -> Week2

[Week2] -> Week3

[Week3] -> Week4
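In other words, every training sample pairs one week of daily totals with the following week. A small helper along these lines (the function name and arguments are illustrative; the notebook builds the same windows inline below) makes the framing explicit:

import numpy as np

def make_weekly_windows(series, n_in=7, n_out=7):
    #slide over a 1-D daily series: each X row is n_in consecutive days,
    #each y row is the following n_out days
    X, y = [], []
    for i in range(n_in, len(series) - n_out):
        X.append(series[i - n_in:i])
        y.append(series[i:i + n_out])
    return np.array(X), np.array(y)

#e.g., X, y = make_weekly_windows(daily_active_power) for a 1-D array of daily totals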

#top rows

data.head()
            Global_active_power  Global_reactive_power    Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
2006-12-16             1209.176                 34.922   93552.53            5180.8             0.0           546.0          4926.0
2006-12-17             3390.460                226.006  345725.32           14398.6          2033.0          4187.0         13341.0
2006-12-18             2203.826                161.792  347373.64            9247.2          1063.0          2621.0         14018.0
2006-12-19             1666.194                150.942  348479.01            7094.0           839.0          7602.0          6197.0
2006-12-20             2225.748                160.998  348923.61            9313.0             0.0          2648.0         14063.0
#printing last rows

data.tail()
            Global_active_power  Global_reactive_power    Voltage  Global_intensity  Sub_metering_1  Sub_metering_2  Sub_metering_3
date_time
2010-12-07             1109.574                285.912  345914.85            4892.0          1724.0           646.0          6444.0
2010-12-08              529.698                169.098  346744.70            2338.2             0.0           514.0          3982.0
2010-12-09             1612.092                201.358  347932.40            6848.2          1805.0          2080.0          8891.0
2010-12-10             1579.692                170.268  345975.37            6741.2          1104.0           780.0          9812.0
2010-12-11             1836.822                151.144  343926.57            7826.2          2054.0           489.0         10308.0
#here we are splitting the dataset
#data up to the end of 2009 goes into the train dataset and the remainder into the test dataset

data_train = data.loc[:'2009-12-31', :]['Global_active_power']
data_train.head()
date_time
2006-12-16    1209.176
2006-12-17    3390.460
2006-12-18    2203.826
2006-12-19    1666.194
2006-12-20    2225.748
Freq: D, Name: Global_active_power, dtype: float64
data_test = data['2010']['Global_active_power']
data_test.head()
date_time
2010-01-01    1224.252
2010-01-02    1693.778
2010-01-03    1298.728
2010-01-04    1687.440
2010-01-05    1320.158
Freq: D, Name: Global_active_power, dtype: float64
data_train.shape
(1112,)
data_test.shape
(345,)
Observation:
  1. We have 1112 data points in the train dataset and 345 data points in the test dataset.

Prepare training data

#training data

data_train.head(14)
date_time
2006-12-16    1209.176
2006-12-17    3390.460
2006-12-18    2203.826
2006-12-19    1666.194
2006-12-20    2225.748
2006-12-21    1723.288
2006-12-22    2341.338
2006-12-23    4773.386
2006-12-24    2550.012
2006-12-25    2743.120
2006-12-26    3934.110
2006-12-27    1528.760
2006-12-28    2072.638
2006-12-29    3174.392
Freq: D, Name: Global_active_power, dtype: float64
#converting the data into numpy array

data_train = np.array(data_train)
#building sliding windows: each sample is 7 days of input and the following 7 days as output

X_train, y_train = [], []

for i in range(7, len(data_train)-7):
    X_train.append(data_train[i-7:i])
    y_train.append(data_train[i:i+7])
#converting list to numpy array

X_train, y_train = np.array(X_train), np.array(y_train)
#shape of train and test dataset

X_train.shape, y_train.shape
((1098, 7), (1098, 7))
#printing the y_train values

pd.DataFrame(y_train).head()
          0         1         2         3         4         5         6
0  4773.386  2550.012  2743.120  3934.110  1528.760  2072.638  3174.392
1  2550.012  2743.120  3934.110  1528.760  2072.638  3174.392  2796.108
2  2743.120  3934.110  1528.760  2072.638  3174.392  2796.108  3494.196
3  3934.110  1528.760  2072.638  3174.392  2796.108  3494.196  2749.004
4  1528.760  2072.638  3174.392  2796.108  3494.196  2749.004  1824.760
#Normalising the dataset between 0 and 1

x_scaler = MinMaxScaler()
X_train = x_scaler.fit_transform(X_train)
#Normalising the target values between 0 and 1

y_scaler = MinMaxScaler()
y_train = y_scaler.fit_transform(y_train)
pd.DataFrame(X_train).head()
          0         1         2         3         4         5         6
0  0.211996  0.694252  0.431901  0.313037  0.436748  0.325660  0.462304
1  0.694252  0.431901  0.313037  0.436748  0.325660  0.462304  1.000000
2  0.431901  0.313037  0.436748  0.325660  0.462304  1.000000  0.508439
3  0.313037  0.436748  0.325660  0.462304  1.000000  0.508439  0.551133
4  0.436748  0.325660  0.462304  1.000000  0.508439  0.551133  0.814446
#converting to 3 dimension

X_train = X_train.reshape(1098, 7, 1)
X_train.shape
(1098, 7, 1)
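Keras LSTM layers expect input of shape (samples, timesteps, features). An equivalent reshape that avoids hard-coding the number of samples (a small sketch, same result as above):

#same reshape without hard-coding the sample count
X_train = X_train.reshape(X_train.shape[0], 7, 1)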

Build LSTM Model

#building sequential model using Keras

reg = Sequential()
reg.add(LSTM(units = 200, activation = 'relu', input_shape=(7,1)))
reg.add(Dense(7))
#using mean squared error as the loss and Adam as the optimizer

reg.compile(loss='mse', optimizer='adam')
#training the model

reg.fit(X_train, y_train, epochs = 100)
Train on 1098 samples
Epoch 1/100
1098/1098 [==============================] - 2s 2ms/sample - loss: 0.0626
Epoch 2/100
1098/1098 [==============================] - 0s 296us/sample - 
.
.
.
.
.
Epoch 99/100
1098/1098 [==============================] - 0s 270us/sample - loss: 0.0228
Epoch 100/100
1098/1098 [==============================] - 0s 269us/sample - loss: 0.0228
<tensorflow.python.keras.callbacks.History at 0x19ba56fc668>
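If you want to inspect how the loss evolves, the History object returned by fit() can be plotted. A minimal sketch, assuming the fit call above is assigned to a variable (it reuses the matplotlib import from earlier):

history = reg.fit(X_train, y_train, epochs = 100)

plt.plot(history.history['loss'])
plt.xlabel('epoch')
plt.ylabel('training loss (MSE)')
plt.show()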

Observation:

  1. Training is complete and the final training loss is about 0.0228.

Prepare test dataset and test LSTM model

#testing dataset

data_test = np.array(data_test)
#building sliding windows over the test data: 7 days of input and the following 7 days as output

X_test, y_test = [], []

for i in range(7, len(data_test)-7):
    X_test.append(data_test[i-7:i])
    y_test.append(data_test[i:i+7])
X_test, y_test = np.array(X_test), np.array(y_test)
X_test = x_scaler.transform(X_test)
y_test = y_scaler.transform(y_test)
#converting to 3 dimension

X_test = X_test.reshape(331,7,1)
X_test.shape
(331, 7, 1)
y_pred = reg.predict(X_test)
#bringing y_pred values back to their original scale using the inverse transform

y_pred = y_scaler.inverse_transform(y_pred)
y_pred
array([[1508.9413 , 1476.1537 , 1487.5676 , ..., 1484.8464 , 1459.3864 ,
        1551.5675 ],
       [1158.2788 , 1287.0326 , 1346.428  , ..., 1430.5685 , 1420.6346 ,
        1472.5759 ],
       [1571.7665 , 1507.0337 , 1516.5574 , ..., 1432.5813 , 1393.9161 ,
        1504.1714 ],
       ...,
       [ 952.85785,  852.4236 ,  933.62585, ...,  800.12006,  831.2844 ,
        1005.20844],
       [1579.4896 , 1353.6078 , 1278.9501 , ...,  981.4198 ,  967.6466 ,
        1146.7898 ],
       [1629.0509 , 1392.7751 , 1288.7218 , ..., 1052.977  , 1070.8586 ,
        1243.1346 ]], dtype=float32)
y_true = y_scaler.inverse_transform(y_test)
y_true
array([[ 555.664, 1593.318, 1504.82 , ...,    0.   , 1995.796, 2116.224],
       [1593.318, 1504.82 , 1383.18 , ..., 1995.796, 2116.224, 2196.76 ],
       [1504.82 , 1383.18 ,    0.   , ..., 2116.224, 2196.76 , 2150.112],
       ...,
       [1892.998, 1645.424, 1439.426, ..., 1973.382, 1109.574,  529.698],
       [1645.424, 1439.426, 2035.418, ..., 1109.574,  529.698, 1612.092],
       [1439.426, 2035.418, 1973.382, ...,  529.698, 1612.092, 1579.692]])

Evaluate the model

Here, we use mean squared error (reported as its square root, RMSE) as the metric since this is a regression problem.

def evaluate_model(y_true, y_predicted):
    scores = []
    
    #calculate scores for each day
    for i in range(y_true.shape[1]):
        mse = mean_squared_error(y_true[:, i], y_predicted[:, i])
        rmse = np.sqrt(mse)
        scores.append(rmse)
    
    #calculate score for whole prediction
    total_score = 0
    for row in range(y_true.shape[0]):
        for col in range(y_predicted.shape[1]):
            total_score = total_score + (y_true[row, col] - y_predicted[row, col])**2
    total_score = np.sqrt(total_score/(y_true.shape[0]*y_predicted.shape[1]))
    
    return total_score, scores
evaluate_model(y_true, y_pred)
(579.2827596682928,
 [598.0411885086157,
  592.5770673397814,
  576.1153945912635,
  563.9396525162248,
  576.5479538079353,
  570.7699415990154,
  576.2430188855649])
#standard deviation of the first target week (a rough baseline for the RMSE)

np.std(y_true[0])
710.0253857243853

Conclusions:

  1. From the above experiment, we get an overall root mean squared error of about 579, with per-day RMSEs between roughly 564 and 598 (in the units of the daily totals).
  2. To check whether our model is performing well, we compare against the standard deviation of the observed values, which is about 710.
  3. Since the root mean squared error is lower than the standard deviation, we can say that our model is performing reasonably well.