# Airline Passenger Prediction using RNN – LSTM

## Prediction of number of passengers for an airline using LSTM

In this project we are going to build a model that predicts the number of passengers of an airline. To do so we are going to use a `Recurrent Neural Network` (RNN), more precisely a `Long Short-Term Memory` (LSTM) network.

### Recurrent Neural Network

• Neural networks are a family of algorithms, loosely inspired by the human brain, that are designed to recognize patterns.
• A recurrent neural network (RNN) is a generalization of a feedforward neural network that has an internal memory.
• An RNN is recurrent because it applies the same function to every element of the input sequence, while the output for the current element depends on the previous computation.
• After producing an output, the network copies it and feeds it back into itself. To make a decision, it considers both the current input and the output it produced from the previous input.
• In a feedforward network, the inputs are treated as independent of each other. In an RNN, each input is processed in the context of the ones that came before it.
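The recurrence described above can be sketched in a few lines of plain Python. This is a minimal single-unit illustration, not the model built later in this article: the names `rnn_step` and `run_rnn` and the weight values are made up for the example, and `tanh` is assumed as the activation.

```python
import math

def rnn_step(x_t, h_prev, w_x, w_h, b):
    """One step of a scalar RNN: mix the current input with the previous state."""
    return math.tanh(w_x * x_t + w_h * h_prev + b)

def run_rnn(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Apply the same step function to every element, carrying the state forward."""
    h = 0.0  # initial state: nothing remembered yet
    states = []
    for x_t in inputs:
        h = rnn_step(x_t, h, w_x, w_h, b)
        states.append(h)
    return states

# Both sequences end in the same two inputs (0.0, 0.0), but the first one
# started with a 1.0, and that still shows up in the final state.
states_a = run_rnn([1.0, 0.0, 0.0])
states_b = run_rnn([0.0, 0.0, 0.0])
print(states_a[-1] > states_b[-1])  # True
```

The final states differ even though the last inputs are identical, which is the "internal memory" the bullets refer to.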

### Long Short Term Memory

• Long Short-Term Memory (LSTM) networks are a modified version of recurrent neural networks, designed to make it easier to retain information from earlier in a sequence.
• Generally LSTM is composed of a cell (the memory part of the LSTM unit) and three “regulators”, usually called gates, of the flow of information inside the LSTM unit: an input gate, an output gate and a forget gate.
• Intuitively, the cell is responsible for keeping track of the dependencies between the elements in the input sequence.
• The input gate controls the extent to which a new value flows into the cell, the forget gate controls the extent to which a value remains in the cell and the output gate controls the extent to which the value in the cell is used to compute the output activation of the LSTM unit.
• The activation function of the LSTM gates is often the logistic sigmoid function.
• There are connections into and out of the LSTM gates, a few of which are recurrent. The weights of these connections, which need to be learned during training, determine how the gates operate.
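To make the role of the three gates concrete, here is a hedged sketch of one step of a single-unit LSTM in plain Python. It follows the standard LSTM equations (sigmoid gates, `tanh` candidate), but the function name `lstm_step`, the parameter dictionary layout, and the weight values are assumptions made for this illustration; Keras stores and computes its weights differently.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a single-unit LSTM.

    `params` maps a gate name to a tuple (w_x, w_h, b): input weight,
    recurrent weight and bias. These are the values learned in training.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x_t + w_h * h_prev + b

    i = sigmoid(pre_activation("input"))         # how much new information enters the cell
    f = sigmoid(pre_activation("forget"))        # how much of the old cell value is kept
    o = sigmoid(pre_activation("output"))        # how much of the cell is exposed as output
    g = math.tanh(pre_activation("candidate"))   # the new candidate value

    c_t = f * c_prev + i * g       # updated cell (memory)
    h_t = o * math.tanh(c_t)       # output activation of the unit
    return h_t, c_t

# Hypothetical weights: forget gate wide open, input gate nearly shut,
# so the cell should simply hold on to its previous value.
params = {
    "input": (0.0, 0.0, -100.0),
    "forget": (0.0, 0.0, 100.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h_t, c_t = lstm_step(x_t=5.0, h_prev=0.2, c_prev=0.7, params=params)
print(round(c_t, 6))  # 0.7
```

With the forget gate saturated near 1 and the input gate near 0, the cell value passes through unchanged, which is how an LSTM can carry information across many time steps.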

### Dataset

This dataset provides monthly totals of passengers for a US airline from 1949 to 1960. The dataset has 2 columns, `month` and `passengers` (the latter is named `#Passengers` in the CSV file). `month` contains the month of the year and `passengers` contains the total number of passengers who travelled in that month.

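We are going to use `tensorflow` to build the LSTM. You can install `tensorflow` by running the first command below. If your machine has a GPU you can use the second command instead, although recent TensorFlow releases include GPU support in the main `tensorflow` package and the separate `tensorflow-gpu` package is deprecated.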

!pip install tensorflow

!pip install tensorflow-gpu

The necessary Python libraries are imported here:

• `numpy` is used to perform basic array operations.
• `pyplot` from `matplotlib` is used to visualize the results.
• `pandas` is used for loading and manipulating the data.
• `tensorflow` is used to build the neural network.
• The `Sequential` model class and the layers required to build the model (`Dense` and `LSTM`) are imported from `tensorflow.keras`.
• `MinMaxScaler` from `sklearn` is used to scale the data.
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
from sklearn.preprocessing import MinMaxScaler
```

Now we will read the dataset using `pd.read_csv()`. We will retain only the `#Passengers` column, convert it to a `numpy array`, and `reshape` it into a single column.

```python
dataset = pd.read_csv('AirPassengers.csv')
dataset = dataset['#Passengers']
dataset = np.array(dataset).reshape(-1,1)
dataset[:10]
```
```
array([[112],
       [118],
       [132],
       [129],
       [121],
       [135],
       [148],
       [148],
       [136],
       [119]], dtype=int64)
```

Now we will plot the dataset. We can observe that the number of passengers rises steadily over the years, with a seasonal pattern that repeats every year.

```python
plt.plot(dataset)
```

Neural networks generally train more reliably when their inputs lie on a small, consistent scale such as 0 to 1. So we are going to scale down the inputs using `MinMaxScaler()`. We can see that after scaling the minimum value is 0 and the maximum value is 1.

```python
scaler = MinMaxScaler()
dataset = scaler.fit_transform(dataset)
dataset.min(),dataset.max()
```
`(0.0, 1.0)`
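For intuition, min-max scaling with the default range maps each value `x` to `(x - min) / (max - min)`. The sketch below reproduces that formula in plain Python; `min_max_scale` is a hypothetical helper written for this illustration and the sample values are made up, so it stands in for, rather than replaces, `MinMaxScaler`.

```python
def min_max_scale(values):
    """Map values linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Illustrative passenger counts (not the full dataset)
sample = [112, 118, 132, 104, 622]
scaled = min_max_scale(sample)

print(min(scaled), max(scaled))  # 0.0 1.0
print(round(scaled[0], 8))       # 0.01544402
```

Because the mapping is linear, the shape of the series is preserved; only the units change.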

We are going to use the data of the first `100` months as `training data` and the last `44` months as `testing data`.

```python
train_size = 100
test_size = 44
train = dataset[0:train_size, :]
train.shape
```
`(100, 1)`
```python
test = dataset[train_size:144, :]
test.shape
```
`(44, 1)`

### Create training and testing dataset

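We are going to predict the i-th value in the dataset on the basis of the (i-1)-th value. That means we look back by 1 to predict the next value. Hence we create a function `get_data()` that builds `dataX` (the input windows) and `dataY` (the value that follows each window) for the `training` as well as the `testing` data. Because the loop stops at `len(dataset) - look_back - 1`, the last available window is left out, so the 100 training rows yield 98 samples.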

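```python
def get_data(dataset, look_back):
    dataX, dataY = [], []
    # The "- 1" in the bound leaves the last possible window unused
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), 0]        # input window of length look_back
        dataX.append(a)
        dataY.append(dataset[i + look_back, 0])  # value right after the window
    return np.array(dataX), np.array(dataY)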

look_back = 1
X_train, y_train = get_data(train, look_back)
X_train[:10]
```
```
array([[0.01544402],
       [0.02702703],
       [0.05405405],
       [0.04826255],
       [0.03281853],
       [0.05984556],
       [0.08494208],
       [0.08494208],
       [0.06177606],
       [0.02895753]])
```
```python
y_train[:10]
```
```
array([0.02702703, 0.05405405, 0.04826255, 0.03281853, 0.05984556,
       0.08494208, 0.08494208, 0.06177606, 0.02895753, 0.        ])
```
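With `look_back = 1` each input is a single value, which hides what the windowing does. The sketch below shows the same idea with a longer window on a tiny made-up series. `make_windows` is a hypothetical plain-Python helper, and unlike `get_data()` it keeps every available window rather than stopping one short.

```python
def make_windows(series, look_back):
    """Pair each run of `look_back` values with the value that follows it."""
    inputs, targets = [], []
    for i in range(len(series) - look_back):
        inputs.append(series[i:i + look_back])  # the window
        targets.append(series[i + look_back])   # the next value
    return inputs, targets

series = [10, 20, 30, 40, 50]
X, y = make_windows(series, look_back=3)
print(X)  # [[10, 20, 30], [20, 30, 40]]
print(y)  # [40, 50]
```

Each row of `X` is one training example and the matching entry of `y` is the value the model should predict from it.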

Now we call `get_data()` again to create the `testing data`.

```python
X_test, y_test = get_data(test, look_back)
```

Now we are going to reshape our data into the 3-dimensional form `(samples, time steps, features)` that the LSTM layer expects, using `reshape()`.

```python
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1], 1)
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1], 1)
```
```python
X_train.shape
```
`(98, 1, 1)`

### Build the model

Our sequential model has 2 layers:

#### LSTM layer:

This is the main layer of the model and has `5 units`. It learns long-term dependencies between time steps in time series and sequence data. `input_shape` contains the shape of input which we have to pass as a parameter to the first layer of our neural network.

#### Dense layer:

The dense layer is the regular fully connected neural network layer. It is the most common and frequently used layer. We set the number of `units` to 1 because we want a single value as the output.

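```python
model = Sequential()
model.add(LSTM(5, input_shape=(look_back, 1)))  # 5 LSTM units; input is (time steps, features)
model.add(Dense(1))                             # single output value
model.compile(loss = 'mean_squared_error', optimizer = 'adam')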
```

We have compiled the model. We can see the summary using `model.summary()`.

```python
model.summary()
```
```
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm (LSTM)                  (None, 5)                 140
_________________________________________________________________
dense (Dense)                (None, 1)                 6
=================================================================
Total params: 146
Trainable params: 146
Non-trainable params: 0
_________________________________________________________________
```
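The parameter counts in the summary can be checked by hand. An LSTM layer has four sets of weights (input gate, forget gate, output gate and cell candidate), each with input weights, recurrent weights and one bias per unit. The helper names below are hypothetical and the formula assumes the standard LSTM formulation with biases enabled.

```python
def lstm_param_count(units, input_dim):
    """4 weight sets, each: input weights + recurrent weights + biases."""
    return 4 * (units * input_dim + units * units + units)

def dense_param_count(inputs, units):
    """One weight per input-unit pair plus one bias per unit."""
    return inputs * units + units

print(lstm_param_count(units=5, input_dim=1))  # 140
print(dense_param_count(inputs=5, units=1))    # 6
```

The two results add up to the 146 total parameters reported by `model.summary()`.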

After compiling the model we will now train it on the training dataset using `model.fit()`. We will train for 50 `epochs`. An epoch is one full pass over the training data (`X_train` and `y_train`). `batch_size` is the number of samples per gradient update; with `batch_size=1` the weights are updated after every training example.

```
model.fit(X_train, y_train, epochs=50, batch_size=1)

Epoch 45/50
98/98 [==============================] - 0s 2ms/sample - loss: 0.0022
Epoch 46/50
98/98 [==============================] - 0s 2ms/sample - loss: 0.0021
Epoch 47/50
98/98 [==============================] - 0s 2ms/sample - loss: 0.0021
Epoch 48/50
98/98 [==============================] - 0s 2ms/sample - loss: 0.0021
Epoch 49/50
98/98 [==============================] - 0s 2ms/sample - loss: 0.0022
Epoch 50/50
98/98 [==============================] - 0s 2ms/sample - loss: 0.0021
```
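The `98/98` in the log follows from the batch size: with 98 training samples and `batch_size=1`, each epoch performs 98 gradient updates. The small sketch below shows how that count changes with the batch size. `updates_per_epoch` is a hypothetical helper, and it assumes the last partial batch is still used, which is the default behaviour of `model.fit()`.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of gradient updates in one pass over the data."""
    return math.ceil(num_samples / batch_size)

print(updates_per_epoch(98, 1))       # 98
print(updates_per_epoch(98, 32))      # 4
print(50 * updates_per_epoch(98, 1))  # 4900 updates over 50 epochs
```

A larger batch size means fewer but smoother updates per epoch; on a dataset this small, `batch_size=1` is affordable.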

Now we will test our model using `X_test`.

```python
y_pred = model.predict(X_test)
```

This is the scale factor that the `scaler` learned earlier. It equals `1 / (max - min)` of the original data.

```python
scaler.scale_
```
`array([0.0019305])`

We had scaled down the values in our dataset before passing it to the neural network. Now we will have to get the original values back. For this we will use `scaler.inverse_transform()`.

```python
y_pred = scaler.inverse_transform(y_pred)
y_test = np.array(y_test)
y_test = y_test.reshape(-1, 1)
y_test = scaler.inverse_transform(y_test)
```
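What `inverse_transform()` does can be sketched by hand. For the default `(0, 1)` range, `MinMaxScaler` stores a scale `1 / (max - min)` and an offset `-min * scale` (exposed as `scale_` and `min_`), scales with `x * scale + offset`, and inverts with `(s - offset) / scale`. The helper names and sample values below are made up for illustration.

```python
def fit_min_max(values):
    """Return (scale, offset) so that scaled = value * scale + offset."""
    lo, hi = min(values), max(values)
    scale = 1.0 / (hi - lo)
    offset = -lo * scale
    return scale, offset

def transform(values, scale, offset):
    return [v * scale + offset for v in values]

def inverse_transform(scaled, scale, offset):
    return [(s - offset) / scale for s in scaled]

sample = [104, 112, 622]  # illustrative values
scale, offset = fit_min_max(sample)
print(round(scale, 7))    # 0.0019305

restored = inverse_transform(transform(sample, scale, offset), scale, offset)
print([round(v) for v in restored])  # [104, 112, 622]
```

The round trip returns the original passenger counts, which is why the predictions can be read in real units after the inverse transform.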

Now we will visualize the result by plotting the real values and the predicted values.

```python
# plot real values and predictions
plt.figure(figsize=(14,5))
plt.plot(y_test, label = 'Real number of passengers')
plt.plot(y_pred, label = 'Predicted number of passengers')
plt.ylabel('# passengers')
plt.legend()
plt.show()
```

As we can see, the actual and predicted values follow the same trend. The plot suggests that the model tracks the number of passengers closely, although a visual match alone does not quantify accuracy; an error metric such as RMSE computed on the test set would.
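One way to put a number on the fit is the root mean squared error (RMSE), compared against a naive baseline that simply repeats the previous month's value. With `look_back = 1` this comparison is useful, because a model can look good on a plot while doing little more than echoing its input. The sketch below uses made-up numbers, not results from this model, and the helper names are hypothetical.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical passenger counts, for illustration only
series = [100, 110, 120, 115, 130]
targets = series[1:]              # values to predict
baseline = series[:-1]            # naive forecast: repeat the previous value
model_pred = [108, 118, 118, 126] # hypothetical model output

baseline_rmse = rmse(targets, baseline)
model_rmse = rmse(targets, model_pred)
print(model_rmse < baseline_rmse)  # True for these made-up numbers
```

In the article's setting, `targets` would be the inverse-transformed `y_test`, `model_pred` the inverse-transformed `y_pred`, and the baseline the inverse-transformed `X_test` values.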
