Regression models based on recurrent neural networks (RNNs) can recognize patterns in time series data, making them an exciting technology for stock market forecasting. What distinguishes these RNNs from traditional neural networks is their architecture: it consists of multiple long short-term memory (LSTM) layers. These LSTM layers allow the model to learn patterns in a time series that occur over different periods and are often difficult for human analysts to detect. We can train such models with one feature (univariate forecasting models) or multiple features (multivariate models). Multivariate models can take more data into account, and if we provide them with relevant features, they can make better predictions. This tutorial uses Python and Keras to implement a multivariate RNN for stock price prediction. We define the architecture of our regression model and then train this model to predict the NASDAQ index.

The remainder of this tutorial proceeds in two parts: We start with a brief intro in which we compare modeling univariate and multivariate time series data. Then we turn to the hands-on part, in which we prepare the multivariate time series data and use it to train a neural network in Python. The model is a recurrent neural network with LSTM layers that forecasts the NASDAQ stock market index. Finally, we evaluate the performance of our model and make a forecast for the next day.

## Univariate vs. Multivariate Time Series Models

Multivariate models and univariate models differ in the number of their input features. While univariate models consider only a single feature, multivariate models use several input variables (features). In stock market forecasting, we can create additional features from the price history. Examples are technical indicators such as moving averages, the relative strength index (RSI), or the trading volume. We can also include features from other sources, for example, social media sentiment or weather forecasts. Multivariate models that have additional information available have a chance to outperform univariate models. However, this is only true if the features are relevant and indicative of future price movements.
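To make the idea of deriving features from the price history more concrete, here is a minimal sketch that computes a moving average and an RSI with pandas. It is not part of the tutorial code: the price series is synthetic, the function name `add_indicators` and the window lengths are assumptions, and the RSI uses plain rolling means rather than Wilder's smoothing.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (a random walk) as a stand-in for real quotes
rng = np.random.default_rng(42)
close = pd.Series(100 + rng.normal(0, 1, 250).cumsum(), name="Close")

def add_indicators(close: pd.Series, ma_window: int = 20, rsi_window: int = 14) -> pd.DataFrame:
    """Return the close price together with a moving average and a simple RSI."""
    features = pd.DataFrame({"Close": close})

    # Simple moving average over the last ma_window observations
    features[f"MA{ma_window}"] = close.rolling(ma_window).mean()

    # RSI variant based on rolling means of gains and losses
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(rsi_window).mean()
    avg_loss = (-delta.clip(upper=0)).rolling(rsi_window).mean()
    rs = avg_gain / avg_loss
    features[f"RSI{rsi_window}"] = 100 - 100 / (1 + rs)

    # The first rows have no complete window, so we drop them
    return features.dropna()

features = add_indicators(close)
print(features.tail())
```

Each derived column could then be added to the list of input features, provided it actually carries information about future price movements.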

Preparing data for training univariate models is easier than for multivariate models. If you are new to time series prediction, you might want to look at my earlier articles. These explain how to develop and evaluate univariate time series models:

- Stock Market Forecasting using Univariate Models and Python
- Multi-step Time Series Forecasting with Python: Step-by-Step Guide
- Stock Market Prediction – Adjusting Time Series Prediction Intervals
- Evaluating Time Series Forecasting Models with Python

#### Univariate Prediction Models

The standard approach in time series regression is to train a model on past values of the time series that the model seeks to predict. The idea is that the value of a time series at time t is closely related to the values at the previous time steps t-1, t-2, t-3, etc. This approach is similar to chart analysis, which aims to identify recurring formations in a price chart that indicate future movements. In both cases, the prediction performance depends on the capacity to identify recurring price formations and draw the correct conclusions.
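The relationship between the value at time t and its predecessors can be illustrated with lagged columns. The following sketch uses a tiny made-up price series; the helper name `make_lag_frame` is an assumption for illustration only.

```python
import pandas as pd

def make_lag_frame(series: pd.Series, n_lags: int = 3) -> pd.DataFrame:
    """Build a table whose rows pair the value at time t with its n_lags predecessors."""
    frame = pd.DataFrame({"t": series})
    for lag in range(1, n_lags + 1):
        # shift(lag) moves the series down, so each row sees the value from lag steps earlier
        frame[f"t-{lag}"] = series.shift(lag)
    # Rows without a complete history contain NaN values and are removed
    return frame.dropna()

prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
lagged = make_lag_frame(prices, n_lags=3)
print(lagged)
```

A univariate model learns to map the lag columns to the column `t`. The sliding window used later in this tutorial follows the same principle, only with longer histories and several features per time step.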

#### Multivariate Prediction Models

Forecasting the price of a financial asset is a complex task. A practically endless number of variables can influence the price: economic cycles, political developments, unforeseen events, psychological factors, market sentiment, and even the weather all exert some influence. In addition, many of these variables are interdependent, which makes statistical modeling even more complex. Multivariate models cannot fully cover the complexity of the market. However, they offer a more detailed abstraction of reality than univariate models.

A univariate forecast model reduces this complexity to a single variable. The other dimensions are left out. A multivariate model can take several factors into account, but it is still a simplification. For example, a multivariate stock market prediction model can consider the relationship between the closing price and the opening price, moving averages, daily highs, the price of other stocks, and so on.

And even with great features, forecasting remains difficult because patterns and market rules are subject to frequent change. Models thus inevitably make mistakes. Nevertheless, to quote George Box, “All models are wrong, but some are useful.”

## Implementing a Multivariate Time Series Prediction Model in Python

Now that we know the basics of multivariate time series forecasting, it’s time to put our knowledge into practice. In the following, we will use Python and TensorFlow to develop a multivariate recurrent neural network for time series prediction. The model will forecast the NASDAQ stock market index.

The development process covers six essential steps:

- Creating Features and Scaling the Data
- Splitting the Data into Train and Test
- Slicing the Data using a sliding window approach
- Training the model
- Model Validation
- Making Predictions and Unscaling them

The code is available on the GitHub repository.

### Prerequisites

Before starting the coding part, make sure that you have set up your Python 3 environment and required packages. If you don’t have a Python environment, you can follow the steps in this tutorial to set up the Anaconda environment.

Also, make sure you install all required packages. In this tutorial, we will be working with the standard packages NumPy, pandas, Matplotlib, and seaborn.

In addition, we will be using *Keras* (2.0 or higher) with the *TensorFlow* backend, the machine learning library scikit-learn, and pandas-datareader or yfinance for downloading the price data.

You can install packages using console commands:

- *pip install <package name>*
- *conda install <package name>* (if you are using the Anaconda package manager)

### Step #1 Load the Time Series Data

Let’s start by loading price data on the NASDAQ composite index **(symbol: ^IXIC)** from Yahoo Finance (finance.yahoo.com) into our Python project. To download the data, we can use Pandas DataReader, a popular Python library that provides functions to extract data from various sources on the web. The code below uses the “yfinance” library instead and keeps the DataReader call as a commented-out alternative.

We provide the technical symbol for the NASDAQ index, “^IXIC.” Alternatively, you could use other asset symbols, for example, BTC-USD, to get price quotes for Bitcoin. In addition, we limit the data in the API request to the timeframe between 2010-01-01 and the current date.

Running the code below will load the data into a new DataFrame object. Be aware that input data and predictions will vary depending on when you execute the code.

```python
# Time Series Forecasting - Multivariate Time Series Models for Stock Market Prediction

import math # Mathematical functions
import numpy as np # Fundamental package for scientific computing with Python
import pandas as pd # Additional functions for analysing and manipulating data
from datetime import date, timedelta, datetime # Date functions
from pandas.plotting import register_matplotlib_converters # This function adds plotting functions for calendar dates
import matplotlib.pyplot as plt # Important package for visualization - we use this to plot the market data
import matplotlib.dates as mdates # Formatting dates
import tensorflow as tf
from sklearn.metrics import mean_absolute_error, mean_squared_error # Packages for measuring model performance / errors
from tensorflow.keras import Sequential # Deep learning library, used for neural networks
from tensorflow.keras.layers import LSTM, Dense, Dropout # Deep learning classes for recurrent and regular densely-connected layers
from tensorflow.keras.callbacks import EarlyStopping # EarlyStopping during model training
from sklearn.preprocessing import RobustScaler, MinMaxScaler # Scalers for normalizing the price data (we use MinMaxScaler below)
import seaborn as sns # Visualization
sns.set_style('white', { 'axes.spines.right': False, 'axes.spines.top': False})

# Check the tensorflow version and the number of available GPUs
print('Tensorflow Version: ' + tf.__version__)
physical_devices = tf.config.list_physical_devices('GPU')
print("Num GPUs:", len(physical_devices))

# Setting the timeframe for the data extraction
end_date = date.today().strftime("%Y-%m-%d")
start_date = '2010-01-01'

# Getting NASDAQ quotes
stockname = 'NASDAQ'
symbol = '^IXIC'

# You can either use webreader or yfinance to load the data from yahoo finance
# import pandas_datareader as webreader
# df = webreader.DataReader(symbol, start=start_date, end=end_date, data_source="yahoo")

import yfinance as yf # Alternative package if webreader does not work: pip install yfinance
df = yf.download(symbol, start=start_date, end=end_date)

# Create a quick overview of the dataset
df.head()
```

The data looks as expected and has the following columns:

- High – the daily high
- Low – the daily low
- Open – the opening price
- Close – the closing price
- Volume – the daily trading volume
- Adj Close – the adjusted closing price

### Step #2 Explore the Data

Let’s first familiarize ourselves with the data before processing it further. Line plots are an excellent choice to gain a quick overview of time series data. By running the code below, we create line plots for all columns in our DataFrame.

```python
# Plot line charts
df_plot = df.copy()

ncols = 2
nrows = int(round(df_plot.shape[1] / ncols, 0))

fig, ax = plt.subplots(nrows=nrows, ncols=ncols, sharex=True, figsize=(14, 7))
for i, ax in enumerate(fig.axes):
    sns.lineplot(data = df_plot.iloc[:, i], ax=ax)
    ax.tick_params(axis="x", rotation=30, labelsize=10, length=0)
    ax.xaxis.set_major_locator(mdates.AutoDateLocator())
fig.tight_layout()
plt.show()
```

The line plots look as expected. We continue with preprocessing and feature engineering.

### Step #3 Feature Selection and Scaling

Before we can train the neural network, we need to transform the data into a processable shape. In this section, we perform the following tasks:

- Selecting features
- Scaling the data to a standard value range

#### 3.1 Selecting Features

First, we will select the features upon which we want to train our neural network. The selection and engineering of relevant feature variables is a complex topic. We could also create additional features such as moving averages, but I want to keep things simple. Therefore, we select features that are already present in our data. To learn more about feature engineering for stock market prediction, check out the relataly feature engineering tutorial.

Running the code below selects the features. We add a dummy column called “Prediction” to our data, which will help us later when we need to reverse the scaling of our data.

```python
# Indexing Batches
train_df = df.sort_values(by=['Date']).copy()

# List of considered Features
FEATURES = ['High', 'Low', 'Open', 'Close', 'Volume'
            #, 'Month', 'Year', 'Adj Close'
           ]

print('FEATURE LIST')
print([f for f in FEATURES])

# Create the dataset with features and filter the data to the list of FEATURES
data = pd.DataFrame(train_df)
data_filtered = data[FEATURES]

# We add a prediction column and set dummy values to prepare the data for scaling
data_filtered_ext = data_filtered.copy()
data_filtered_ext['Prediction'] = data_filtered_ext['Close']

# Print the tail of the dataframe
data_filtered_ext.tail()
```

#### 3.2 Scaling the Multivariate Input Data

Another necessary step in data preparation for neural networks is scaling the input data. Scaling typically speeds up training and improves model accuracy. The scikit-learn package offers different scaling approaches. We use the MinMaxScaler to scale the input data to a range between 0 and 1.

A model that is trained on scaled data will also produce scaled predictions. Therefore, when we make predictions later with our model, we must not forget to scale the predictions back. The main scaler (scaler in the code below) adapts to the shape of the input data, with one dimension per feature column. However, our predictions will be one-dimensional. Because the scaler expects the same number of columns it was fitted on, we cannot simply reuse it for unscaling our model predictions. To unscale the predictions later, we create an additional scaler that works on a single feature column (scaler_pred).

```python
# Get the number of rows in the data
nrows = data_filtered.shape[0]

# Convert the data to numpy values
np_data_unscaled = np.array(data_filtered)
np_data = np.reshape(np_data_unscaled, (nrows, -1))
print(np_data.shape)

# Transform the data by scaling each feature to a range between 0 and 1
scaler = MinMaxScaler()
np_data_scaled = scaler.fit_transform(np_data_unscaled)

# Creating a separate scaler that works on a single column for scaling predictions
scaler_pred = MinMaxScaler()
df_Close = pd.DataFrame(data_filtered_ext['Close'])
np_Close_scaled = scaler_pred.fit_transform(df_Close)
```

Out: (2619, 6)
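To see why a single-column scaler is sufficient for unscaling, it helps to write the min-max transformation out by hand. With its default range of 0 to 1, the MinMaxScaler applies x' = (x - min) / (max - min) to each column separately. The sketch below reproduces this with NumPy on made-up numbers; the two toy columns and the variable names are assumptions and not part of the tutorial code.

```python
import numpy as np

# Toy data: 4 rows, 2 feature columns (think of them as "Close" and "Volume")
data = np.array([[100.0, 1000.0],
                 [110.0, 3000.0],
                 [120.0, 2000.0],
                 [140.0, 5000.0]])
close_idx = 0  # position of the target column in the toy data

# Min-max scaling works column by column
col_min = data.min(axis=0)
col_max = data.max(axis=0)
scaled = (data - col_min) / (col_max - col_min)

# A model trained on scaled data returns scaled values for the target column only
scaled_pred = np.array([0.25, 0.5])

# Reversing the scaling needs only the minimum and maximum of the target column
close_range = col_max[close_idx] - col_min[close_idx]
unscaled_pred = scaled_pred * close_range + col_min[close_idx]
print(unscaled_pred)  # [110. 120.]
```

A scaler fitted on the Close column alone learns exactly this minimum and maximum, which is why it can unscale the one-dimensional predictions.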

### Step #4 Transforming the Data

Next, we bring the data into the three-dimensional structure on which we train our multivariate regression model. The first dimension is the samples (input sequences), the second dimension is the time steps within each sequence, and the third dimension is the features. The illustration below shows the steps to bring the multivariate data into a shape our neural model can process during training. We must keep this form and perform the same steps when using the model to create a forecast.

An essential step in the preparation process is slicing the data into multiple input sequences with associated target values. We write a simple Python script that uses a “sliding window.” This approach moves a window through the time series data, adding a sequence of multiple data points to the input data with each step. The target value (e.g., the closing price) is the value that directly follows this sequence, and we store it in a separate target dataset. Then we push the window one step further and repeat these activities. This process results in a dataset with many input sequences, each with a corresponding target value in the target dataset. This process applies both to the training and the test data.

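We will apply the sliding window approach to our data. In the run shown here, the result is a training set (x_train) containing 2258 input sequences, each with 50 time steps and one value per feature at every step. The related target dataset (y_train) has 2258 target values. The exact numbers depend on the download date and on the entries in the FEATURES list.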

```python
# Set the sequence length - this is the timeframe used to make a single prediction
sequence_length = 50

# Index of the Close column within the feature columns (the prediction target)
index_Close = data_filtered.columns.get_loc("Close")

# Split the data into train and test data sets
# As a first step, we get the number of rows to train the model on 80% of the data
train_data_len = math.ceil(np_data_scaled.shape[0] * 0.8)

# Create the training and test data
# The test data starts sequence_length rows earlier so that the first test target has a full input window
train_data = np_data_scaled[0:train_data_len, :]
test_data = np_data_scaled[train_data_len - sequence_length:, :]

# The RNN needs data with the format of [samples, time steps, features]
# Here, we create N samples, with sequence_length time steps per sample and one column per feature
def partition_dataset(sequence_length, data):
    x, y = [], []
    data_len = data.shape[0]
    for i in range(sequence_length, data_len):
        x.append(data[i-sequence_length:i,:]) # contains sequence_length rows with all feature columns
        y.append(data[i, index_Close]) # contains the target value for single-step prediction

    # Convert the x and y to numpy arrays
    x = np.array(x)
    y = np.array(y)
    return x, y

# Generate training data and test data
x_train, y_train = partition_dataset(sequence_length, train_data)
x_test, y_test = partition_dataset(sequence_length, test_data)

# Print the shapes: the result is: (rows, training_sequence, features) (prediction value, )
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

# Validate that the prediction value and the input match up
# The last close price of the second input sample should equal the first prediction value
print(x_train[1][sequence_length-1][index_Close])
print(y_train[0])
```
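The sliding window and the overlapping split are easier to follow on a toy series where every value can be traced by eye. The sketch below repeats the same logic on ten time steps with two features and a window of three; the function name `make_windows` and all numbers are assumptions chosen for illustration.

```python
import numpy as np

def make_windows(data: np.ndarray, window: int, target_col: int):
    """Slice data into input windows and the target value that follows each window."""
    x, y = [], []
    for i in range(window, data.shape[0]):
        x.append(data[i - window:i, :])   # the window covers rows i-window .. i-1
        y.append(data[i, target_col])     # the target is the value right after the window
    return np.array(x), np.array(y)

# Toy series: 10 time steps, 2 features; column 0 counts 0..9, column 1 counts 100..109
toy = np.column_stack([np.arange(10.0), np.arange(100.0, 110.0)])
window = 3
train_len = 8

train = toy[:train_len]
# The test slice starts `window` rows early: these rows serve only as input history,
# so the first test target is the first row after the training data
test = toy[train_len - window:]

x_train, y_train = make_windows(train, window, target_col=0)
x_test, y_test = make_windows(test, window, target_col=0)

print(x_train.shape, y_train.shape)  # (5, 3, 2) (5,)
print(x_test.shape, y_test.shape)    # (2, 3, 2) (2,)
print(y_train, y_test)               # [3. 4. 5. 6. 7.] [8. 9.]
```

The training targets end at 7 and the test targets start at 8, so no target value appears in both sets. The overlapping rows are used only as inputs for the first test windows.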

### Step #5 Train the Multivariate Prediction Model

Once we have the data prepared and ready, we can train our model. The architecture of our neural network consists of the following four layers:

- An LSTM layer, which takes our input sequences and returns the whole sequence
- Another LSTM layer that takes the sequence from the previous layer but returns only its final output instead of the whole sequence
- A dense layer with five neurons
- A final dense layer that outputs the predicted value

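In this tutorial, we set the number of neurons in the LSTM layers equal to the number of values in one input sequence (time steps multiplied by features). This is a design choice rather than a requirement, as the number of LSTM units is a hyperparameter that we can choose freely. Each input sequence in our dataset consists of a matrix with 50 steps and one column per feature. With six features, each LSTM layer thus has 300 neurons; with the five features that are active in the FEATURES list above, it has 250. What the model does require is that new data has the same shape as the training sequences, so keeping this shape in mind is essential for later, when we want to predict on a new dataset. Running the code below creates the model architecture and compiles the model.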

```python
# Configure the neural network model
model = Sequential()

# Model with n_neurons = inputshape Timestamps, each with x_train.shape[2] variables
n_neurons = x_train.shape[1] * x_train.shape[2]
print(n_neurons, x_train.shape[1], x_train.shape[2])
model.add(LSTM(n_neurons, return_sequences=True, input_shape=(x_train.shape[1], x_train.shape[2])))
model.add(LSTM(n_neurons, return_sequences=False))
model.add(Dense(5))
model.add(Dense(1))

# Compile the model
model.compile(optimizer='adam', loss='mse')
```
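To get a feeling for how the choice of n_neurons drives the size of the network, we can estimate the number of trainable parameters by hand. The sketch below assumes 50 time steps and six features, as described in the text, and uses the standard formula for an LSTM layer with bias terms; the helper names `lstm_params` and `dense_params` are made up for this illustration.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Trainable parameters of a standard LSTM layer with bias.

    Each of the four gates has input weights, recurrent weights, and a bias vector.
    """
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim: int, units: int) -> int:
    """Trainable parameters of a dense layer: one weight per connection plus biases."""
    return input_dim * units + units

time_steps, n_features = 50, 6        # assumed values, taken from the text
n_neurons = time_steps * n_features   # 300

total = (
    lstm_params(n_features, n_neurons)    # first LSTM layer reads the features
    + lstm_params(n_neurons, n_neurons)   # second LSTM layer reads the first layer's outputs
    + dense_params(n_neurons, 5)          # dense layer with five neurons
    + dense_params(5, 1)                  # output layer
)
print(n_neurons, total)  # 300 1091111
```

With roughly 1.1 million parameters, the network is large compared to the few thousand training sequences. Reducing the number of units is therefore one of the first options to try when the model overfits or trains slowly.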

Running the code below starts the training process.

```python
# Training the model
epochs = 50
batch_size = 16
early_stop = EarlyStopping(monitor='loss', patience=5, verbose=1)
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    validation_data=(x_test, y_test)
                   )

                    #callbacks=[early_stop])
```

Let’s take a quick look at the loss curve.

```python
# Plot training & validation loss values
fig, ax = plt.subplots(figsize=(16, 5), sharex=True)
sns.lineplot(data=history.history["loss"], ax=ax)
sns.lineplot(data=history.history["val_loss"], ax=ax) # the second line matches the "Test" legend entry
plt.title("Model loss")
plt.ylabel("Loss")
plt.xlabel("Epoch")
ax.xaxis.set_major_locator(plt.MaxNLocator(epochs))
plt.legend(["Train", "Test"], loc="upper left")
plt.grid()
plt.show()
```

The loss drops quickly to a lower plateau, which signals that the model has improved throughout the training process.

### Step #6 Evaluate Model Performance

Once we have trained the neural network regression model, we want to measure its performance. As mentioned in section 3, we first have to reverse the scaling of the predictions. Afterward, we calculate different error metrics, MAE, MAPE, and MDAPE. Then we will compare the predictions in a line plot with the actual values. For more information on measuring the performance of regression models, see this relataly article.

```python
# Get the predicted values
y_pred_scaled = model.predict(x_test)

# Unscale the predicted values
y_pred = scaler_pred.inverse_transform(y_pred_scaled)
y_test_unscaled = scaler_pred.inverse_transform(y_test.reshape(-1, 1))

# Mean Absolute Error (MAE)
MAE = mean_absolute_error(y_test_unscaled, y_pred)
print(f'Mean Absolute Error (MAE): {np.round(MAE, 2)}')

# Mean Absolute Percentage Error (MAPE)
MAPE = np.mean((np.abs(np.subtract(y_test_unscaled, y_pred)/ y_test_unscaled))) * 100
print(f'Mean Absolute Percentage Error (MAPE): {np.round(MAPE, 2)} %')

# Median Absolute Percentage Error (MDAPE)
MDAPE = np.median((np.abs(np.subtract(y_test_unscaled, y_pred)/ y_test_unscaled)) ) * 100
print(f'Median Absolute Percentage Error (MDAPE): {np.round(MDAPE, 2)} %')
```

The MAPE is 3.12 %, which means that our predictions deviate from the actual values by 3.12 % on average. The MDAPE is 2.88 % and a bit lower than the MAPE, which indicates that there are some outliers among the prediction errors: a few large errors pull the mean up but barely affect the median. Half of the predictions deviate by more than 2.88 % from the actual values, and the other half deviate by less.
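The effect of outliers on the mean and the median is easy to reproduce with a handful of made-up numbers. The values below are purely illustrative and unrelated to the NASDAQ results. Four predictions are close to the actual values and one misses badly.

```python
import numpy as np

# Made-up actual values and predictions; the last prediction is a clear outlier
y_true = np.array([100.0, 200.0, 400.0, 500.0, 1000.0])
y_pred = np.array([102.0, 196.0, 404.0, 495.0, 800.0])

abs_err = np.abs(y_true - y_pred)    # absolute errors: 2, 4, 4, 5, 200
pct_err = abs_err / y_true * 100     # percentage errors: 2, 2, 1, 1, 20

mae = abs_err.mean()                 # mean absolute error
mape = pct_err.mean()                # mean absolute percentage error
mdape = np.median(pct_err)           # median absolute percentage error

print(f"MAE: {mae:.2f}")      # MAE: 43.00
print(f"MAPE: {mape:.2f} %")  # MAPE: 5.20 %
print(f"MDAPE: {mdape:.2f} %")  # MDAPE: 2.00 %
```

The single outlier lifts the MAPE to 5.2 %, while the MDAPE stays at 2 %. A gap between the two metrics is therefore a hint that a few predictions are far off, even if most are reasonably close.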

Next, we create a line plot showing the forecast and compare it to the actual values. Adding a bar plot to the chart helps highlight the deviations of the predictions from the actual values. Running the code below creates the line plot.

```python
# The date from which on the data is displayed
display_start_date = "2019-01-01"

# Add the difference between the valid and predicted prices
train = pd.DataFrame(data_filtered_ext['Close'][:train_data_len + 1]).rename(columns={'Close': 'y_train'})
valid = pd.DataFrame(data_filtered_ext['Close'][train_data_len:]).rename(columns={'Close': 'y_test'})
valid.insert(1, "y_pred", y_pred.ravel(), True) # flatten the (n, 1) predictions into a single column
valid.insert(1, "residuals", valid["y_pred"] - valid["y_test"], True)
df_union = pd.concat([train, valid])

# Zoom in to a closer timeframe
df_union_zoom = df_union[df_union.index > display_start_date]

# Create the lineplot
fig, ax1 = plt.subplots(figsize=(16, 8))
plt.title("y_pred vs y_test")
plt.ylabel(stockname, fontsize=18)
sns.set_palette(["#090364", "#1960EF", "#EF5919"])
sns.lineplot(data=df_union_zoom[['y_pred', 'y_train', 'y_test']], linewidth=1.0, dashes=False, ax=ax1)

# Create the bar plot with the differences
df_sub = ["#2BC97A" if x > 0 else "#C92B2B" for x in df_union_zoom["residuals"].dropna()]
ax1.bar(height=df_union_zoom['residuals'].dropna(), x=df_union_zoom['residuals'].dropna().index, width=3, label='residuals', color=df_sub)
plt.legend()
plt.show()
```

The line plot shows that the forecast is close to the actual values but partially deviates from them. The deviations between actual values and predictions are called residuals. For our model, they seem to be largest during periods of increased market volatility and smallest during periods of steady market movement, which makes sense because sudden movements are generally more difficult to predict.

### Step #7 Predict the Next Day’s Price

After training the neural network, we want to forecast the stock market for the next day. For this purpose, we extract a new dataset from the Yahoo-Finance API and preprocess it as we did for model training.

We trained our model with input sequences of 50 time steps and one column per feature. Thus, we must also provide the model with 50 time steps when making the forecast. As before, we transform the data into a three-dimensional shape, here one sequence with 50 time steps, where the last dimension is the number of feature columns. After generating the forecast, we unscale the stock market predictions back to the original range of values.

```python
df_temp = df[-sequence_length:]
new_df = df_temp.filter(FEATURES)

N = sequence_length

# Get the last N days of feature values and scale the data to be values between 0 and 1
last_N_days = new_df[-sequence_length:].values
last_N_days_scaled = scaler.transform(last_N_days)

# Create an empty list and append the past N days
X_test_new = []
X_test_new.append(last_N_days_scaled)

# Convert the X_test data set to a numpy array and predict the next value
pred_price_scaled = model.predict(np.array(X_test_new))
pred_price_unscaled = scaler_pred.inverse_transform(pred_price_scaled.reshape(-1, 1))

# Print last price and predicted price for the next day
price_today = np.round(new_df['Close'].iloc[-1], 2) # positional access to the most recent close
predicted_price = np.round(pred_price_unscaled.ravel()[0], 2)
change_percent = np.round(100 - (price_today * 100)/predicted_price, 2)

plus = '+'; minus = ''
print(f'The close price for {stockname} at {end_date} was {price_today}')
print(f'The predicted close price is {predicted_price} ({plus if change_percent > 0 else minus}{change_percent}%)')
```

The close price for NASDAQ on 2021-06-27 was 14360.39. The predicted closing price is 14232.8095703125 (-0.9%)
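The reshaping step is the part that most often causes errors when reusing a trained model, so here is a minimal sketch of it in isolation. It uses random numbers instead of real quotes, and the number of features (`n_features = 5`) is an assumption that has to match whatever the model was trained on.

```python
import numpy as np

sequence_length = 50   # window length used during training
n_features = 5         # assumed number of feature columns

# Stand-in for the scaled feature history: 200 days with n_features columns
rng = np.random.default_rng(0)
scaled_history = rng.random((200, n_features))

# Take the most recent window ...
last_window = scaled_history[-sequence_length:]

# ... and add a leading sample dimension, because the model expects
# input of the form [samples, time steps, features]
model_input = np.expand_dims(last_window, axis=0)
print(model_input.shape)  # (1, 50, 5)
```

Wrapping the window in a list and calling np.array on it, as the tutorial code does, produces the same three-dimensional shape.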

## Summary

This tutorial has shown multivariate time series modeling for stock market prediction in Python. We trained a neural network regression model to predict the NASDAQ index. Before training our model, we performed several steps to prepare the data. The steps included selecting features, scaling the data, and splitting it into training and test sets. To account for the multivariate modeling approach, we used several columns of the original price data as input features. You now have the knowledge and code to conduct further experiments with the features of your choice.

Multivariate time series forecasting is a complex topic. You might want to take the time to retrace the different steps. The transformation of the data, in particular, can be challenging. The best way to learn is to practice. Therefore, I encourage you to develop more time series models and experiment with other data sources.

I am always trying to learn and improve. If you want to give feedback or have remarks, feel free to share them in the comments.


Another interesting approach to stock market prediction uses candlestick images and convolutional neural networks. If this topic interests you, check out the following article: Deep reinforcement learning stock market trading, utilizing a CNN with candlestick images
