Stock Market Prediction using Neural Networks for Multi-Output Regression in Python

Neural networks can generate various outputs, which is especially useful in time-series forecasting to forecast longer periods. In time series regression, the number of neurons in the final output layer determines how many steps in a time series the model can predict. Models with one output return single-step forecasts. Models with multiple outputs can return entire series of time steps and thus deliver a more detailed projection of how a time series could develop in the future. In this tutorial, we develop a multi-output multi-step neural network for stock price forecasting using Python and Keras.

This article proceeds as follows: We begin by briefly discussing the architecture of a multi-output neural network. After familiarizing the model architecture, we develop a Keras neural network for multi-output regression. For data preparation, we perform various steps, including cleaning, splitting, selecting, and scaling the data. Afterward, we define a model architecture with multiple LSTM layers and ten output neurons in the last layer. This architecture enables the model to generate projections for ten consecutive steps. After configuring the model architecture, we train the model with historical daily prices of the Apple stock. Finally, we use this model to generate a ten-day forecast.

An exemplary architecture of a neural network with five input neurons (blue) and four output neurons (red), keras, python, tutorial, stock market prediction
An exemplary architecture of a neural network with five input neurons (blue) and four output neurons (red)

Disclaimer: This article does not constitute financial advice. Stock markets can be very volatile and are generally difficult to predict. Predictive models and other forms of analytics applied in this article only serve the purpose of illustrating machine learning use cases.

Multi-Output Regression vs. Single-Output Regression

In time series regression, we train a statistical model on the past values of a time series to make statements about how the time series develops further. During model training, we feed the model with so-called mini-batches together with the corresponding target values. The model then creates forecasts for all input batches and compares these predictions to the actual target values to calculate the residuals (prediction errors). In this way, the model can adjust its parameters iteratively and learn to make better predictions.

Next, we will talk about the architecture of a neural network with multiple outputs. This consists of several layers, including an input layer, hidden layers, and an output layer. The number of neurons in the first layer must match the input data, and the number of neurons in the output layer determines the period length of the predictions. Models with a single neuron in the output layer are used to predict a single time step. Predicting multiple steps with a single-output model is possible, but it requires a rolling forecasting approach in which the outputs are iteratively reused to make further-reaching predictions. However, this way is somewhat cumbersome. A more elegant way is to train a multi-output model right away.

A model with multiple neurons in the output layer can predict numerous steps at once per batch. Multi-output regression models are trained with many sequences of subsequent values, followed by the consecutive output sequence. The model architecture thus contains multiple neurons in the initial layer and various neurons in the output layer (as illustrated). In the next part of this tutorial, we will develop a multi-output regression model.

The inputs and outputs of a neural network for time series regression with five input neurons and four outputs

Implementing a Neural Network Model for Multi-Output Multi-Step Regression in Python

Let’s get started with the hands-on part. In the following, we will develop a Keras neural network that forecasts the Apple stock price. We use historical price data that is available via the yahoo finance API. After obtaining the data via the API, we conduct several steps to prepare and split the data. Finally, we train the neural network.

The code is available on the GitHub repository.


Before beginning with the coding part, ensure that you have set up your Python 3 environment and required packages. If you don’t have an environment set up yet, consider the Anaconda Python environment. To set it up, you can follow the steps in this tutorial.

Also, make sure you install all required packages. In this tutorial, we will be working with the following standard packages: 

In addition, we will be using the machine learning libraries Keras, Scikit-learn, and Tensorflow. For visualization, we will be using the Seaborn package.

Please also have either the pandas_datareader or the yfinance package installed. You will use one of these packages to retrieve the historical stock quotes.

You can install these packages using console commands:

  • pip install <package name>
  • conda install <package name> (if you are using the anaconda packet manager)

Step #1: Load the Data

We begin by loading historical price quotes of the Apple stock from the public yahoo finance API. Our first choice for interacting with the yahoo finance API is the Pandas DataReader library. If the library causes a problem (it sometimes does), you can also use the yfinance package, which should return the same data. Running the code below will load the data into a Pandas DataFrame.

# import pandas_datareader as webreader # Remote data access for pandas
import math # Mathematical functions 
import numpy as np # Fundamental package for scientific computing with Python
import pandas as pd # Additional functions for analysing and manipulating data
from datetime import date, timedelta, datetime # Date Functions
from pandas.plotting import register_matplotlib_converters # This function adds plotting functions for calender dates
import matplotlib.pyplot as plt # Important package for visualization - we use this to plot the market data
import matplotlib.dates as mdates # Formatting dates
from sklearn.metrics import mean_absolute_error, mean_squared_error # Packages for measuring model performance / errors
from keras.models import Sequential # Deep learning library, used for neural networks
from keras.layers import LSTM, Dense, Dropout # Deep learning classes for recurrent and regular densely-connected layers
from keras.callbacks import EarlyStopping # EarlyStopping during model training
from sklearn.preprocessing import RobustScaler, MinMaxScaler # This Scaler removes the median and scales the data according to the quantile range to normalize the price data 
import seaborn as sns

# from pandas_datareader.nasdaq_trader import get_nasdaq_symbols
# symbols = get_nasdaq_symbols()

# Setting the timeframe for the data extraction
today =
date_today = today.strftime("%Y-%m-%d")
date_start = '2010-01-01'

# Getting NASDAQ quotes
stockname = 'Apple'
symbol = 'AAPL'
# df = webreader.DataReader(
#     symbol, start=date_start, end=date_today, data_source="yahoo"
# )

import yfinance as yf #Alternative package if webreader does not work: pip install yfinance
df =, start=date_start, end=date_today)

# # Create a quick overview of the dataset

The data should comprise the following columns:

  • Close
  • Open
  • High
  • Low
  • Adj Close
  • Volume

The target variable that we are trying to predict is the Closing price (Close).

Step #2: Explore the Data

Once we have loaded the data, we print a quick overview of the time-series data using different line graphs.

# Plot line charts
df_plot = df.copy()

ncols = 2
nrows = int(round(df_plot.shape[1] / ncols, 0))

fig, ax = plt.subplots(nrows=nrows, ncols=ncols, sharex=True, figsize=(14, 7))
for i, ax in enumerate(fig.axes):
        sns.lineplot(data = df_plot.iloc[:, i], ax=ax)
        ax.tick_params(axis="x", rotation=30, labelsize=10, length=0)
The Apple Stock's historical price data, including quotes, highs, lows, and volume

The line plots look as expected and reflect the Apple stock price history.

Step #3: Preprocess the Data

Next, we prepare the data for the training process. Preparing the data for a multivariate regression model involves several steps:

  • Selecting features for model training
  • Scaling and splitting the data into training and testing
  • Slicing the time series into several shifted training batches

We begin by creating a copy of the initial data and resetting the index.

# Indexing Batches
df_train = df.sort_values(by=['Date']).copy()

# We safe a copy of the dates index, before we need to reset it to numbers
date_index = df_train.index

# We reset the index, so we can convert the date-index to a number-index
df_train = df_train.reset_index(drop=True).copy()
Initial dataset before transformation and model training

We proceed with feature selection. To keep things simple, we will use the features from the input data without any modifications. After selecting the features, we scale them to a range between 0 and 1. To ease unscaling the predictions after training, we create two different scalers: One for the training data, which takes five columns, and one for the output data that scales a single column (the Close Price). I have covered feature engineering in more detail in a separate article if you want to learn more about this topic.

def prepare_data(df):

    # List of considered Features
    FEATURES = ['Open', 'High', 'Low', 'Close', 'Volume']

    print('FEATURE LIST')
    print([f for f in FEATURES])

    # Create the dataset with features and filter the data to the list of FEATURES
    df_filter = df[FEATURES]
    # Convert the data to numpy values
    np_filter_unscaled = np.array(df_filter)
    #np_filter_unscaled = np.reshape(np_unscaled, (df_filter.shape[0], -1))

    np_c_unscaled = np.array(df['Close']).reshape(-1, 1)
    return np_filter_unscaled, np_c_unscaled
np_filter_unscaled, np_c_unscaled = prepare_data(df_train)
# Creating a separate scaler that works on a single column for scaling predictions
# Scale each feature to a range between 0 and 1
scaler_train = MinMaxScaler()
np_scaled = scaler_train.fit_transform(np_filter_unscaled)
# Create a separate scaler for a single column
scaler_pred = MinMaxScaler()
np_scaled_c = scaler_pred.fit_transform(np_c_unscaled)   

The final step of the data preparation is to create the structure for the input data. This structure needs to match the input layer of the model architecture.

Running the code below starts a sliding window script that cuts the initial time series data into multiple slices, i.e., mini-batches. Each batch is a smaller fraction of the initial time series shifted by a single step. Because we will feed our model with multivariate input data, the time series consists of five input columns/features. Each batch comprises a period of 50 steps from the time series and an output sequence of ten consecutive values. To validate that the batches have the right shape, we visualize mini-batches in a line graph with their consecutive target values.

# Set the input_sequence_length length - this is the timeframe used to make a single prediction
input_sequence_length = 50
# The output sequence length is the number of steps that the neural network predicts
output_sequence_length = 10 #

# Prediction Index
index_Close = df_train.columns.get_loc("Close")

# Split the training data into train and train data sets
# As a first step, we get the number of rows to train the model on 80% of the data 
train_data_length = math.ceil(np_scaled.shape[0] * 0.8)

# Create the training and test data
train_data = np_scaled[:train_data_length, :]
test_data = np_scaled[train_data_length - input_sequence_length:, :]

# The RNN needs data with the format of [samples, time steps, features]
# Here, we create N samples, input_sequence_length time steps per sample, and f features
def partition_dataset(input_sequence_length, output_sequence_length, data):
    x, y = [], []
    data_len = data.shape[0]
    for i in range(input_sequence_length, data_len - output_sequence_length):
        x.append(data[i-input_sequence_length:i,:]) #contains input_sequence_length values 0-input_sequence_length * columns
        y.append(data[i:i + output_sequence_length, index_Close]) #contains the prediction values for validation (3rd column = Close),  for single-step prediction
    # Convert the x and y to numpy arrays
    x = np.array(x)
    y = np.array(y)
    return x, y

# Generate training data and test data
x_train, y_train = partition_dataset(input_sequence_length, output_sequence_length, train_data)
x_test, y_test = partition_dataset(input_sequence_length, output_sequence_length, test_data)

# Print the shapes: the result is: (rows, training_sequence, features) (prediction value, )
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

# Validate that the prediction value and the input match up
# The last close price of the second input sample should equal the first prediction value
nrows = 3 # number of shifted plots
fig, ax = plt.subplots(nrows=nrows, ncols=1, figsize=(16, 8))
for i, ax in enumerate(fig.axes):
    xtrain = pd.DataFrame(x_train[i][:,index_Close], columns={f'x_train_{i}'})
    ytrain = pd.DataFrame(y_train[i][:output_sequence_length-1], columns={f'y_train_{i}'})
    ytrain.index = np.arange(input_sequence_length, input_sequence_length + output_sequence_length-1)
    xtrain_ = pd.concat([xtrain, ytrain[:1].rename(columns={ytrain.columns[0]:xtrain.columns[0]})])
    df_merge = pd.concat([xtrain_, ytrain])
    sns.lineplot(data = df_merge, ax=ax)
batch test visualizations, deep neural networks for multi-output stock market forecasting

Step #3: Prepare the Neural Network Architecture and Train the Multi-Output Regression Model

Now that we have the training data prepared and ready, the next step is to configure the architecture of the multi-out neural network. Because we will be using multiple input series, our model is, in fact, a multivariate architecture so that it corresponds to the input training batches.

We choose a comparably simple architecture with only two LSTM layers and two additional dense layers. The first dense layer has 20 neurons, and the second layer is the output layer, which has ten output neurons. If you wonder how I got to the number of neurons in the third layer, I conducted several experiments and found that this number leads to solid results.

To ensure that the architecture matches our input data’s structure, we reuse the variables for the previous code section (n_input_neurons, n_output_neurons. The input sequence length is 50, and the output sequence (the steps for the period we want to predict) is ten.

# Configure the neural network model
model = Sequential()
n_output_neurons = output_sequence_length

# Model with n_neurons = inputshape Timestamps, each with x_train.shape[2] variables
n_input_neurons = x_train.shape[1] * x_train.shape[2]
print(n_input_neurons, x_train.shape[1], x_train.shape[2])
model.add(LSTM(n_input_neurons, return_sequences=True, input_shape=(x_train.shape[1], x_train.shape[2]))) 
model.add(LSTM(n_input_neurons, return_sequences=False))

# Compile the model
model.compile(optimizer='adam', loss='mse')

After configuring the model architecture, we can initiate the training process and illustrate how the loss develops over the training epochs.

# Training the model
epochs = 10
batch_size = 16
early_stop = EarlyStopping(monitor='loss', patience=5, verbose=1)
history =, y_train, 
                    validation_data=(x_test, y_test)
we trained the multi-output model in ten training iterations
# Plot training & validation loss values
fig, ax = plt.subplots(figsize=(10, 5), sharex=True)
plt.title("Model loss")
plt.legend(["Train", "Test"], loc="upper left")
loss curve after training the multi-output neural network, keras, multi-step time series regression

Step #5 Evaluate Model Performance

Now that we have trained the model, we can make forecasts on the test data and use traditional regression metrics such as the MAE, MAPE, or MDAPE to measure the performance of our model.

# Get the predicted values
y_pred_scaled = model.predict(x_test)

# Unscale the predicted values
y_pred = scaler_pred.inverse_transform(y_pred_scaled)
y_test_unscaled = scaler_pred.inverse_transform(y_test).reshape(-1, output_sequence_length)

# Mean Absolute Error (MAE)
MAE = mean_absolute_error(y_test_unscaled, y_pred)
print(f'Median Absolute Error (MAE): {np.round(MAE, 2)}')

# Mean Absolute Percentage Error (MAPE)
MAPE = np.mean((np.abs(np.subtract(y_test_unscaled, y_pred)/ y_test_unscaled))) * 100
print(f'Mean Absolute Percentage Error (MAPE): {np.round(MAPE, 2)} %')

# Median Absolute Percentage Error (MDAPE)
MDAPE = np.median((np.abs(np.subtract(y_test_unscaled, y_pred)/ y_test_unscaled)) ) * 100
print(f'Median Absolute Percentage Error (MDAPE): {np.round(MDAPE, 2)} %')

def prepare_df(i, x, y, y_pred_unscaled):
    # Undo the scaling on x, reshape the testset into a one-dimensional array, so that it fits to the pred scaler
    x_test_unscaled_df = pd.DataFrame(scaler_pred.inverse_transform((x[i]))[:,index_Close]).rename(columns={0:'x_test'})
    y_test_unscaled_df = []
    # Undo the scaling on y
    if type(y) == np.ndarray:
        y_test_unscaled_df = pd.DataFrame(scaler_pred.inverse_transform(y)[i]).rename(columns={0:'y_test'})

    # Create a dataframe for the y_pred at position i, y_pred is already unscaled
    y_pred_df = pd.DataFrame(y_pred_unscaled[i]).rename(columns={0:'y_pred'})
    return x_test_unscaled_df, y_pred_df, y_test_unscaled_df

def plot_multi_test_forecast(x_test_unscaled_df, y_test_unscaled_df, y_pred_df, title): 
    # Package y_pred_unscaled and y_test_unscaled into a dataframe with columns pred and true   
    if type(y_test_unscaled_df) == pd.core.frame.DataFrame:
        df_merge = y_pred_df.join(y_test_unscaled_df, how='left')
        df_merge = y_pred_df.copy()
    # Merge the dataframes 
    df_merge_ = pd.concat([x_test_unscaled_df, df_merge]).reset_index(drop=True)
    # Plot the linecharts
    fig, ax = plt.subplots(figsize=(20, 8))
    plt.title(title, fontsize=12)
    ax.set(ylabel = stockname + "_stock_price_quotes")
    sns.lineplot(data = df_merge_, linewidth=2.0, ax=ax)

# Creates a linechart for a specific test batch_number and corresponding test predictions
batch_number = 50
x_test_unscaled_df, y_pred_df, y_test_unscaled_df = prepare_df(i, x_test, y_test, y_pred)
title = f"Predictions vs y_test - test batch number {batch_number}"
plot_multi_test_forecast(x_test_unscaled_df, y_test_unscaled_df, y_pred_df, title) 
multioutput regression deep neural networks test predictions

The quality of the predictions is acceptable, considering the goal of this architecture was not to create an excellent model. However, there is certainly room for improvement.

Step #6 Create a New Forecast

Finally, let’s create a forecast on a new dataset. We take the scaled dataset from section 2 (np_scaled) and extract a series with the latest 50 values. Then we use these values to generate a new prediction for the next ten days. We visualize the multi-step forecast in another line chart.

# Get the latest input batch from the test dataset, which is contains the price values for the last ten trading days
x_test_latest_batch = np_scaled[-51:-1,:].reshape(1,50,5)

# Predict on the batch
y_pred_scaled = model.predict(x_test_latest_batch)
y_pred_unscaled = scaler_pred.inverse_transform(y_pred_scaled)

# Prepare the data and plot the input data and the predictions
x_test_unscaled_df, y_test_unscaled_df, _ = prepare_df(0, x_test_latest_batch, '', y_pred_unscaled)
plot_multi_test_forecast(x_test_unscaled_df, '', y_pred_df, "x_new Vs. y_new_pred")
multioutput regression deep neural networks


This tutorial has shown how we can use multiple output neural networks to make predictions over different time steps. We discussed the necessary architecture of a recurrent neural network and learned how to process the data accordingly. In addition, we trained a multi-output regression model that predicts ten price steps for Apple stock. Finally, we visualized the multi-step predictions of this model in a line plot.

Feel free to test other hyperparameters and adjust the model architecture. You can also increase the prediction horizon by adding more neurons to the output layers. But keep in mind that prediction error will increase with the prediction horizon. 

I hope this article was helpful in understanding multi-output neural networks better. If you have any questions or comments, please let me know.


  • Hi, I am Florian, a Zurich-based consultant for AI and Data. Since the completion of my Ph.D. in 2017, I have been working on the design and implementation of ML use cases in the Swiss financial sector. I started this blog in 2020 with the goal in mind to share my experiences and create a place where you can find key concepts of machine learning and materials that will allow you to kick-start your own Python projects.

5 thoughts on “Stock Market Prediction using Neural Networks for Multi-Output Regression in Python”

  1. So, first, this IS a useful article how to work with data. However, it seems to be a terrible architecture design for the task at hand. There needs to be a lot more custom functions in the neural design not just “oh use LSTM”

  2. Thanks for sharing your experience with the synthetic data! Just as with single outputs it’s difficult to say in advance which parameters will lead to good results. There is often no way around conducting experiments. Two things you could try are modifying the hidden layers and the length of the input periods. I’ll also experiment with the data as soon as I get back from vacation. 🙂

  3. hi, I ran this code using your Sine function from an earlier example. So I imported the date as per your code (from Yahoo) and then below that I added:

    steps = df.shape[0]
    gradient = 0.02
    list_a = []
    for i in range(0, steps, 1):
    y = round(gradient * i + math.sin(math.pi * 0.125 * i), 5)

    df2 = pd.DataFrame({“Close”: list_a,
    “Open”: list_a,
    “High”: list_a,
    “Low”: list_a,
    “Volume”: list_a,
    “Adj Close”: list_a},
    columns=[“Close”,”Open”,”High”,”Low”,”Volume”,”Adj Close”])

    df2.index = df.index
    df = df2

    so the original dataframe df from Yahoo is now filled with artificial data keeping the same dates. The results are not very good if you compare it to your example given here:

    Even if I increase the number of epochs. Is this to be expected when using multiple nodes in the output layer?


Leave a Reply