Stock Price Trend Prediction Using Neural Network (Pytorch)

This is not financial advice.

The aim of this notebook is to check whether it is possible to predict/evaluate a stock price trend given a set of features derived from historical price action.

Here we will be using a modified neural net adapted from two articles:

Both of these articles evaluate the Iris dataset for flower species classification. It turns out that the Iris dataset has a structure very similar to our stock training data, so the code is largely reusable.

The article from blog.quantisti.com uses a different model; we will be using a neural network instead, with a much longer time horizon for trend predictions.

How the neural net is trained:

  • get data for multiple stocks from Yahoo Finance API
  • compute various stock trading indicators
  • use them directly and/or transform them into features
  • feed features into neural network
  • split data into train and test dataset
  • save trained network with joblib library
  • ???
  • profit

Specifically, this is what our training/target condition looks like:

# price above trend multiple days later
df['target_cls'] = np.where(df['Adj Close'].shift(-34) > df.EMA150.shift(-34), 1, 0)

We are performing a classification task (the network ends in a softmax over two classes, similar to logistic regression).

The output of the neural net will be 1 or 0 (Buy or Not Buy). Based on the given features, the network will try to predict whether the price will be above a specific moving average n days later. For example, as shown above: 34 days later, above the 150-day Exponential Moving Average.

The neural net will never be trained on the specific moving average it is trying to predict; it will always use different input features.
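
To make the labeling concrete, here is a minimal, self-contained sketch (using a toy shift of -2 instead of -34) of how shift() aligns a future condition with today's row:

import numpy as np
import pandas as pd

# toy data: price and a stand-in moving average
toy = pd.DataFrame({'Adj Close': [10, 11, 12, 11, 13],
                    'EMA':       [10, 10, 11, 12, 12]})

# label row t with whether price is above the average 2 rows later
toy['target_cls'] = np.where(toy['Adj Close'].shift(-2) > toy['EMA'].shift(-2), 1, 0)
print(toy)  # the last 2 rows get label 0, since their shifted values are NaN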

Get stock data

In [3]:
import talib as ta
import joblib
In [4]:
import pandas as pd

#suppress 'SettingWithCopy' warning
pd.set_option('mode.chained_assignment', None)
In [5]:
#!pip install pandas_datareader
#!pip3 install seaborn
import seaborn as sns
In [6]:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

# ___library_import_statements___
import pandas as pd

# workaround for pandas_datareader; some versions fail without this
pd.core.common.is_list_like = pd.api.types.is_list_like

# make pandas print dataframes nicely
pd.set_option('expand_frame_repr', False)  

import pandas_datareader.data as web
import numpy as np
import matplotlib.pyplot as plt
import datetime
import time

#newest yahoo API 
import yfinance as yahoo_finance

#optional 
#yahoo_finance.pdr_override()

%matplotlib inline
In [7]:
import talib as ta
import numpy as np

import matplotlib.pyplot as plt

# was giving me some warnings
from pandas.plotting import register_matplotlib_converters
register_matplotlib_converters()
In [8]:
# ___variables___
#ticker = 'AAPL'
#ticker = 'TSLA'
#ticker = 'FB'
#ticker = 'MSFT'
#ticker = 'NFLX'
#ticker = 'GOOGL'
ticker = 'BIDU'
#ticker = 'AMZN'
#ticker = 'IBM'

start_time = datetime.datetime(1980, 1, 1)
#end_time = datetime.datetime(2019, 1, 20)
end_time = datetime.datetime.now().date().isoformat()         # today
In [9]:
def get_data(ticker):
    # yahoo gives only daily historical data
    connected = False
    while not connected:
        try:
            df = web.get_data_yahoo(ticker, start=start_time, end=end_time)
            connected = True
            print('connected to yahoo')
        except Exception as e:
            print("fetch error: " + str(e))
            time.sleep(5)  # wait and retry

    # use numerical integer index instead of date    
    df = df.reset_index()
    #print(df.head(5))
    return df
In [10]:
#df = get_data(ticker)

Compute various stock technical indicators

For each stock we compute several technical indicators: mainly exponential moving averages, Bollinger Bands, RSI and so on. We will then feed these indicators (or values derived from them) into the neural network as features.

In [11]:
def compute_technical_indicators(df):
    df['EMA5'] = ta.EMA(df['Adj Close'].values, timeperiod=5)
    df['EMA10'] = ta.EMA(df['Adj Close'].values, timeperiod=10)
    df['EMA15'] = ta.EMA(df['Adj Close'].values, timeperiod=15)
    df['EMA20'] = ta.EMA(df['Adj Close'].values, timeperiod=20)
    df['EMA30'] = ta.EMA(df['Adj Close'].values, timeperiod=30)
    df['EMA40'] = ta.EMA(df['Adj Close'].values, timeperiod=40)
    df['EMA50'] = ta.EMA(df['Adj Close'].values, timeperiod=50)

    df['EMA60'] = ta.EMA(df['Adj Close'].values, timeperiod=60)
    df['EMA70'] = ta.EMA(df['Adj Close'].values, timeperiod=70)
    df['EMA80'] = ta.EMA(df['Adj Close'].values, timeperiod=80)
    df['EMA90'] = ta.EMA(df['Adj Close'].values, timeperiod=90)
    
    df['EMA100'] = ta.EMA(df['Adj Close'].values, timeperiod=100)
    df['EMA150'] = ta.EMA(df['Adj Close'].values, timeperiod=150)
    df['EMA200'] = ta.EMA(df['Adj Close'].values, timeperiod=200)

    df['upperBB'], df['middleBB'], df['lowerBB'] = ta.BBANDS(df['Adj Close'].values, timeperiod=20, nbdevup=2, nbdevdn=2, matype=0)

    df['SAR'] = ta.SAR(df['High'].values, df['Low'].values, acceleration=0.02, maximum=0.2)

    # we will normalize RSI
    df['RSI'] = ta.RSI(df['Adj Close'].values, timeperiod=14)

    df['normRSI'] = ta.RSI(df['Adj Close'].values, timeperiod=14) / 100.0
    
    df.tail()

    return df
In [12]:
#df = compute_technical_indicators(df)
In [13]:
def compute_features(df):
    # compute binary features for the classifier
    df['aboveEMA5'] = np.where(df['Adj Close'] > df['EMA5'], 1, 0)
    df['aboveEMA10'] = np.where(df['Adj Close'] > df['EMA10'], 1, 0)
    df['aboveEMA15'] = np.where(df['Adj Close'] > df['EMA15'], 1, 0)
    df['aboveEMA20'] = np.where(df['Adj Close'] > df['EMA20'], 1, 0)
    df['aboveEMA30'] = np.where(df['Adj Close'] > df['EMA30'], 1, 0)
    df['aboveEMA40'] = np.where(df['Adj Close'] > df['EMA40'], 1, 0)
    
    df['aboveEMA50'] = np.where(df['Adj Close'] > df['EMA50'], 1, 0)
    df['aboveEMA60'] = np.where(df['Adj Close'] > df['EMA60'], 1, 0)
    df['aboveEMA70'] = np.where(df['Adj Close'] > df['EMA70'], 1, 0)
    df['aboveEMA80'] = np.where(df['Adj Close'] > df['EMA80'], 1, 0)
    df['aboveEMA90'] = np.where(df['Adj Close'] > df['EMA90'], 1, 0)
    
    df['aboveEMA100'] = np.where(df['Adj Close'] > df['EMA100'], 1, 0)
    df['aboveEMA150'] = np.where(df['Adj Close'] > df['EMA150'], 1, 0)
    df['aboveEMA200'] = np.where(df['Adj Close'] > df['EMA200'], 1, 0)

    df['aboveUpperBB'] = np.where(df['Adj Close'] > df['upperBB'], 1, 0)
    df['belowLowerBB'] = np.where(df['Adj Close'] < df['lowerBB'], 1, 0)
    
    df['aboveSAR'] = np.where(df['Adj Close'] > df['SAR'], 1, 0)
   
    df['oversoldRSI'] = np.where(df['RSI'] < 30, 1, 0)
    df['overboughtRSI'] = np.where(df['RSI'] > 70, 1, 0)


    # very important - cleanup NaN values, otherwise prediction does not work
    df=df.fillna(0).copy()
    
    df.tail()

    return df
In [14]:
#df = compute_features(df)
In [15]:
def plot_train_data(df):
    # plot price
    plt.figure(figsize=(15,2.5))
    plt.title('Stock data ' + str(ticker))
    plt.plot(df['Date'], df['Adj Close'])
    #plt.title('Price chart (Adj Close) ' + str(ticker))
    plt.show()
    return None
In [16]:
def define_target_condition(df):
 
    # price higher later - bad predictive results
    #df['target_cls'] = np.where(df['Adj Close'].shift(-34) > df['Adj Close'], 1, 0)    
    
    # price above trend multiple days later
    df['target_cls'] = np.where(df['Adj Close'].shift(-34) > df.EMA150.shift(-34), 1, 0)

    # important, remove NaN values
    df=df.fillna(0).copy()
    
    df.tail()
    
    return df
In [17]:
#df = define_target_condition(df)
In [18]:
#plot_train_data(df)

Create one big training dataframe

The neural network will be trained on this dataframe. The data will eventually be split into training and testing sets.

In [19]:
tickers = ['F', 'IBM', 'GE', 'AAPL', 'ADM',
           'XOM', 'GM','MMM','KO','PEP','SO','GS']           
#           'HAS','PEAK','HPE','HLT','HD','HON','HRL','HST','HPQ','HUM','ILMN', 
#           'INTC','ICE','INTU','ISRG','IVZ','IRM','JNJ','JPM','JNPR','K','KMB', 
#           'KIM', 'KMI','KSS','KHC', 'KR',  'LB', 'LEG', 'LIN', 'LMT','LOW',
#           'MAR', 'MA','MCD','MDT', 'MRK', 'MET', 'MGM', 'MU','MSFT', 'MAA', 
#           'MNST', 'MCO','MS', 'MSI',
#           'MMM', 'ABT','ACN','ATVI','ADBE','AMD','A','AKAM','ARE','GOOG','AMZN','AAL']
In [20]:
# parent dataframe to append to
ticker = 'SPY'
df = get_data(ticker)
df = compute_technical_indicators(df)
df = compute_features(df)
df = define_target_condition(df)

for ticker in tickers:
    t_df = get_data(ticker)
    t_df = compute_technical_indicators(t_df)
    t_df = compute_features(t_df)
    t_df = define_target_condition(t_df)
    
    df = df.append(t_df, ignore_index=True)  # note: DataFrame.append was removed in pandas 2.0; use pd.concat there
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo
connected to yahoo

Train-test split and training

In [21]:
predictors_list = ['aboveSAR','aboveUpperBB','belowLowerBB','normRSI','oversoldRSI','overboughtRSI',
                   'aboveEMA5','aboveEMA10','aboveEMA15','aboveEMA20','aboveEMA30','aboveEMA40',
                   'aboveEMA50','aboveEMA60','aboveEMA70','aboveEMA80','aboveEMA90','aboveEMA100']
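
Note that aboveEMA150 and aboveEMA200 are deliberately left out of the predictors: the target is defined relative to EMA150, and as stated above, the network is never trained on the moving average it is trying to predict.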
In [22]:
def splitting_and_training(df, predictors_list, test_size=0.3):
    # __predictors__


    # __features__
    X = df[predictors_list].fillna(0).values
    #X.tail()

    # __targets__
    y_cls = df.target_cls.fillna(0).values
    #y_cls.tail(10)

    # __train test split__
    from sklearn.model_selection import train_test_split
    y=y_cls
    X_cls_train, X_cls_test, y_cls_train, y_cls_test = train_test_split(X, y, test_size=test_size, random_state=42, stratify=y)

    print (X_cls_train.shape, y_cls_train.shape)
    print (X_cls_test.shape, y_cls_test.shape)

    return X_cls_train, X_cls_test, y_cls_train, y_cls_test
In [23]:
############ START OF MAIN SOURCE FROM KAGGLE ###############
In [24]:
import numpy as np
import torch
from torch import nn
from torch.autograd import Variable
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from keras.utils import to_categorical  # unused here; kept from the original Iris example
import torch.nn.functional as F
Using TensorFlow backend.
In [25]:
class Model(nn.Module):
    def __init__(self, input_dim):
        super(Model, self).__init__()
        self.layer1 = nn.Linear(input_dim,100)
        self.layer2 = nn.Linear(100, 30)
        self.layer3 = nn.Linear(30, 2)
        self.drop = nn.Dropout(0.2)
        
    def forward(self, x):
        x = F.relu(self.layer1(x))
        x = self.drop(x)
        x = F.relu(self.layer2(x))
        x = self.drop(x)
        x = F.softmax(self.layer3(x), dim=-1) # explicit dim avoids a deprecation warning; see the note on the loss function below
        return x
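
A side note on the softmax above: nn.CrossEntropyLoss (used below) applies log-softmax internally, so feeding it softmax outputs effectively applies softmax twice. Training still works, as the losses below show, but a cleaner variant (just a sketch, not what was run here) returns raw logits and applies softmax only at inference time:

class LogitsModel(nn.Module):
    # hypothetical variant: same layers, no softmax in forward()
    def __init__(self, input_dim):
        super(LogitsModel, self).__init__()
        self.layer1 = nn.Linear(input_dim, 100)
        self.layer2 = nn.Linear(100, 30)
        self.layer3 = nn.Linear(30, 2)
        self.drop = nn.Dropout(0.2)

    def forward(self, x):
        x = F.relu(self.layer1(x))
        x = self.drop(x)
        x = F.relu(self.layer2(x))
        x = self.drop(x)
        return self.layer3(x)  # raw logits, suitable for nn.CrossEntropyLoss

# at inference time: probs = F.softmax(logits_model(x), dim=-1)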

Features, Labels

In [26]:
#features, labels = load_iris(return_X_y=True)
In [27]:
#features[:3]
In [28]:
#labels[:3]
In [29]:
#features_train,features_test, labels_train, labels_test = train_test_split(features, labels, random_state=42, shuffle=True)
In [30]:
# my version
# at this point the variables are numpy arrays, not yet tensors
features_train,features_test, labels_train, labels_test = splitting_and_training(df, predictors_list)
(81925, 18) (81925,)
(35112, 18) (35112,)
In [31]:
features_train[:3]
Out[31]:
array([[0.        , 0.        , 0.        , 0.60373153, 0.        ,
        0.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        ],
       [0.        , 1.        , 0.        , 0.64858266, 0.        ,
        0.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        ],
       [0.        , 0.        , 0.        , 0.57025818, 0.        ,
        0.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        , 1.        , 1.        ,
        1.        , 1.        , 1.        ]])
In [32]:
labels_train[:3]
Out[32]:
array([1, 1, 1])
In [33]:
# make data tensors
features_train = Variable(torch.Tensor(features_train).float())
features_test  = Variable(torch.Tensor(features_test).float())
labels_train   = Variable(torch.Tensor(labels_train).long())
labels_test    = Variable(torch.Tensor(labels_test).long())

x_train = features_train
y_train = labels_train
In [34]:
# Training
model = Model(features_train.shape[1])
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
loss_fn = nn.CrossEntropyLoss()
epochs = 150

def print_(loss):
    print ("The loss calculated: ", loss)

Actual training over several epochs

In [35]:
# Not using dataloader
#######x_train, y_train = Variable(torch.from_numpy(features_train)).float(), Variable(torch.from_numpy(labels_train)).long()
for epoch in range(1, epochs+1):
    print ("Epoch #",epoch)
    y_pred = model(x_train)
    loss = loss_fn(y_pred, y_train)
    print_(loss.item())
    
    # Zero gradients
    optimizer.zero_grad()
    loss.backward() # Gradients
    optimizer.step() # Update
Epoch # 1
The loss calculated:  0.6899831891059875
Epoch # 2
The loss calculated:  0.6668872833251953
Epoch # 3
The loss calculated:  0.6473737955093384
Epoch # 4
The loss calculated:  0.6288973093032837
Epoch # 5
The loss calculated:  0.6122158169746399
Epoch # 6
The loss calculated:  0.5986753106117249
Epoch # 7
The loss calculated:  0.5895068645477295
Epoch # 8
The loss calculated:  0.5843745470046997
Epoch # 9
The loss calculated:  0.5815328359603882
Epoch # 10
The loss calculated:  0.579129695892334
Epoch # 11
The loss calculated:  0.5768385529518127
Epoch # 12
The loss calculated:  0.5740838646888733
Epoch # 13
The loss calculated:  0.5708081126213074
Epoch # 14
The loss calculated:  0.5679550766944885
Epoch # 15
The loss calculated:  0.5655806660652161
Epoch # 16
The loss calculated:  0.5642478466033936
Epoch # 17
The loss calculated:  0.5632559061050415
Epoch # 18
The loss calculated:  0.5633893013000488
Epoch # 19
The loss calculated:  0.5634169578552246
Epoch # 20
The loss calculated:  0.5633461475372314
Epoch # 21
The loss calculated:  0.5636658072471619
Epoch # 22
The loss calculated:  0.5637633204460144
Epoch # 23
The loss calculated:  0.5638047456741333
Epoch # 24
The loss calculated:  0.5638378858566284
Epoch # 25
The loss calculated:  0.5633932948112488
Epoch # 26
The loss calculated:  0.5633214712142944
Epoch # 27
The loss calculated:  0.5627013444900513
Epoch # 28
The loss calculated:  0.5621302127838135
Epoch # 29
The loss calculated:  0.5618124604225159
Epoch # 30
The loss calculated:  0.5615593791007996
Epoch # 31
The loss calculated:  0.5613817572593689
Epoch # 32
The loss calculated:  0.5613062381744385
Epoch # 33
The loss calculated:  0.5610712170600891
Epoch # 34
The loss calculated:  0.5608690977096558
Epoch # 35
The loss calculated:  0.5609871745109558
Epoch # 36
The loss calculated:  0.5610141158103943
Epoch # 37
The loss calculated:  0.5605411529541016
Epoch # 38
The loss calculated:  0.5602871775627136
Epoch # 39
The loss calculated:  0.5602853894233704
Epoch # 40
The loss calculated:  0.5600144863128662
Epoch # 41
The loss calculated:  0.5598403215408325
Epoch # 42
The loss calculated:  0.5599870681762695
Epoch # 43
The loss calculated:  0.5599676370620728
Epoch # 44
The loss calculated:  0.5599595904350281
Epoch # 45
The loss calculated:  0.5598176717758179
Epoch # 46
The loss calculated:  0.5597584247589111
Epoch # 47
The loss calculated:  0.5594111084938049
Epoch # 48
The loss calculated:  0.559596598148346
Epoch # 49
The loss calculated:  0.559237003326416
Epoch # 50
The loss calculated:  0.5590981841087341
Epoch # 51
The loss calculated:  0.5590132474899292
Epoch # 52
The loss calculated:  0.5591598153114319
Epoch # 53
The loss calculated:  0.5590144991874695
Epoch # 54
The loss calculated:  0.5590764284133911
Epoch # 55
The loss calculated:  0.5590685606002808
Epoch # 56
The loss calculated:  0.5589088201522827
Epoch # 57
The loss calculated:  0.5589266419410706
Epoch # 58
The loss calculated:  0.5589046478271484
Epoch # 59
The loss calculated:  0.5589277148246765
Epoch # 60
The loss calculated:  0.5584126114845276
Epoch # 61
The loss calculated:  0.5587561726570129
Epoch # 62
The loss calculated:  0.5587317943572998
Epoch # 63
The loss calculated:  0.5585729479789734
Epoch # 64
The loss calculated:  0.558568000793457
Epoch # 65
The loss calculated:  0.5585795044898987
Epoch # 66
The loss calculated:  0.5583360195159912
Epoch # 67
The loss calculated:  0.5584455132484436
Epoch # 68
The loss calculated:  0.5583195090293884
Epoch # 69
The loss calculated:  0.5583729147911072
Epoch # 70
The loss calculated:  0.558131992816925
Epoch # 71
The loss calculated:  0.55841463804245
Epoch # 72
The loss calculated:  0.5583000183105469
Epoch # 73
The loss calculated:  0.558402955532074
Epoch # 74
The loss calculated:  0.5582301020622253
Epoch # 75
The loss calculated:  0.5581138134002686
Epoch # 76
The loss calculated:  0.5580995678901672
Epoch # 77
The loss calculated:  0.5583104491233826
Epoch # 78
The loss calculated:  0.5583242774009705
Epoch # 79
The loss calculated:  0.558099091053009
Epoch # 80
The loss calculated:  0.5581637024879456
Epoch # 81
The loss calculated:  0.5580411553382874
Epoch # 82
The loss calculated:  0.5580089688301086
Epoch # 83
The loss calculated:  0.5581194162368774
Epoch # 84
The loss calculated:  0.5578945875167847
Epoch # 85
The loss calculated:  0.5582108497619629
Epoch # 86
The loss calculated:  0.5579631924629211
Epoch # 87
The loss calculated:  0.557951807975769
Epoch # 88
The loss calculated:  0.5582349300384521
Epoch # 89
The loss calculated:  0.5580791234970093
Epoch # 90
The loss calculated:  0.5580912232398987
Epoch # 91
The loss calculated:  0.557822585105896
Epoch # 92
The loss calculated:  0.5579474568367004
Epoch # 93
The loss calculated:  0.5580459833145142
Epoch # 94
The loss calculated:  0.5578931570053101
Epoch # 95
The loss calculated:  0.557796835899353
Epoch # 96
The loss calculated:  0.5579246282577515
Epoch # 97
The loss calculated:  0.5579365491867065
Epoch # 98
The loss calculated:  0.5578290820121765
Epoch # 99
The loss calculated:  0.5577880144119263
Epoch # 100
The loss calculated:  0.5579425692558289
Epoch # 101
The loss calculated:  0.5579735040664673
Epoch # 102
The loss calculated:  0.5579021573066711
Epoch # 103
The loss calculated:  0.557834804058075
Epoch # 104
The loss calculated:  0.5576907992362976
Epoch # 105
The loss calculated:  0.5579015016555786
Epoch # 106
The loss calculated:  0.5578812956809998
Epoch # 107
The loss calculated:  0.5578544735908508
Epoch # 108
The loss calculated:  0.5578688383102417
Epoch # 109
The loss calculated:  0.5577183961868286
Epoch # 110
The loss calculated:  0.5578265190124512
Epoch # 111
The loss calculated:  0.5578633546829224
Epoch # 112
The loss calculated:  0.5576661229133606
Epoch # 113
The loss calculated:  0.5575401186943054
Epoch # 114
The loss calculated:  0.5577540993690491
Epoch # 115
The loss calculated:  0.5577229261398315
Epoch # 116
The loss calculated:  0.557586669921875
Epoch # 117
The loss calculated:  0.557704508304596
Epoch # 118
The loss calculated:  0.5576503872871399
Epoch # 119
The loss calculated:  0.5576353073120117
Epoch # 120
The loss calculated:  0.5576555132865906
Epoch # 121
The loss calculated:  0.5576655268669128
Epoch # 122
The loss calculated:  0.5577729940414429
Epoch # 123
The loss calculated:  0.5577083826065063
Epoch # 124
The loss calculated:  0.5575713515281677
Epoch # 125
The loss calculated:  0.5575408339500427
Epoch # 126
The loss calculated:  0.5576742887496948
Epoch # 127
The loss calculated:  0.5575987100601196
Epoch # 128
The loss calculated:  0.5574367046356201
Epoch # 129
The loss calculated:  0.5576167106628418
Epoch # 130
The loss calculated:  0.557499885559082
Epoch # 131
The loss calculated:  0.5573645234107971
Epoch # 132
The loss calculated:  0.5574638843536377
Epoch # 133
The loss calculated:  0.5576833486557007
Epoch # 134
The loss calculated:  0.5575346350669861
Epoch # 135
The loss calculated:  0.5574941039085388
Epoch # 136
The loss calculated:  0.557369589805603
Epoch # 137
The loss calculated:  0.5574191212654114
Epoch # 138
The loss calculated:  0.557375967502594
Epoch # 139
The loss calculated:  0.5574331283569336
Epoch # 140
The loss calculated:  0.5572737455368042
Epoch # 141
The loss calculated:  0.5574622750282288
Epoch # 142
The loss calculated:  0.5573206543922424
Epoch # 143
The loss calculated:  0.5574856996536255
Epoch # 144
The loss calculated:  0.5575013160705566
Epoch # 145
The loss calculated:  0.5574091672897339
Epoch # 146
The loss calculated:  0.557424008846283
Epoch # 147
The loss calculated:  0.557168185710907
Epoch # 148
The loss calculated:  0.5572803616523743
Epoch # 149
The loss calculated:  0.5574122071266174
Epoch # 150
The loss calculated:  0.5571383237838745
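
As the comment above the loop notes, this run trains full-batch, without a DataLoader. A mini-batch version might look like the following sketch (assuming the same model, optimizer and loss_fn; this is not what produced the losses above):

from torch.utils.data import TensorDataset, DataLoader

train_ds = TensorDataset(x_train, y_train)
train_loader = DataLoader(train_ds, batch_size=256, shuffle=True)

for epoch in range(1, epochs + 1):
    for xb, yb in train_loader:          # iterate over mini-batches
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)
        loss.backward()
        optimizer.step()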
In [36]:
# Prediction
######x_test = Variable(torch.from_numpy(features_test)).float()
x_test = features_test

pred = model(x_test)
In [37]:
pred = pred.detach().numpy()
In [38]:
pred
Out[38]:
array([[9.1468744e-02, 9.0853125e-01],
       [9.4404088e-05, 9.9990559e-01],
       [6.0285771e-01, 3.9714229e-01],
       ...,
       [9.9863321e-01, 1.3668122e-03],
       [5.7456392e-01, 4.2543608e-01],
       [1.5819230e-11, 1.0000000e+00]], dtype=float32)

Accuracy evaluation

In [39]:
print ("The accuracy is", accuracy_score(labels_test, np.argmax(pred, axis=1)))
The accuracy is 0.7342503987240829
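
For context, this accuracy should be compared against the majority-class baseline, i.e. the score of always predicting the more frequent class. A quick check (a sketch, assuming labels_test is the tensor created earlier):

labels_np = labels_test.numpy()
baseline = max(labels_np.mean(), 1 - labels_np.mean())
print('majority-class baseline:', baseline)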
In [40]:
# Checking for first value
np.argmax(model(x_test[0]).detach().numpy(), axis=0)
Out[40]:
1
In [41]:
labels_test[0]
Out[41]:
tensor(0)
In [42]:
torch.save(model, "iris-pytorch.pkl")
/home/coil/anaconda3/envs/decision_trees/lib/python3.5/site-packages/torch/serialization.py:402: UserWarning: Couldn't retrieve source code for container of type Model. It won't be checked for correctness upon loading.
  "type " + obj.__name__ + ". It won't be checked "
In [43]:
saved_model = torch.load("iris-pytorch.pkl")
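
The warning emitted by torch.save above appears because the whole Model object is pickled. The generally recommended PyTorch pattern is to save only the weights via state_dict and rebuild the network on load; a sketch of that alternative (with a hypothetical filename, not what this notebook does):

# save weights only
torch.save(model.state_dict(), 'stock-model-state.pt')

# load into a freshly constructed network
restored = Model(features_train.shape[1])
restored.load_state_dict(torch.load('stock-model-state.pt'))
restored.eval()  # eval() disables dropout for inference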
In [44]:
np.argmax(saved_model(x_test[0]).detach().numpy(), axis=0)
Out[44]:
1
In [45]:
x_test[0]
Out[45]:
tensor([1.0000, 0.0000, 0.0000, 0.5639, 0.0000, 0.0000, 1.0000, 1.0000, 1.0000,
        1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000])
In [46]:
for i in x_test[:3]:
    print(i)
    prediction = np.argmax(saved_model(i).detach().numpy(), axis=0)
    print('prediction', prediction)
tensor([1.0000, 0.0000, 0.0000, 0.5639, 0.0000, 0.0000, 1.0000, 1.0000, 1.0000,
        1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000])
prediction 1
tensor([0.0000, 1.0000, 0.0000, 0.6819, 0.0000, 0.0000, 1.0000, 1.0000, 1.0000,
        1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000])
prediction 1
tensor([0.0000, 0.0000, 0.0000, 0.4684, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
        0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000])
prediction 0
In [47]:
############ END OF MAIN SOURCE FROM KAGGLE ###############

Prediction on unknown data

Let's provide the model with new stock data it was not trained on to see how it performs.

In [72]:
#ticker='BP'
#ticker='ABBV'
#ticker='GILD'
#ticker='NGG'
#ticker='BPY'
ticker='AIR'
In [73]:
def plot_stock_prediction(df, ticker):
    # plot values and significant levels
    plt.figure(figsize=(30,7))
    plt.title('Predictive model ' + str(ticker))
    plt.plot(df['Date'], df['Adj Close'], label='Adj Close', alpha=0.2)

    plt.plot(df['Date'], df['EMA10'], label='EMA10', alpha=0.2)
    plt.plot(df['Date'], df['EMA20'], label='EMA20', alpha=0.2)
    plt.plot(df['Date'], df['EMA30'], label='EMA30', alpha=0.2)
    plt.plot(df['Date'], df['EMA40'], label='EMA40', alpha=0.2)
    plt.plot(df['Date'], df['EMA50'], label='EMA50', alpha=0.2)
    plt.plot(df['Date'], df['EMA100'], label='EMA100', alpha=0.2)
    plt.plot(df['Date'], df['EMA150'], label='EMA150', alpha=0.99)
    plt.plot(df['Date'], df['EMA200'], label='EMA200', alpha=0.2)


    plt.scatter(df['Date'], df['Buy']*df['Adj Close'], label='Buy', marker='^', color='magenta', alpha=0.15)
    #plt.scatter(df.index, df['sell_sig'], label='Sell', marker='v')

    plt.legend()

    plt.show()

    return None   
In [74]:
new_df = get_data(ticker)
connected to yahoo
In [75]:
new_df = compute_technical_indicators(new_df)
In [76]:
new_df = compute_features(new_df)
In [77]:
new_df=define_target_condition(new_df)
In [78]:
saved_model = torch.load("iris-pytorch.pkl")
In [79]:
def predict_timeseries(df):
    
    # making sure we have good dimensions
    # column will be rewritten later
    df['Buy'] = df['target_cls']
    
    # predict row by row (simple but slow; see the vectorized sketch below)
    for i in range(len(df)):
        X_cls_valid = [[df['aboveSAR'][i],df['aboveUpperBB'][i],df['belowLowerBB'][i],
                        df['normRSI'][i],df['oversoldRSI'][i],df['overboughtRSI'][i],
                        df['aboveEMA5'][i],df['aboveEMA10'][i],df['aboveEMA15'][i],df['aboveEMA20'][i],
                        df['aboveEMA30'][i],df['aboveEMA40'][i],df['aboveEMA50'][i],
                        df['aboveEMA60'][i],df['aboveEMA70'][i],df['aboveEMA80'][i],df['aboveEMA90'][i],
                        df['aboveEMA100'][i]]]    

        x_test = Variable(torch.Tensor(X_cls_valid).float())    

        prediction = np.argmax(saved_model(x_test[0]).detach().numpy(), axis=0)

        # .loc avoids the chained-assignment pitfall of df['Buy'][i] = ...
        df.loc[i, 'Buy'] = prediction

    print(df.head())    
        
    return df
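
The row-by-row loop above is easy to follow but slow on long histories. An equivalent vectorized sketch (assuming the same 18 predictor columns, in the same order as in the loop) runs a single forward pass over the whole frame:

def predict_timeseries_vectorized(df):
    # hypothetical batched variant of predict_timeseries
    cols = ['aboveSAR', 'aboveUpperBB', 'belowLowerBB',
            'normRSI', 'oversoldRSI', 'overboughtRSI',
            'aboveEMA5', 'aboveEMA10', 'aboveEMA15', 'aboveEMA20',
            'aboveEMA30', 'aboveEMA40', 'aboveEMA50',
            'aboveEMA60', 'aboveEMA70', 'aboveEMA80', 'aboveEMA90',
            'aboveEMA100']
    x_all = torch.Tensor(df[cols].values).float()
    df['Buy'] = np.argmax(saved_model(x_all).detach().numpy(), axis=1)
    return df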
In [80]:
new_df = predict_timeseries(new_df)
        Date      High       Low  Open     Close   Volume  Adj Close      EMA5  EMA10  EMA15  ...  aboveEMA100  aboveEMA150  aboveEMA200  aboveUpperBB  belowLowerBB  aboveSAR  oversoldRSI  overboughtRSI  target_cls  Buy
0 1980-03-17  3.925926  3.814815   0.0  3.888889   6300.0   2.220627  0.000000    0.0    0.0  ...            0            0            0             0             0         0            0              0           1    0
1 1980-03-18  3.777778  3.592592   0.0  3.703704  12100.0   2.114883  0.000000    0.0    0.0  ...            0            0            0             0             0         0            0              0           1    0
2 1980-03-19  3.666667  3.629630   0.0  3.629630   6000.0   2.072587  0.000000    0.0    0.0  ...            0            0            0             0             0         0            0              0           1    0
3 1980-03-20  3.629630  3.481482   0.0  3.481482   8700.0   1.987990  0.000000    0.0    0.0  ...            0            0            0             0             0         0            0              0           1    0
4 1980-03-21  3.555556  3.481482   0.0  3.518518  12300.0   2.009139  2.081045    0.0    0.0  ...            0            0            0             0             0         0            0              0           1    0

[5 rows x 48 columns]
In [81]:
plot_stock_prediction(new_df, ticker)
In [86]:
# zoom in on the data
temp_df = new_df[-3000:-2000]
In [87]:
plot_stock_prediction(temp_df, ticker)

In this test on unseen data, the model performs reasonably well at identifying the overall uptrend.

Note:

When the markers sit on the price curve in the graph, the neural network is signaling 'Buy'. When the markers sit at the 0 level, the network is signaling 'Sell / Don't buy'.