In this article we will get stock price data from an API, temporarily store it in dataframe and persistently move it to a database.
For a database we will choose sqlite3 that is part of standard library in python. There will be no need to install any database or to run any database daemon all the time on our system. Another advantage of sqlite3 is the fact that it makes database backups super easy, the database is just a file and can be moved around and backed up as any other file.
Let's import the necessary packages.
#optional installations:
#!pip install yfinance --upgrade --no-cache-dir
#!pip3 install pandas_datareader
import sqlite3
import pandas as pd
# pretty printing of pandas dataframe
pd.set_option('expand_frame_repr', False)
# for pandas_datareader, sometimes there can be version mismatch
pd.core.common.is_list_like = pd.api.types.is_list_like
import pandas_datareader.data as web
import datetime
import time
#newest yahoo API
import yfinance as yahoo_finance
Call the Yahoo API and store the data in the Pandas dataframe.
# ___variables___
ticker = 'AAPL' # Apple stock
start_time = datetime.datetime(2017, 10, 1)
#end_time = datetime.datetime(2019, 1, 20)
end_time = datetime.datetime.now().date().isoformat() # today
# get daily stock data from yahoo API
connected = False
while not connected:
try:
df = web.get_data_yahoo(ticker, start=start_time, end=end_time)
connected = True
print('connected to yahoo')
except Exception as e:
print("type error: " + str(e))
time.sleep( 5 )
pass
# databases do not like column names that contain spaces
df.rename(columns={"Adj Close": "Adj_Close"}, inplace = True)
# use numerical integer index instead of date
df = df.reset_index()
print(df.head(5))
print(df.tail(5))
The sqlite3.connect
connects to the database and creates it if it does not exist. The cur
will act like the the actual cursor in the command line when we manually interact with the database. We will feed the SQL commands to cur
(to be more precise to cur.execute
).
# ___database_part___
engine = sqlite3.connect('stock_database.db')
cur = engine.cursor()
Exporting whole dataframe into database table is just one line command. The con
directive is using our engine
that is connecting to the database of our selection (in this case stock_database.db
database). In this database it creates table called stock_table
. If such table exists, it is replaced (other options are to append data or to raise an error). The index = True
makes sure we keep the index (row numbers) that we set for the dataframe in previous steps.
# data frame to database
df.to_sql('stock_table', con=engine, if_exists='replace', index = True)
Finally we can run SQL query with python.
print('printing results of the SELECT statement: ')
print(' ')
select_statement = 'SELECT * FROM stock_table WHERE `Adj_Close` > 270.0;'
cur.execute(select_statement)
for line in cur.fetchall():
print(line)
# closing the database session
cur.close()
engine.close()
We can also connect to the database manually via terminal to see what is there and if the output matches with the python query:
Well, there are some rounding errors/differences in the outputs, but that is topic for another article.
Sources: