Here we will see how we can aggregate daily OHLC stock data into weekly time window.

Will be using pandas library to perform the resampling

df.resample('W').agg(agg_dict)
  • resample('W') means we will be using Weekly time window for aggregation
  • agg(agg_dict) takes dictionary as a parameter, the dictionary says in which way we will aggregate column data (we can use mean, min, max and so on)

Weekly resampled OHLC data can be used for further computation of indicators (such as weekly RSI or weekly moving averages).

Import the libraries

In [58]:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))
# ___library_import_statements___
import pandas as pd
# make pandas to print dataframes nicely
pd.set_option('expand_frame_repr', False)  
import pandas_datareader.data as web
import matplotlib.pyplot as plt
import datetime
#newest yahoo API 
import yfinance as yahoo_finance
%matplotlib inline

Specify stock ticker and time range for historical data

In [59]:
# ___variables___
ticker = 'AAPL'

start_time = datetime.datetime(2017, 10, 1)
#end_time = datetime.datetime(2019, 1, 20)
end_time = datetime.datetime.now().date().isoformat()         # today

Download the data from Yahoo! Finance

In [60]:
# yahoo gives only daily historical data
def get_data(ticker, start_time, end_time):
    connected = False
    while not connected:
        try:
            df = web.get_data_yahoo(ticker, start=start_time, end=end_time)
            connected = True
            print('connected to yahoo')
        except Exception as e:
            print("type error: " + str(e))
            time.sleep( 5 )
            pass   
    
    return df        
        
# run the function
df = get_data(ticker, start_time, end_time)        
connected to yahoo
In [61]:
print(df.head(7))
print(df.tail(7))
                  High         Low        Open       Close      Volume   Adj Close
Date                                                                              
2017-10-02  154.449997  152.720001  154.259995  153.809998  18698800.0  148.757111
2017-10-03  155.089996  153.910004  154.009995  154.479996  16230300.0  149.405106
2017-10-04  153.860001  152.460007  153.630005  153.479996  20163800.0  148.437958
2017-10-05  155.440002  154.050003  154.179993  155.389999  21283800.0  150.285172
2017-10-06  155.490005  154.559998  154.970001  155.300003  17407600.0  150.198135
2017-10-09  156.729996  155.490005  155.809998  155.839996  16262900.0  150.720398
2017-10-10  158.000000  155.100006  156.059998  155.899994  15617000.0  150.778427
                  High         Low        Open       Close      Volume   Adj Close
Date                                                                              
2019-12-30  292.690002  285.220001  289.459991  291.519989  36028600.0  291.519989
2019-12-31  293.679993  289.519989  289.929993  293.649994  25201400.0  293.649994
2020-01-02  300.600006  295.190002  296.239990  300.350006  33870100.0  300.350006
2020-01-03  300.579987  296.500000  297.149994  297.429993  36580700.0  297.429993
2020-01-06  299.959991  292.750000  293.790009  299.799988  29596800.0  299.799988
2020-01-07  300.899994  297.480011  299.839996  298.390015  27218000.0  298.390015
2020-01-08  303.040009  297.156006  297.160004  302.542694  18122999.0  302.542694

Weekly timeframe aggregation

Resample using dictionary, each dataframe column will be aggregated differently. For example when we want lowest price during the weekly window, we will use 'Low': 'min'. Aggregation of other columns has similar logic.

In [62]:
agg_dict = {'Open': 'first',
          'High': 'max',
          'Low': 'min',
          'Close': 'last',
          'Adj Close': 'last',
          'Volume': 'mean'}

# resampled dataframe
# 'W' means weekly aggregation
r_df = df.resample('W').agg(agg_dict)
In [63]:
print(r_df.head(7))
print('---------')
print(r_df.tail(7))
                  Open        High         Low       Close   Adj Close      Volume
Date                                                                              
2017-10-08  154.259995  155.490005  152.460007  155.300003  150.198135  18756860.0
2017-10-15  155.809998  158.000000  155.100006  156.990005  151.832642  16260960.0
2017-10-22  157.899994  160.869995  155.020004  156.250000  151.116959  25210260.0
2017-10-29  156.889999  163.600006  155.270004  163.050003  157.693542  24480660.0
2017-11-05  163.889999  174.259995  163.720001  172.500000  166.833115  43035480.0
2017-11-12  172.369995  176.240005  171.720001  174.669998  169.539093  27685080.0
2017-11-19  173.500000  174.500000  168.380005  170.149994  165.151871  23291940.0
---------
                  Open        High         Low       Close   Adj Close        Volume
Date                                                                                
2019-12-01  262.709991  268.000000  262.500000  267.250000  267.250000  1.881758e+07
2019-12-08  267.269989  271.000000  256.290009  270.709991  270.709991  2.282996e+07
2019-12-15  270.000000  275.299988  264.910004  275.149994  275.149994  2.840588e+07
2019-12-22  277.000000  282.649994  276.980011  279.440002  279.440002  3.663600e+07
2019-12-29  280.529999  293.970001  280.369995  289.799988  289.799988  2.415238e+07
2020-01-05  289.459991  300.600006  285.220001  297.429993  297.429993  3.292020e+07
2020-01-12  293.790009  303.040009  292.750000  302.542694  302.542694  2.497927e+07

Plot daily and weekly data

In [64]:
def plot_OHLC(data, ticker, window):
    
    #Run these commands to avoid Future warning for matplotlib
    from pandas.plotting import register_matplotlib_converters
    register_matplotlib_converters()

    plt.figure(figsize=(15,5))
    plt.title('{} price data in {} timeframe'.format(ticker, window))
    plt.plot(data['Open'])
    plt.plot(data['High'])
    plt.plot(data['Low'])
    plt.plot(data['Close'])
    plt.plot(data['Adj Close'])
    plt.legend()
    plt.show()
    
def plot_volume(data, ticker, window):
    plt.figure(figsize=(15,2))
    plt.title('{} volume data in {} timeframe'.format(ticker, window))
    plt.plot(data['Volume'])
    plt.legend()
    plt.show()    

Plot price:

In [65]:
plot_OHLC(df, ticker, 'daily')
plot_OHLC(r_df, ticker, 'weekly')

Plot volume:

In [66]:
plot_volume(df, ticker, 'daily')
plot_volume(r_df, ticker, 'weekly')

Summary:

We have transformed daily stock data to weekly time window. This transformation might be benefitial for writing stock screening scripts for swing trading on larger timeframes. The weekly window should get us focused on the bigger picture and get rid of great part of the noise in the data.


The only concern is that we are aggregating on weekly time window, but stock market trades only 5 days a week. Will need to check further, maybe we will need 5 day window. But so far it looks like Pandas is clever, since weekly aggregation ends on Sunday every time. If today is Tuesday and we pass today's data into weekly aggregation, resampling shows the value for upcoming Sunday, which is exactly what we wanted. So we should be fine.