Identifying Extrema in Financial Data using Pandas

Question

I have daily S&P 500 prices and Treasury yields. The end goal is to determine how Treasuries perform, graphically and mathematically, during corrections in the S&P. A correction is a decline of some % off the last peak, with the % being a mutable parameter.

import urllib2, pandas as pd, numpy as np, matplotlib.pyplot as plt, scipy as sp

correction = 0.1    # define % decline from peak to constitute market correction

sp_data = urllib2.urlopen('http://real-chart.finance.yahoo.com/table.csv?s=%5EGSPC&a=00&b=3&c=1950&d=00&e=14&f=2016&g=d&ignore=.csv')
df1 = pd.read_csv(sp_data)
df1 = df1[['Date','Close']]
df1 = df1.rename(columns = {'Close':'S&P_500'})

t_bill_data = urllib2.urlopen('http://real-chart.finance.yahoo.com/table.csv?s=%5ETNX&a=00&b=2&c=1962&d=00&e=14&f=2016&g=d&ignore=.csv')
df2 = pd.read_csv(t_bill_data)
df2 = df2[['Date','Close']]
df2 = df2.rename(columns = {'Close':'T_Bill'})

df3 = pd.merge(df1, df2, on='Date', how='outer')

df3['Date'] = pd.to_datetime(df3['Date'], format='%Y-%m-%d')
df3 = df3.set_index('Date')

df3.describe()
df3.plot(kind='line',title='S&P 500 vs. 10 yr T-Bill',subplots=True)

How can I identify and subset the df into distinct periods of S&P corrections? (Allowing the graph plot and summary statistics to focus on unique time periods. So I can determine a correlation between S&P corrections and Treasuries.) Scipy has tools for identifying global or local minima and maxima -- is there a pythonic method to tailor these to identify periods of correction?

Sergey Bushmanov · Accepted Answer · 2021-04-12T09:04:37.090

I will answer your question from purely Pandas standpoint (rather than using urlib or numpy) as Pandas was specifically made to address almost any practical question arising in retrieving and munging financial data.

1. How to identify distinct periods of S&P corrections?

Let's define correction as a 20% or more market decline over recent (say 90 days) peak:

import pandas as pd
from pandas_datareader import data
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = (15,5)

spx = data.get_data_yahoo('^GSPC', start = '1970-01-01')
tnx = data.get_data_yahoo('^TNX', start = '1970-01-01')

WINDOW = 90
CORRECTION = .2
spx_bear = spx['Close'].rolling(WINDOW).apply(lambda x: x[-1]/x.max() < (1-CORRECTION))

data_df = pd.DataFrame({'SP500': spx['Close'],
                        'Bonds': tnx['Close'],
                        'Bear market': spx_bear})

data_df.tail()


    Bear market Bonds   SP500
Date            
2016-01-11  0   2.158   1923.670044
2016-01-12  0   2.102   1938.680054
2016-01-13  0   2.066   1890.280029
2016-01-14  0   2.098   1921.839966
2016-01-15  0   2.033   1880.329956

You may play with window and correction parameters to obtain different "versions" of corrections.

2. Plotting

plot_df = data_df['2008':'2009']

_, ax = plt.subplots()
ax2 = ax.twinx()

plot_df['Bonds'].plot(ax=ax)
plot_df['Bear market'].plot(ax=ax2, style='r--', ylim=[-.1, 1.1])
ax.set_title('Treasuries Performance during SP500 Corrections');

3. Subsetting and summary statistics

Finally, there will be two ways to explore the resulting dataset: with pandas .groupby() or straightforward subsetting. In both cases we'll need returns, not prices:

ret_df = pd.DataFrame({'SP500': spx['Close'].pct_change(),
                       'Bonds': tnx['Close'].pct_change(),
                       'Bear market': spx_bear})

ret_df.groupby('Bear market').agg('mean')

    Bonds   SP500
Bear market     
0   0.000042    0.000430
1   -0.002679   -0.003261


ret_df[ret_df['Bear market'] == 1][['Bonds','SP500']].corr()
    Bonds   SP500
Bonds   1.000000    0.253068
SP500   0.253068    1.000000

Edit:

you'll see several times "bear" in the code. The reason is that I borrowed this code from my small project to identify periods of "bear markets", but this code is applicable to any correction if you disregard words "bear" and the value "-20%", which are the definition of a bear market.

hi @Sergey Bushmanov was going through your and from 5yrs ago, and it seems rolling_apply is replaced by rolling in python - so how would your code for function "bear_market" return value change? — user13412850, Apr 11 '21 at 22:37

Identifying Extrema in Financial Data using Pandas

1 Answers1

Linked