I have this DataFrame:

import numpy as np
import pandas as pd

np.random.seed(0)

start_d = pd.to_datetime('2018-01-01 00:00:00', format='%Y-%m-%d %H:%M:%S')
end_d = pd.to_datetime('2018-01-28 00:00:00', format='%Y-%m-%d %H:%M:%S')

# Daily DatetimeIndex covering 28 days (four full weeks)
index = pd.date_range(start=start_d, end=end_d)

# Two columns of random integers in [0, 100)
df = pd.DataFrame(index=index, data=np.random.randint(0, 100, size=(28, 2)), columns=list('AB'))

I would like to compute the correlation between the two series, but on a weekly basis. In other words, I am thinking of a sort of resample with a specific apply. I would like to apply both Pearson and Spearman. To make myself clear, something like:

df.resample('W').corr(method='spearman')

What do you think? Is it possible to do something similar?

Best.

diedro

1 Answer

If I understand this question correctly, you're trying to get the correlation at a week level. Are there multiple years of dates?

If you only have one year:

# Set the ISO week number (the dates in the question live in the index, not in a 'Date' column):
df['Week_Number'] = df.index.isocalendar().week

# Now group by the week number, and get the correlation matrix for each week:
df.groupby('Week_Number')[['A', 'B']].corr()

If you have >1 year:

# Set both the week and the year (again taken from the DatetimeIndex):
df['Week_Number'] = df.index.isocalendar().week
df['Year'] = df.index.year

# Now group by the week number and year, and get the correlation matrix for each group:
df.groupby(['Week_Number', 'Year'])[['A', 'B']].corr()
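
When the data spans a year boundary, it may be safer to take the year from the ISO calendar as well, so that week and year come from the same calendar. The sketch below uses made-up demo data (the name `demo` and the date range are assumptions, not part of the question) to show how the two choices of year can group the same days differently:

```python
import numpy as np
import pandas as pd

# Hypothetical demo data: two full ISO weeks that straddle New Year 2019.
idx = pd.date_range("2018-12-24", "2019-01-06")
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.integers(0, 100, size=(len(idx), 2)),
                    index=idx, columns=list("AB"))

iso = demo.index.isocalendar()            # DataFrame with columns: year, week, day
week = iso["week"].astype(int)
iso_year = iso["year"].astype(int)
cal_year = pd.Series(demo.index.year, index=demo.index, name="cal_year")

# Calendar year + ISO week: 2018-12-31 belongs to ISO week 1 of ISO year 2019,
# so pairing it with calendar year 2018 splits that week into two groups.
mixed = demo.groupby([cal_year, week]).size()

# ISO year + ISO week: each group is one complete Monday-to-Sunday week.
consistent = demo.groupby([iso_year, week]).size()

print(mixed)
print(consistent)
```

With the ISO year, every group holds seven consecutive days; with the calendar year, the week containing 31 December is split across two groups.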
Alexandre Daly
  • Note that mixing ISO week numbers and non-ISO years can result in a date at the end of a year being coded as the end of the next year: https://stackoverflow.com/questions/70734265/pandas-pivot-table-function-values-into-wrong-rows/70734667#70734667 – Nick ODell Mar 20 '23 at 20:23
  • This seems like a nice solution. However, I would like a more general one, a sort of "dfr.resample('W').apply(corr())". It would have two advantages: first, it could be extended to my own function; second, the output would be easier to handle. I do not actually need the whole correlation matrix, and the current output is difficult to work with, at least for me. I need only one column with the correlation between 'A' and 'B' for each week. – diedro Mar 22 '23 at 09:08
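
One possible way to get the single column asked for in the last comment is to iterate over the weekly bins produced by `resample` and keep only the A/B entry of each bin's correlation matrix. This is a minimal sketch, not a definitive implementation: `weekly_corr` is a hypothetical helper name, and it assumes the columns are called `A` and `B` as in the question.

```python
import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question.
np.random.seed(0)
index = pd.date_range(start="2018-01-01", end="2018-01-28")
df = pd.DataFrame(np.random.randint(0, 100, size=(28, 2)),
                  index=index, columns=list("AB"))


def weekly_corr(frame, method="pearson", freq="W"):
    """Return one A-vs-B correlation per resampling bin (hypothetical helper)."""
    out = {}
    # Iterating over a resampler yields (bin label, sub-DataFrame) pairs.
    for label, chunk in frame.resample(freq):
        # DataFrame.corr gives a 2x2 matrix; keep only the off-diagonal A/B entry.
        out[label] = chunk[["A", "B"]].corr(method=method).loc["A", "B"]
    return pd.Series(out, name=method)


# One row per week, one column per correlation method.
result = pd.DataFrame({
    "pearson": weekly_corr(df, "pearson"),
    "spearman": weekly_corr(df, "spearman"),
})
print(result)
```

The body of the loop can be swapped for any function that maps a weekly sub-DataFrame to a scalar, which covers the "own function" case. The `'W'` frequency ends each bin on Sunday, so the 28 days in the question fall into four bins labelled 2018-01-07 through 2018-01-28.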