5

My dataframe has daily stock data in it:

       Date       AAPL      NFLX       INTC  
0 2008-01-02  27.834286  3.764286  25.350000    
1 2008-01-03  27.847143  3.724286  24.670000    
2 2008-01-04  25.721428  3.515714  22.670000   
3 2008-01-07  25.377142  3.554286  22.879999    
4 2008-01-08  24.464285  3.328571  22.260000  

I'd like to calculate monthly returns using the last day of each month in my df above. I'm guessing (after googling) that resample is the best way to select the last trading day of the month. But this doesn't seem to work:

df.set_index('Date')  
m1= df.resample('M')
print(m1)

get this error:

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

So I think that means the set_index isn't working?

I've also tried:

df= df.reset_index().set_index('Date')  
m1= df.resample('M')
print(m1)

But I get the same error message as above. Thanks much for your help.

DYZ
  • 55,249
  • 10
  • 64
  • 93
Eksana Stasis
  • 173
  • 1
  • 1
  • 7
  • 1
    `set_index` is not an in-place operation without the `inplace` flag. That's why your first attempt doesn't work, you aren't actually changing your dataframe. – user3483203 Jun 23 '18 at 02:07
  • 3
    Try `df.set_index('Date').resample('M').last()` – user3483203 Jun 23 '18 at 02:10
  • tried df.set_index('Date', inplace=True) df.resample('M') but still get same error. Also tried your earlier suggestion, df.set_index('Date').resample('M').last() but no luck so far – Eksana Stasis Jun 23 '18 at 02:20
  • for my imports I have import pandas as pd import numpy as np import datetime from pandas import DataFrame – Eksana Stasis Jun 23 '18 at 02:21
  • You need to first convert the index to datetimeindex. df.index = pd.to_datetime(df.index). If Date is a column, df['Date'] = pd.to_datetime(df['Date']) – Vaishali Jun 23 '18 at 02:24
  • Hi Vaishali, have those 2 lines of code in there, but while the output is showing date as the index now, it's still printing every single date, instead of the last day of the month – Eksana Stasis Jun 23 '18 at 02:47
  • can you print df.dtypes? – Vaishali Jun 23 '18 at 02:56
  • sure, it prints: AAPL float64 NFLX float64 INTC float64 dtype: object – Eksana Stasis Jun 23 '18 at 02:58
  • import pandas as pd import numpy as np import datetime from pandas import DataFrame # import stock price data df = pd.read_csv('etc) df.index = pd.to_datetime(df.index) df['Date'] = pd.to_datetime (df['Date']) df.set_index('Date', inplace=True) df.resample('M').last() print(df) – Eksana Stasis Jun 23 '18 at 03:00
  • You are converting index to datetime before setting Date column to index. Correct order should be df = pd.read_csv('etc') df.set_index('Date', inplace=True) df.index = pd.to_datetime(df.index) df = df.resample('M').last() – Vaishali Jun 23 '18 at 03:05
  • 1
    phew! that worked Vaishali, thank you so much for your patience with me! really appreciate it :-) – Eksana Stasis Jun 23 '18 at 03:13

1 Answers1

13

Your index is not a DatetimeIndex. But you can make it a DatetimeIndex:

df.set_index('Date', inplace=True)
df.index = pd.to_datetime(df.index)
df.resample('1M').mean()
#                 AAPL      NFLX    INTC
#Date                                   
#2008-01-31  26.248857  3.577429  23.566
DYZ
  • 55,249
  • 10
  • 64
  • 93