4

I have a dataframe and i want to calculate every 12 hours mean for all the columns. dataframe has more than 200k rows.

          DateTime Speed   TRQ         ...    PtoP3  RMS3   Crest3
0       2016-07-01 00:00   994  35.4   ...       NA    NA       NA
1       2016-07-01 00:01   995  34.6   ...       NA    NA       NA
2       2016-07-01 00:02   995    34   ...       NA    NA       NA

i wrote this

Present_data.to_datetime(Present_data['DateTime'])
Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H', key='DateTime')).mean()
print(Total_12hravg_all)

and got this error

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

James Z
  • 12,209
  • 10
  • 24
  • 44
sharma_re
  • 87
  • 7

1 Answers1

1

If Datetime is column:

Your solution shoud be working nice:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H', key='DateTime')).mean()

Alternative solution is use resample with parameter on:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Total_12hravg_all = Present_data.resample('12H', on='DateTime').mean()

Or create DatetimeIndex:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Present_data = Present_data.set_index('DateTime')
Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H')).mean()
#resample
#Total_12hravg_all = Present_data.resample('12H').mean()

If Datetime is index:

Present_data.index = pd.to_datetime(Present_data.index)

Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H')).mean()
#resample
#Total_12hravg_all = Present_data.resample('12H').mean()

Final solution:

Present_data['DateTime'] = pd.to_datetime(Present_data['DateTime'])
Present_data = Present_data.set_index('DateTime')

#convert non numeri values to NaNs
Present_data = Present_data.apply(lambda x: pd.to_numeric(x, errors='coerce'))

Total_12hravg_all = Present_data.groupby(pd.Grouper(freq='12H')).mean()
#resample
#Total_12hravg_all = Present_data.resample('12H').mean()
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252