I have a DataFrame with a datetime index, which I read in as follows:
import pandas as pd

def read_station_data( fileName ):
    '''Read the contents of the soil moisture data file into a Pandas DataFrame
    where the index is the observation date. Return the dataframe. '''
    data = pd.read_csv(fileName, sep='\t', index_col=0, skiprows=2)
    data.index = pd.to_datetime(data.index)
    return data
The output looks like this:
Sta 0-10cm 10-30 30-50 ... 130-150 150-170 170-190 190-200
Date ...
1981-02-19 1 37.10 74.15 79.53 ... 73.07 71.67 58.49 27.99
1981-02-24 1 33.28 69.96 76.91 ... 71.74 70.15 57.41 28.33
1981-03-02 1 32.37 66.66 73.27 ... 74.85 73.16 59.72 29.18
1981-03-09 1 31.97 64.64 71.31 ... 72.09 71.84 57.64 28.86
1981-03-17 1 26.23 63.04 70.06 ... 72.89 72.13 58.10 28.71
... ... ... ... ... ... ... ... ...
2004-06-30 5 31.72 69.89 73.18 ... 60.34 56.52 54.19 27.04
2004-06-30 11 33.35 58.07 62.65 ... 78.06 77.20 74.69 38.24
2004-06-30 13 27.16 52.77 59.70 ... 86.54 81.86 74.03 39.80
2004-06-30 15 23.94 60.69 76.37 ... 67.09 70.22 81.64 41.20
2004-06-30 82 23.66 41.70 67.54 ... 72.18 73.12 78.96 41.20
[8068 rows x 12 columns]
I then added two more columns to it:
def compute_total_moisture( DataDF ):
    '''Sum the soil moisture per soil column, which has been measured as
    depth of water, so can simply be added together. Also compute the
    volumetric water content of the total soil column, by dividing by the
    total depth (2000 mm) and multiplying by 100%. Return the original
    dataframe with two additional columns called 'Total Water Depth (mm)'
    and 'Total VWC (%)'.'''
    DataDF['Total Water Depth (mm)'] = DataDF.iloc[:,1:12].sum(axis=1)
    DataDF['Total VWC (%)'] = (DataDF['Total Water Depth (mm)']/2000)*100
    return DataDF
Now I want to compute annual average values of 'Total Water Depth (mm)' from this data: group by "Sta", then resample the data annually and sum 'Total Water Depth (mm)'.
def compute_average_moisture_by_station( DataDF, MetaDF ):
    '''Compute the annual average total soil moisture as a depth and as VWC
    for each station. Add as columns to a copy of the station info dataframe.
    Also compute the annual seasonal average VWC for each station and add to
    the same new dataframe. Returned dataframe has all of the original columns
    from the station information file, plus two columns for annual average total
    soil moisture, and four columns for annual average seasonal VWC.'''
    metaDF_copy = MetaDF.copy()
    newDF = DataDF.copy() # copy dataframe
    newDF = newDF.groupby('Sta') # group dataframe elements by station
    newDF.index = pd.to_datetime(newDF.index) # <-- the error is raised on this line
    # annual total water depth
    metaDF_copy['Annual Total Water Depth (mm)'] = newDF.resample("A(S)-SEP")['Date'].sum(['Total Water Depth (mm)'])
I am getting the following error:
AttributeError: 'DataFrameGroupBy' object has no attribute 'index'
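For context, `groupby('Sta')` returns a `DataFrameGroupBy` object rather than a DataFrame, so there is no `index` attribute to assign to. The index of `DataDF` is also already a `DatetimeIndex` after `read_station_data`, so the conversion should not be needed at that point. Below is a minimal sketch of one possible way to get annual sums per station and then their average. The toy values are invented for illustration, and grouping by calendar year is an assumption made to keep the example short; a September-based water year as in the question would need a different year key.

```python
import pandas as pd

# Toy stand-in for DataDF; all values are invented for illustration.
dates = pd.to_datetime(
    ["1981-02-19", "1981-03-02", "1982-02-19", "1981-02-19", "1982-03-02"]
)
toy = pd.DataFrame(
    {
        "Sta": [1, 1, 1, 5, 5],
        "Total Water Depth (mm)": [10.0, 20.0, 30.0, 1.0, 2.0],
    },
    index=dates,
)

# Group on the original DataFrame (not on the GroupBy object):
# one key is the station column, the other is the calendar year of the index.
annual_sum = toy.groupby(["Sta", toy.index.year])["Total Water Depth (mm)"].sum()
annual_sum.index.names = ["Sta", "Year"]

# Average of the annual sums, one value per station.
station_mean = annual_sum.groupby(level="Sta").mean()

print(annual_sum)
print(station_mean)
```

The key point is that the grouping keys are passed to `groupby` on the DataFrame itself, so nothing has to be assigned to the grouped object.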
For reference, this is what the MetaDF dataframe looks like:
Name Code Lat Lon Altitude
No.
1 Bondville BVL 40.05 -88.22 213
2 Dixon Springs-Bare DXB 37.45 -88.67 165
3 Brownstown BRW 38.95 -88.95 177
4 Orr Center (Perry) ORR 39.80 -90.83 206
5 De Kalb DEK 41.85 -88.85 265
6 Monmouth MON 40.92 -90.73 229
8 Peoria ICC 40.70 -89.52 207
9 Springfield LLC 39.52 -89.62 177
10 Belleville FRM 38.52 -89.88 133
11 Carbondale SIU 37.72 -89.23 137
12 Olney OLN 38.73 -88.10 134
13 Freeport FRE 42.28 -89.67 265
14 Rend Lake (Ina) RND 38.13 -88.92 130
15 Stelle STE 40.95 -88.17 213
16 Topeka MTF 40.30 -89.90 152
17 Oak Run OAK 40.97 -90.15 229
34 Fairfield FAI 38.38 -88.38 136
81 Champaign CMI 40.08 -88.23 219
82 Dixon Springs-Grass DXG 37.45 -88.67 165
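Since the station table is indexed by station number ("No."), a per-station result can probably be attached to it by index alignment, assuming the "Sta" values in the data correspond to the "No." values here. The sketch below uses a three-row slice of the table above and a hypothetical per-station Series (`station_mean`, with made-up values) to show the idea:

```python
import pandas as pd

# Three rows from the station table above, indexed by station number.
meta = pd.DataFrame(
    {"Name": ["Bondville", "De Kalb", "Peoria"], "Code": ["BVL", "DEK", "ICC"]},
    index=pd.Index([1, 5, 8], name="No."),
)

# Hypothetical per-station result keyed by station number (made-up values).
station_mean = pd.Series({1: 30.0, 5: 1.5})

meta_copy = meta.copy()
# Column assignment from a Series aligns on index labels,
# so stations with no entry in station_mean end up as NaN.
meta_copy["Annual Total Water Depth (mm)"] = station_mean

print(meta_copy)
```

Because the assignment aligns on labels rather than position, the order of stations in the result does not have to match the order in the station table.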