I m having a dataframe with dates from 2006 to 2016 and for each date 7 corresponding values. The data is like below:
H PS T RH TD WDIR WSP
date
2006-01-01 11:28:00 38 988.6 0.9 98.0 0.6 120.0 14.4
2006-01-01 11:28:00 46 987.6 0.5 91.0 -0.7 122.0 15.0
2006-01-01 11:28:00 57 986.3 0.5 89.0 -1.1 124.0 15.5
2006-01-01 11:28:00 66 985.1 0.5 90.0 -1.1 126.0 16.0
2006-01-01 11:28:00 74 984.1 0.4 90.0 -1.1 127.0 16.5
2006-01-01 11:28:00 81 983.3 0.4 90.0 -1.1 129.0 17.0
I want to select few columns for each year ( for example T and RH for all 2006) . So, for each year 2006 to 2016 select a bunch of columns then write each of the new dataframes in one file. I did the following:
df_H_T=(df[['RH','T']])
mask = (df_H_T['date'] >'2016-01-01 00:00:00') & (df_H_T['date'] <='2016-12-31 23:59:59')
df_H_T_2006 =df.loc[mask]
print(df_H_T_2006.head(20))
print(df_H_T_2006.tail(20))
But is not working because it seems it doesn't know what 'date' is but then when I print the head of the dataframe it seems date is there. What am I doing wrong ?
My second question is how can I put this in a loop over the year variable so that I don t write by hand each new dataframe and select one year at a time up to 2016 ? ( I m new and never used loops in python). Thanks, Ioana