I am supposed to calculate the monthly average temperatures for the entire dataset (i.e. separately for each year). My data contains daily temperature logs from 1952 to 2017:
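import pandas as pd

fp = "data/1091402.txt"
# Skip the second line of the file and treat -9999 as missing
data = pd.read_csv(fp, skiprows=[1], sep=r'\s+', na_values=['-9999'])
# Take the first six characters of DATE as the month key (YYYYMM if DATE is YYYYMMDD)
data['DATE_str'] = data['DATE'].astype(str)
data['DATE_month'] = data['DATE_str'].str.slice(start=0, stop=6)
data['DATE_month'] = data['DATE_month'].astype(int)
grouped_month = data.groupby('DATE_month')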
I think the expected number of months should be lower than 780 (65 years times 12 months), but I get 790 months (which cannot be right, because my data ends in April). The problem actually starts with the years: after slicing and grouping there are supposed to be 65, but I get 66. Where did I make a mistake?
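
As a sanity check on the counts above, here is a minimal sketch that counts the calendar months and years in the span and lists any month keys that fall outside it. It assumes the record runs from January 1952 through April 2017 (the start month is a guess, not something stated in the data), and `unexpected_keys` is a hypothetical helper; the toy series only stands in for `data['DATE_month']`.

```python
import pandas as pd

# Assumed span: January 1952 through April 2017 (the start month is an assumption).
expected_months = pd.period_range(start="1952-01", end="2017-04", freq="M")
expected_keys = {int(k) for k in expected_months.strftime("%Y%m")}

print(len(expected_months))            # calendar months in the span -> 784
print(expected_months.year.nunique())  # distinct calendar years in the span -> 66


def unexpected_keys(date_month: pd.Series) -> list:
    """Return the YYYYMM keys that fall outside the assumed span (hypothetical helper)."""
    observed = {int(k) for k in date_month.dropna().unique()}
    return sorted(observed - expected_keys)


# Toy data standing in for data['DATE_month']; 201713 is a deliberately invalid key.
toy = pd.Series([195201, 195201, 201704, 201713])
print(unexpected_keys(toy))            # -> [201713]
```

Under that assumption, a group count above 784 would point to keys that are not real months in the span, which `unexpected_keys(data['DATE_month'])` should list if the real column is passed in.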