I have three columns: date (1/1/2012 ... 1/31/2012), time (1:00, 2:00, ..., 24:00:00), and data. The data is hourly for the year 2012. First, I am trying to sort the data by month and by time. Second, I want to plot the data hourly and also as a daily average. For some reason the sort is not chronological: for example, the hours start at 10:00, and 1:00 comes after 19:00. I am also not sure how to calculate the monthly mean by looping through the date column. Here is my code; I would appreciate your help. The data can be accessed here: https://www.dropbox.com/s/n1pcagf8lwocwht/data.csv?dl=0

import pandas as pd
import matplotlib.pyplot as plt

data = 'data.csv'
# read the CSV file
df = pd.read_csv(data, encoding = '860')
# sort by date, then by hour (this is the step that does not come out chronological)
df = df.sort_values(by=['date','hour'], ascending=True)
print(list(df))  # column names
# plotting time series
plt.close()
plt.figure(figsize = [9.5,4])
plt.plot(df['data'], marker = 'o', markersize = 1)
#plt.xticks(df['date'])
plt.ylabel('data')
plt.xlabel('DATE')
plt.savefig('figure.png',dpi = 200)
pyPN
  • Please refer to this: https://stackoverflow.com/questions/28161356/sort-pandas-dataframe-by-date – Keerthana Manjunatha Feb 26 '19 at 00:33
  • Thanks! I could sort the date, but the real issue is with the `hour` column. I tried to use `df['hour'] = pd.to_datetime(df.hour, format='%H:%M')`, but when it reaches the hour `24:00:00`, the `format` cannot match the time. – pyPN Feb 26 '19 at 01:09
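
One possible way around the `24:00:00` problem is to skip `format='%H:%M'` entirely, read only the leading hour number, and add it to the parsed date as a timedelta. The sketch below is not tested against the linked file. It assumes the columns are named `date`, `hour`, and `data`, that dates are month/day/year, and that each hour value marks the end of its hour (so `24:00:00` is the last record of its day). The helper `add_timestamp` and the sample rows are made up for illustration.

```python
import pandas as pd

def add_timestamp(df, date_col="date", hour_col="hour"):
    """Return a copy of df indexed by a real timestamp, sorted chronologically.

    Hypothetical helper; column names and the hour-ending convention are assumptions.
    """
    out = df.copy()
    # Parse the calendar date; assumes month/day/year such as 1/31/2012.
    day = pd.to_datetime(out[date_col], format="%m/%d/%Y")
    # Keep only the leading hour number, so "1:00" and "24:00:00" both parse.
    hours = out[hour_col].astype(str).str.split(":").str[0].astype(int)
    # Label each record by the start of its hour (1 -> 00:00, 24 -> 23:00),
    # which keeps hours 1..24 inside the same calendar day.
    out["timestamp"] = day + pd.to_timedelta(hours - 1, unit="h")
    return out.sort_values("timestamp").set_index("timestamp")

# Made-up rows in deliberately scrambled order (not taken from data.csv).
sample = pd.DataFrame({
    "date": ["1/1/2012", "1/1/2012", "1/1/2012", "1/2/2012", "1/2/2012", "2/1/2012"],
    "hour": ["10:00", "1:00", "24:00:00", "2:00", "19:00", "1:00"],
    "data": [10.0, 1.0, 24.0, 2.0, 19.0, 5.0],
})

ts = add_timestamp(sample)

# Daily mean: group by the calendar day of each timestamp.
daily_mean = ts["data"].groupby(ts.index.normalize()).mean()
# Monthly mean: group by year-month period, no explicit loop over dates needed.
monthly_mean = ts["data"].groupby(ts.index.to_period("M")).mean()
```

With a real datetime index, `ts["data"]` is the hourly series in chronological order, and both it and `daily_mean` can be passed to `plt.plot` in place of `df['data']`. If the hour values instead mark the start of each hour, drop the `- 1` and note that `24:00:00` then rolls over to midnight of the next day.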

0 Answers