I have the following dataframe:

       YEARMODA  TEMP   MAX   MIN
0      19730701  74.5  90.0  53.6
1      19730702  74.5  88.9  57.9
2      19730703  81.7  95.0  63.0
3      19730704  85.0  95.0  65.8
4      19730705  85.0  97.9  63.9

How do I convert the YEARMODA column to a datetime-like type? I want to get the average and standard deviation of TEMP by year and by month. I know how to use groupby; working with YEARMODA is the only problem.

Chukas Ebuka
  • Sorry, what exactly is the problem? Do you want to convert `YEARMODA` into a datetime? – gold_cy Feb 15 '19 at 20:34
  • Yes. I want to convert YEARMODA into a datetime – Chukas Ebuka Feb 15 '19 at 20:34
  • Have you looked at: https://stackoverflow.com/questions/26763344/convert-pandas-column-to-datetime and you can get the formatting strings at https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior – ALollz Feb 15 '19 at 20:35
  • Yes, I have tried that and I kept getting "ValueError: time data 19730701 does not match format '%d%b%Y:%H:%M:%S.%f' (match)" – Chukas Ebuka Feb 15 '19 at 20:38

1 Answer

Here are two ways to solve this; take your pick. The first converts the whole column at once, the second applies the conversion to each value.

df['YEARMODA'] = pd.to_datetime(df['YEARMODA'], format='%Y%m%d')

    YEARMODA  TEMP   MAX   MIN
0 1973-07-01  74.5  90.0  53.6
1 1973-07-02  74.5  88.9  57.9
2 1973-07-03  81.7  95.0  63.0
3 1973-07-04  85.0  95.0  65.8
4 1973-07-05  85.0  97.9  63.9

--------------------------------------------------------------------

from functools import partial

p = partial(pd.to_datetime, format='%Y%m%d')

df['YEARMODA'] = df['YEARMODA'].apply(p)

    YEARMODA  TEMP   MAX   MIN
0 1973-07-01  74.5  90.0  53.6
1 1973-07-02  74.5  88.9  57.9
2 1973-07-03  81.7  95.0  63.0
3 1973-07-04  85.0  95.0  65.8
4 1973-07-05  85.0  97.9  63.9

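Edit: The error you are getting happens because the format string passed to pd.to_datetime does not match your data. Values like 19730701 are a four-digit year, a two-digit month and a two-digit day, so the format has to be '%Y%m%d'.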
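
To make the format point concrete, here is a minimal sketch that assumes YEARMODA holds integers like the ones in the question. It tries the format from the error message in the comments first, then the matching one. The three-row frame is only illustrative.

```python
import pandas as pd

# Sample values copied from the question; YEARMODA is stored as integers.
df = pd.DataFrame({"YEARMODA": [19730701, 19730702, 19730703]})

# A format that does not describe the data raises ValueError.
try:
    pd.to_datetime(df["YEARMODA"], format="%d%b%Y:%H:%M:%S.%f")
except ValueError as exc:
    print("wrong format:", exc)

# '%Y%m%d' reads a four-digit year, a two-digit month and a two-digit day.
dates = pd.to_datetime(df["YEARMODA"], format="%Y%m%d")
print(dates)
```

Once the column is datetime-like, the `.dt` accessor (`.dt.year`, `.dt.month`, `.dt.strftime`) is available for grouping.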

Edit 2: To get the standard deviation by month, group on the month name derived from the converted dates. Either of the following works.

df.groupby(df.YEARMODA.apply(p).dt.strftime('%B')).TEMP.std()


YEARMODA
July    5.321936
Name: TEMP, dtype: float64


df.assign(temp=pd.to_datetime(df['YEARMODA'], format='%Y%m%d') \
  .dt \
  .strftime('%B')) \
  .groupby('temp') \
  .TEMP \
  .std()

temp
July    5.321936
Name: TEMP, dtype: float64
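
Since the question asks for both the average and the standard deviation, by year and by month, the same idea can be sketched with `agg`. This assumes the five sample rows from the question and groups on `.dt.year` and `.dt.month` instead of month names. The variable names `by_year` and `by_month` are illustrative only.

```python
import pandas as pd

# The five sample rows from the question (only the columns needed here).
df = pd.DataFrame({
    "YEARMODA": [19730701, 19730702, 19730703, 19730704, 19730705],
    "TEMP": [74.5, 74.5, 81.7, 85.0, 85.0],
})
dates = pd.to_datetime(df["YEARMODA"], format="%Y%m%d")

# Mean and standard deviation of TEMP per calendar year.
by_year = df.groupby(dates.dt.year)["TEMP"].agg(["mean", "std"])

# Same statistics per month number (1-12), pooled across all years.
by_month = df.groupby(dates.dt.month)["TEMP"].agg(["mean", "std"])

print(by_year)
print(by_month)
```

Grouping on the numeric month has the side effect that the result is ordered January to December rather than alphabetically.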
gold_cy
  • How do I sort `monthStd = data.groupby(data['Date'].dt.strftime('%B'))['TEMP'].std()` by month of the year instead of alphabetically? – Chukas Ebuka Feb 15 '19 at 21:52
  • April 9.018825, August 5.630849, December 8.293874, February 8.593250, January 8.221079, July 5.530319, June 9.193297, March 8.056263, May 9.363602, November 8.559405, October 8.234723, September 8.065359 – Chukas Ebuka Feb 15 '19 at 23:04
  • It is in alphabetical order. I want it in calendar order, like January, February, March, April, and so on – Chukas Ebuka Feb 15 '19 at 23:06
  • that's for you to figure out since that is outside the scope of this question. I answered the question you asked here in completeness. – gold_cy Feb 15 '19 at 23:14
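
For the calendar-order follow-up raised in the comments, one possible sketch is to reindex the grouped result against the standard library's `calendar.month_name`. The `data` frame below is made up for illustration (it is not the asker's data), and it assumes an English locale so that `strftime('%B')` and `calendar.month_name` produce the same names.

```python
import calendar
import pandas as pd

# Hypothetical readings in three months, deliberately listed out of order.
data = pd.DataFrame({
    "Date": pd.to_datetime([
        "2019-04-01", "2019-04-03",
        "2019-01-01", "2019-01-02",
        "2019-03-05", "2019-03-06",
    ]),
    "TEMP": [60.0, 64.0, 30.0, 34.0, 45.0, 47.0],
})

# Grouping on month names sorts the result alphabetically.
month_std = data.groupby(data["Date"].dt.strftime("%B"))["TEMP"].std()

# calendar.month_name[1:] runs January..December; keep only the months present.
order = [m for m in calendar.month_name[1:] if m in month_std.index]
month_std = month_std.reindex(order)
print(month_std)
```

An alternative with the same effect is to group on `data['Date'].dt.month` and map the numbers to names afterwards.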