
I have a data set that consists only of Year, day-of-year (DOY) and Hour columns, and I would like to convert the day of the year into the month so that I can group by month.

Head of the Dataframe looks like this:

       Year  DOY  Hour
    0  2012    1     1
    1  2012    1     2
    2  2012    1     3
    3  2012    1     4
    4  2012    1     5

Tail looks like this:

          Year  DOY  Hour
    8779  2012  366    20
    8780  2012  366    21
    8781  2012  366    22
    8782  2012  366    23
    8783  2013    1     0

I tried groupby on DOY and Hour but that doesn't give me what I want. How can I do it in Pandas to get months out of this data?

Omid Nikrah
MKJ
  • Can you create a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve)? – jezrael Oct 08 '18 at 08:56
  • Yes, I added the first 5 rows of the data frame. – MKJ Oct 08 '18 at 09:10
  • @MKJ, No that's not a [mcve]. Don't use images or links, see [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) if you need help. – jpp Oct 08 '18 at 09:11
  • @MKJ What is the maximum value in `DOY` column? Is it 31 or 365? – Anna Iliukovich-Strakovskaia Oct 08 '18 at 09:15
  • I hope it is a better example now. `DOY` is the day of the year (up to 366 in the tail shown above), with 24 hourly rows for each day. The data frame now has 8784 rows. – MKJ Oct 08 '18 at 09:41

1 Answer

To add a new date column from which the month can be read (like 2012-10-22), you can try the following (with pandas imported as pd). Subtract 1 from DOY, because day 1 of the year is January 1 itself:

df['Date'] = pd.to_datetime(df.Year, format='%Y') + pd.to_timedelta(df.DOY - 1, unit='days')

Then you can group by Date and Hour.

Example input:

df = pd.DataFrame({'Year':[2012, 2012, 2012, 2012, 2013, 2013, 2013],
                  'DOY': [1, 40, 55, 110, 1, 77, 340]})

Example output:

0   2012-01-01
1   2012-02-09
2   2012-02-24
3   2012-04-19
4   2013-01-01
5   2013-03-18
6   2013-12-06
Name: Date, dtype: datetime64[ns]
  • That was it. I just needed to subtract 1 from df.DOY to get the correct output. – MKJ Oct 08 '18 at 11:44
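
For anyone who wants the month itself rather than a full date, here is a minimal sketch of the whole conversion. It assumes DOY is 1-based (so 1 means January 1). The sample rows, the Month column and the rows_per_month name are made up for illustration and are not from the question's data.

```python
import pandas as pd

# Hypothetical sample in the same shape as the question's data frame.
df = pd.DataFrame({
    "Year": [2012, 2012, 2012, 2012, 2013],
    "DOY":  [1, 1, 60, 366, 1],
    "Hour": [1, 2, 3, 23, 0],
})

# Start from January 1 of each year and add (DOY - 1) days,
# because DOY is assumed to be 1-based.
df["Date"] = (
    pd.to_datetime(df["Year"].astype(str), format="%Y")
    + pd.to_timedelta(df["DOY"] - 1, unit="D")
)

# Extract the calendar month (1-12) from the reconstructed date.
df["Month"] = df["Date"].dt.month

# Example grouping: number of hourly rows per year and month.
rows_per_month = df.groupby(["Year", "Month"]).size()
print(rows_per_month)
```

Grouping by both Year and Month keeps January 2012 separate from January 2013. Replace size() with whatever aggregation you need once the real data has value columns.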