Python Pandas Group by date using datetime data

Question

I have a column Date_Time that I wish to groupby date time without creating a new column. Is this possible the current code I have does not work.

df = pd.groupby(df,by=[df['Date_Time'].date()])

jezrael · Answer 1 · 2016-09-08T21:40:19.433

You can use groupby by dates of column Date_Time by dt.date:

df = df.groupby([df['Date_Time'].dt.date]).mean()

Sample:

df = pd.DataFrame({'Date_Time': pd.date_range('10/1/2001 10:00:00', periods=3, freq='10H'),
                   'B':[4,5,6]})

print (df)
   B           Date_Time
0  4 2001-10-01 10:00:00
1  5 2001-10-01 20:00:00
2  6 2001-10-02 06:00:00

print (df['Date_Time'].dt.date)
0    2001-10-01
1    2001-10-01
2    2001-10-02
Name: Date_Time, dtype: object

df = df.groupby([df['Date_Time'].dt.date])['B'].mean()
print(df)
Date_Time
2001-10-01    4.5
2001-10-02    6.0
Name: B, dtype: float64

Another solution with resample:

df = df.set_index('Date_Time').resample('D')['B'].mean()

print(df)
Date_Time
2001-10-01    4.5
2001-10-02    6.0
Freq: D, Name: B, dtype: float64

score 86 · Accepted Answer · edited Oct 23 '22 at 02:00

86

`resample`

df.resample('D', on='Date_Time').mean()

              B
Date_Time      
2001-10-01  4.5
2001-10-02  6.0

`Grouper`

As suggested by @JosephCottam

df.set_index('Date_Time').groupby(pd.Grouper(freq='D')).mean()

              B
Date_Time      
2001-10-01  4.5
2001-10-02  6.0

Deprecated uses of `TimeGrouper`

You can set the index to be 'Date_Time' and use pd.TimeGrouper

df.set_index('Date_Time').groupby(pd.TimeGrouper('D')).mean().dropna()

              B
Date_Time      
2001-10-01  4.5
2001-10-02  6.0

edited Oct 23 '22 at 02:00

wjandrea

28,235
9
60
81

answered Sep 08 '16 at 21:19

piRSquared

285,575
57
475
624

This is great! How do i prevent it from adding dates that there are no data for? For example if i had data for days 9/1,9/2,and 9/4 it still has 9/3 in there with NaN values. – GoBlue_MathMan Sep 09 '16 at 17:10
1

@GoBlue_MathMan Use `.dropna()` – piRSquared Sep 09 '16 at 17:11
Here, when grouping by 'hour', it adds hours that did not exist in the source file with zero values. – k.ko3n Jan 30 '19 at 16:53
You can avoid `.set_index('Date_Time')` by doing `pd.Grouper(key='Date_Time', freq='D')`. Could be useful if the index is significant. – wjandrea Oct 23 '22 at 02:22

score 5 · Answer 3 · edited Oct 23 '22 at 02:17

5

df.groupby(pd.Grouper(key='Date_Time', axis=0, freq='M')).sum()

M for month
Y for year
D for day

edited Oct 23 '22 at 02:17

wjandrea

28,235
9
60
81

answered Apr 18 '22 at 11:48

Avijit Das

63
1
5

1

Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Apr 18 '22 at 12:51

Python Pandas Group by date using datetime data

3 Answers3

`resample`

`Grouper`

Deprecated uses of `TimeGrouper`

Linked

Related

Python Pandas Group by date using datetime data

3 Answers3

resample

Grouper

Deprecated uses of TimeGrouper

Linked

Related

`resample`

`Grouper`

Deprecated uses of `TimeGrouper`