0

I want to group my dataframe so that the rows with the same hour from timestamp column (which has data like 2019-01-01 00:00:00.134721167,50,100 where 50 is the cost, and 100 is percentage) have their cost summed and averaged, as well as percentage.

Or, to be more specific, i need to have 48 rows for 2 days of information, one for each hour, while now i have more than 500 rows. How do I do that?

Vishnudev Krishnadas
  • 10,679
  • 2
  • 23
  • 55
saturnbook
  • 15
  • 4
  • 2
    Welcome to SO! Please take a moment to read about how to post pandas questions: http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – YOLO Jan 05 '20 at 18:48
  • Does this answer your question? [Apply multiple functions to multiple groupby columns](https://stackoverflow.com/questions/14529838/apply-multiple-functions-to-multiple-groupby-columns) – Vishnudev Krishnadas Jan 05 '20 at 19:03

1 Answers1

1

Here's a way to do:

# sample data
df = pd.DataFrame({'date': pd.date_range("2019-01-01", freq='H', periods = 10),
                  'cost': pd.np.random.randint(10, 100, 10)})

Method 1:

df.set_index('date').resample('H').sum()

Method 2:

df.groupby(pd.Grouper(key='date', freq='H'))['cost'].sum().reset_index()
YOLO
  • 20,181
  • 5
  • 20
  • 40
  • Actually, I have edited my question and I have 3-4 columns that I need to find average for. I've tried your method and it doesn't work for more than 1 column I suppose (for example, I have columns: date cost percentage value, and 2 days of data measured every 1 min; I need to get the average cost, percentage and value for every hour of the same day, meaning I will have (for 2 days) 48 rows containing each hour and average of given numbers in each hour). – saturnbook Jan 05 '20 at 22:26