I'm trying to group/aggregate my taxi trip data by year, month, day, hour (hourly granularity) and pick-up location ID,
so that my output data has rows like 2014 04 01 1 123 375.
That row would represent 375 taxi trips on 1 April 2014 at 1am from pick-up location 123.
My input dataframe is:
PULocationID day month year hour
153 1 1 2014 1
122 3 12 2012 13
153 1 1 2014 1
122 3 12 2012 13
I would like these rows to be grouped so the result looks like the below, with a new Taxi_Trips
column holding the number of trips in each group:
PULocationID day month year hour Taxi_Trips
153 1 1 2014 1 2
122 3 12 2012 13 2
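
For reference, here is a minimal sketch of one way the desired output could be produced, assuming the dataframe is a pandas DataFrame (the question does not say which library is in use, so this is an assumption). The variable names `df`, `keys` and `trips` are made up for illustration; the sample rows are the four input rows shown above.

```python
import pandas as pd

# Sample input copied from the question (assumed to be a pandas DataFrame).
df = pd.DataFrame({
    "PULocationID": [153, 122, 153, 122],
    "day": [1, 3, 1, 3],
    "month": [1, 12, 1, 12],
    "year": [2014, 2012, 2014, 2012],
    "hour": [1, 13, 1, 13],
})

# Columns that together identify one (location, date, hour) bucket.
keys = ["PULocationID", "day", "month", "year", "hour"]

# Count the rows in each bucket; sort=False keeps groups in order of first appearance.
trips = df.groupby(keys, sort=False).size().reset_index(name="Taxi_Trips")

print(trips)
```

Here `size()` counts the rows in each group, and `reset_index(name="Taxi_Trips")` turns the group keys back into ordinary columns while naming the count column. If the data lives in a different framework (Spark, for example), the same group-and-count idea should apply but the calls would differ.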