I'm currently working with an interesting transport smart card dataset. Each row in the data represents a trip (e.g. a bus trip from A to B). Any trips within 60 minutes of each other need to be grouped into a journey.
The current table:
  CustomerID  SegmentID Origin Dest StartTime  EndTime  Fare    Type
0       A001        101      A    B    7:30am   7:45am   1.5     Bus
1       A001        102      B    C    7:50am   8:30am   3.5   Train
2       A001        103      C    B   17:10pm  18:00pm   3.5   Train
3       A001        104      B    A   18:10pm  18:30pm   1.5     Bus
4       A002        105      K    Y   11:30am  12:30pm   3.0   Train
5       A003        106      P    O   10:23am  11:13am   4.0  Ferrie
and I'd like to convert it into something like:
  CustomerID  JourneyID Origin Dest StartTime  EndTime  Fare        Type  NumTrips
0       A001          1      A    C    7:30am   8:30am     5  Intermodal         2
1       A001          2      C    A   17:10pm  18:30pm     5  Intermodal         2
2       A002          6      K    Y   11:30am  12:30pm     3       Train         1
3       A003          8      P    O   10:23am  11:13am     4      Ferrie         1
I'm new to Python and Pandas and have no idea how to start, so any guidance would be appreciated.
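To make the grouping rule concrete, here is a minimal sketch of one way it could be read. It rests on several assumptions that are not stated in the data: a trip joins the previous journey when it starts no more than 60 minutes after the same customer's previous trip ended, all trips fall on a single day, the times have been rewritten as 24-hour strings, and a journey that mixes modes is labelled "Intermodal". The JourneyID here is just a running counter, so it will not reproduce the IDs (1, 2, 6, 8) shown in the desired output.

```python
import pandas as pd

# Sample rows copied from the table above. Times are rewritten as 24-hour
# strings (an assumption, since values like "17:10pm" mix both conventions).
trips = pd.DataFrame({
    "CustomerID": ["A001", "A001", "A001", "A001", "A002", "A003"],
    "SegmentID": [101, 102, 103, 104, 105, 106],
    "Origin": ["A", "B", "C", "B", "K", "P"],
    "Dest": ["B", "C", "B", "A", "Y", "O"],
    "StartTime": ["07:30", "07:50", "17:10", "18:10", "11:30", "10:23"],
    "EndTime": ["07:45", "08:30", "18:00", "18:30", "12:30", "11:13"],
    "Fare": [1.5, 3.5, 3.5, 1.5, 3.0, 4.0],
    "Type": ["Bus", "Train", "Train", "Bus", "Train", "Ferrie"],
})

# Parse the clock times. With no date in the data, every trip lands on the
# same default day, which is enough for computing gaps within a day.
for col in ["StartTime", "EndTime"]:
    trips[col] = pd.to_datetime(trips[col], format="%H:%M")

trips = trips.sort_values(["CustomerID", "StartTime"]).reset_index(drop=True)

# Gap between each trip's start and the same customer's previous trip's end.
prev_end = trips.groupby("CustomerID")["EndTime"].shift()
gap = trips["StartTime"] - prev_end

# A new journey starts on a customer's first trip (gap is NaT) or when the
# gap exceeds 60 minutes. The cumulative sum turns those flags into IDs.
new_journey = gap.isna() | (gap > pd.Timedelta(minutes=60))
trips["JourneyID"] = new_journey.cumsum()

# Collapse each journey into one row.
journeys = trips.groupby(["CustomerID", "JourneyID"], as_index=False).agg(
    Origin=("Origin", "first"),
    Dest=("Dest", "last"),
    StartTime=("StartTime", "first"),
    EndTime=("EndTime", "last"),
    Fare=("Fare", "sum"),
    # Assumed labelling: one mode keeps its name, mixed modes become "Intermodal".
    Type=("Type", lambda s: s.iloc[0] if s.nunique() == 1 else "Intermodal"),
    NumTrips=("SegmentID", "count"),
)

print(journeys)
```

If the real data spans several days, the timestamps would need to include the date, otherwise a late trip on one day and an early trip on the next could be compared as if they happened on the same day.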