I have two pandas dataframes, like so:
Table A:
Country | Item | Start Date |
---|---|---|
France | Croissant | 2020-03-01 |
England | Scone | 2020-04-03 |
France | Croissant | 2020-05-01 |
England | Scone | 2020-06-03 |
Table B:
Country | Item | End Date |
---|---|---|
France | Croissant | 2020-05-02 |
England | Scone | 2020-06-04 |
France | Croissant | 2020-03-02 |
England | Scone | 2020-04-05 |
I'd like to join these two tables together so that for each country-item pairing and unique start date, the subsequent end date is appended. Like so:
Country | Item | Start Date | End Date |
---|---|---|---|
France | Croissant | 2020-03-01 | 2020-03-02 |
England | Scone | 2020-04-03 | 2020-04-04 |
France | Croissant | 2020-05-01 | 2020-05-02 |
England | Scone | 2020-06-03 | 2020-06-04 |
Right now, I'm doing the following:
- Left join of the end date data onto the start date data
- Group by the start date and collecting the end dates into a sorted list
- Finding the date within this list that is the closest subsequent date after the start date
I feel like there's got to be a better way. Any suggestions?