I have a pandas df with a few million rows of Item/Groups called items
:
Item Group
0001 A
0002 A
0003 B
0004 A
...
And I need to prepare items
to be merged with a df called forecast
which looks like this:
Item Group Month Forecast
0001 A 5 15
0001 A 6 16
0001 A 7 13
0002 A 5 60
0002 A 7 65
My merge will look something like:
items.merge(forecast, on=['Item', 'Group', 'Month']
So my problem is preparing items
to look something like this:
Item Group Month
0001 A 5
0001 A 6
0001 A 7
0002 A 5
0002 A 6
0002 A 7
...
So that for a given set of months (e.g 5, 6, 7
), there are len(months)
rows for each Item/Group combination, regardless of whether or not there's forecast
data for that Item/Group/Month in forecast
.
The order needs to be preserved (e.g. the three months for a given Item/Row need to be grouped together, rather than copies just appended to the end, and the items need to stay in order).
So far I've been looking into stack() as a possible option, but I haven't gotten anything to work (save for looping through the df, appending row * 2, to a new empty df as I go, which seems massively inefficient).
What's the right way to do this?