I would like to join three dataframes of the following structure:
import pandas as pd

january_df = pd.DataFrame({
    'January': [4, 4, 3, 2, 1, 1],
    'Product_no': ['B1', 'B2', 'S1', 'S2', 'B3', 'T1'],
    'Label': ['Ball', 'Bikini', 'Shoe', 'Shirt', 'Bag', 'Towel'],
    'ID': [1000, 1001, 1002, 1003, 1004, 1005],
})

february_df = pd.DataFrame({
    'February': [4, 3, 3, 2, 1, 1],
    'Product_no': ['S1', 'B2', 'B1', 'T1', 'S2', 'B3'],
    'Label': ['Shoe', 'Bikini', 'Ball', 'Towel', 'Shirt', 'Bag'],
    'ID': [1002, 1001, 1000, 1005, 1003, 1004],
})

march_df = pd.DataFrame({
    'March': [5, 1, 1, 1, 1, 1],
    'Product_no': ['T1', 'E1', 'S1', 'B3', 'L1', 'B1'],
    'Label': ['Towel', 'Earring', 'Shoe', 'Bag', 'Lotion', 'Ball'],
    'ID': [1005, 1006, 1002, 1004, 1007, 1000],
})
The desired output for March should be:
    January  February  March  Product_no  Label    ID
-------------------------------------------------------
01        1         2      5  T1          Towel    1005
02        0         0      1  E1          Earring  1006
03        3         4      1  S1          Shoe     1002
04        1         1      1  B3          Bag      1004
05        0         0      1  L1          Lotion   1007
06        4         3      1  B1          Ball     1000
As a first step I tried to merge March and February:
all_df = pd.merge(march_df, february_df, on="ID")
but this does not give the expected result for the two months. I tried to follow the answers to "Performant cartesian product (CROSS JOIN) with pandas" and "pandas three-way joining multiple dataframes on columns", but they did not get me any further.
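For reference, here is a minimal sketch of what that merge returns, using only the March and February frames from above. `pd.merge` defaults to an inner join, so only IDs present in both months survive, and because `Product_no` and `Label` are not part of the key they come back twice with `_x`/`_y` suffixes.

```python
import pandas as pd

march_df = pd.DataFrame({
    'March': [5, 1, 1, 1, 1, 1],
    'Product_no': ['T1', 'E1', 'S1', 'B3', 'L1', 'B1'],
    'Label': ['Towel', 'Earring', 'Shoe', 'Bag', 'Lotion', 'Ball'],
    'ID': [1005, 1006, 1002, 1004, 1007, 1000],
})
february_df = pd.DataFrame({
    'February': [4, 3, 3, 2, 1, 1],
    'Product_no': ['S1', 'B2', 'B1', 'T1', 'S2', 'B3'],
    'Label': ['Shoe', 'Bikini', 'Ball', 'Towel', 'Shirt', 'Bag'],
    'ID': [1002, 1001, 1000, 1005, 1003, 1004],
})

# Default how="inner": rows whose ID is missing in either frame are dropped.
all_df = pd.merge(march_df, february_df, on="ID")

# Non-key columns shared by both frames are duplicated with suffixes.
print(all_df.columns.tolist())

# Only 4 of the 6 March rows remain: E1 (1006) and L1 (1007) have no
# February counterpart.
print(len(all_df))
```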
In R it can be achieved as a "piped multiple join":
threeMonths <- February %>%
  right_join(March) %>%
  left_join(January)
which I cannot seem to translate into Python.
How do I get the desired output?
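One possible translation of the R pipe, offered as a sketch rather than a definitive answer: start from `march_df` and left-join the other two months on all three shared columns, which mirrors dplyr joining on every common column when no `by` is given. The names `keys`, `month_cols` and `three_months` are illustrative, and filling unmatched months with 0 is an assumption taken from the desired output above.

```python
import pandas as pd

january_df = pd.DataFrame({
    'January': [4, 4, 3, 2, 1, 1],
    'Product_no': ['B1', 'B2', 'S1', 'S2', 'B3', 'T1'],
    'Label': ['Ball', 'Bikini', 'Shoe', 'Shirt', 'Bag', 'Towel'],
    'ID': [1000, 1001, 1002, 1003, 1004, 1005],
})
february_df = pd.DataFrame({
    'February': [4, 3, 3, 2, 1, 1],
    'Product_no': ['S1', 'B2', 'B1', 'T1', 'S2', 'B3'],
    'Label': ['Shoe', 'Bikini', 'Ball', 'Towel', 'Shirt', 'Bag'],
    'ID': [1002, 1001, 1000, 1005, 1003, 1004],
})
march_df = pd.DataFrame({
    'March': [5, 1, 1, 1, 1, 1],
    'Product_no': ['T1', 'E1', 'S1', 'B3', 'L1', 'B1'],
    'Label': ['Towel', 'Earring', 'Shoe', 'Bag', 'Lotion', 'Ball'],
    'ID': [1005, 1006, 1002, 1004, 1007, 1000],
})

# Join on every column the frames share, so none of them is duplicated.
keys = ['Product_no', 'Label', 'ID']
month_cols = ['January', 'February', 'March']

# how="left" keeps every March row and only pulls in matching
# February and January values; unmatched months become NaN.
three_months = (
    march_df
    .merge(february_df, on=keys, how='left')
    .merge(january_df, on=keys, how='left')
)

# Assumption: a product that was absent in a month counts as 0.
three_months = (
    three_months
    .fillna({'January': 0, 'February': 0})
    .astype({'January': int, 'February': int})
)

# Reorder the columns to match the desired layout.
three_months = three_months[month_cols + keys]
print(three_months)
```

Because a left merge preserves the key order of the left frame, the rows come out in the same order as in `march_df`. The same chain extends to further months by appending another `.merge(..., on=keys, how='left')`.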