0

I could not find a solution to this problem because the length of the indexes of the two dataframe is not the same so it prevents me from using all the pd.merge, join, etc..

Here is a toy model of what I try to do:

data1= pd.DataFrame({'Supplier' : ['001', '001', '001', '001', '002', '002', '002', '003', '003', '003'],'Quantity' : [200,20,50,40,210,47,66,53,100,60]})

data2 = pd.DataFrame({'Supplier': ['001', '002', '003'], 'lead_time' : [7,7,7]})

and I need to get:

data3 = pd.DataFrame({'Supplier' : ['001', '001', '001', '001', '002', '002', '002', '003', '003', '003'],
                      'Quantity' : [200,20,50,40,210,47,66,53,100,60],
                      'lead_time' : [7,7,7,7,7,7,7,7,7,7]})

following Matching Pandas DataFrame Column Values with another DataFrame Column, I tried doing:

data1.join(data2
             .set_index("lead_time")
             .loc[:, data1["lead_time"]],
             on="lead_time")

without success, and I am out of ideas, any help on this issue would be much appreciated.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Murcielago
  • 905
  • 1
  • 8
  • 30

1 Answers1

1

I think your problem is that you are setting the wrong index. If you run the next line you get the result you want.

d4 = data1.join(data2.set_index("Supplier"), on="Supplier")

Output:

    Supplier    Quantity    lead_time
0     001       200         7
1     001       20          7
2     001       50          7
3     001       40          7
4     002       210         7
5     002       47          7
6     002       66          7
7     003       53          7
8     003       100         7
9     003       60          7
sergiomahi
  • 964
  • 2
  • 8
  • 21