I have two datasets I need to combine in pandas. One includes information for a bunch of invoices and two includes supplier information. I've included a simplified example below.
Basically, I want to combine the supplier data with the invoice data on supplier ID so that each invoice has the home country of the supplier from the supplier info data.
I've tried merging and joining in different ways (inner, left, outer) and it did not repeat based on duplicate supplier IDs like I needed it to. I'm having trouble figuring out how to repeat observations. Example, Company A may appear in 2+ invoices so we need country to populate on each, but with merging it only will populate on one. The rest will be NaN. Can anyone help?
Data example:
d1 = {'company id': [11, 12, 11, 11, 13, 12, 14],
'company': ["Company A", "Company B", "Company A", "Company A", "Company C", "Company B", "Company D"],
'USD': [120000, 20000, 4000000, 99000, 8000, 110000, 443000], 'Invoice no': [1, 2, 3, 4, 5, 6, 7]}
invoices = pd.DataFrame(data=d1)
d2 = {'company id': [11, 12, 13, 14],
'company': ["Company A", "Company B", "Company C", "Company D"],
'Country': ["USA", "USA", "AUS", "MEX"]}
supplier = pd.DataFrame(data=d2)