0

I have two DataFrames of different sizes.

df1 = pd.DataFrame()
df2 = pd.DataFrame()
df1["id"] = ["A", "A", "B", "B", "C", "C"]
df1["revenue"] = [1, 2, 10, 9, 11, 14]
df2["id"] = ["A", "B", "C"]
df2["revenue"] = [12, 1, 5]

enter image description here enter image description here

I want to create a new column in df1 titled revenue_year_1 and fill it with the revenue values in df2 conditional on the id values being equal. So in df1 in both lines where the id is 'A' revenue_year_1 would be 12, where the id is 'B' the revenue_year_1 would be 1 and where the id is C the revenue_year_1 would be 5. In practice I'm doing this on a larger data set where I would not be able to do this by hand.

Desired outcome:

enter image description here

Drake
  • 25
  • 1
  • 4

2 Answers2

2

You could also merge on "id":

df1 = df1.merge(df2, on='id', suffixes=('','_year_1'))

Output:

  id  revenue  revenue_year_1
0  A        1              12
1  A        2              12
2  B       10               1
3  B        9               1
4  C       11               5
5  C       14               5
1

You can use pd.Series.map, first create a mapping series using set_index on df2:

df1['revenue_year_1'] = df1['id'].map(df2.set_index('id')['revenue'])
print(df1)

Output:

  id  revenue  revenue_year_1
0  A        1              12
1  A        2              12
2  B       10               1
3  B        9               1
4  C       11               5
5  C       14               5
Scott Boston
  • 147,308
  • 15
  • 139
  • 187