0

I have two different Dataframes, a long dataframe (df1) and a short dataframe (df2). Both data frames contain the columns A and B:

Data1 = {'A': [2,2,2,1,2,1,1,2], 'B': [1,2,1,2,1,1,1,2]}
Data2 ={'A': [1,1,2,2],'B': [1,2,1,2],'X': [9,5,7,3]}

df1 = pd.DataFrame(Data1)
df2 = pd.DataFrame(Data2)

print(df1)
print(df2)

   A  B
0  2  1
1  2  2
2  2  1
3  1  2
4  2  1
5  1  1
6  1  1
7  2  2
   A  B  X
0  1  1  9
1  1  2  5
2  2  1  7
3  2  2  3

The df2 data frame contains a new column called X that contains some values.

I need to create a third data frame (df3), that includes a new ‘X’ column in the data frame df1. This X column should show the values assigned in df2 depending on A and B column values. This should be the result:

Df3:
   A  B  X
0  2  1  7
1  2  2  3
2  2  1  7
3  1  2  5
4  2  1  7
5  1  1  9
6  1  1  9
7  2  2  3

I have tried different ways of merging the dataframes without success. Any help would be greatly appreciated.

Martingale
  • 511
  • 1
  • 6
  • 15

1 Answers1

1

Just merge the two on A and B. I've added how='left' to join on the index of df1 (if this is removed, it still works but returns the new df sorted differently).

df1.merge(df2, on=['A', 'B'], how='left')

Output:

   A  B  X
0  2  1  7
1  2  2  3
2  2  1  7
3  1  2  5
4  2  1  7
5  1  1  9
6  1  1  9
7  2  2  3
Emi OB
  • 2,814
  • 3
  • 13
  • 29