0

I have two data frames as follows: I want to map values in df1['Data1'] to df2['Data1'] and df2['Data2']. I used following method but it is lengthier. Is there any alternate way of doing this in pandas

df1 = pd.read_excel("df1.xlsx")
df2 = pd.read_excel("df2.xlsx"

df1

Data1   Data2   Score
ABC      AB1    1
AB1      ABC    4
AB2      AB2    6
ABC      ABD    0.7
GDH      ABD    0.9
KMN      KSF    0.5
KSF      KSF    6

df2

Data1
AB1
AB2
ABC
ABD

mapped=pd.merge(df1, df2, left_on='Data1', right_on='Data1')
mappedx = pd.merge(df1, df2, left_on='Data2', right_on='Data1')
mappedx.rename(columns = {'Data1_x':'Data1'}, inplace = True)
mappedx = mappedx[['Data1','Data2','Score']]
frame = [mapped, mappedx]
result = pd.concat(frame)
result = result.drop_duplicates()

result

Data1   Data2   Score
ABC      AB1    1
AB1      ABC    4
AB2      AB2    6
ABC      ABD    0.7
GDH      ABD    0.9

1 Answers1

0

Use Series.isin for both columns chained by | for bitwise OR:

df = df1[df1['Data1'].isin(df2['Data1']) | df1['Data2'].isin(df2['Data1'])]
print (df)
  Data1 Data2  Score
0   ABC   AB1    1.0
1   AB1   ABC    4.0
2   AB2   AB2    6.0
3   ABC   ABD    0.7
4   GDH   ABD    0.9

Or use DataFrame.isin with DataFrame.any:

df = df1[df1[['Data1','Data2']].isin(df2['Data1'].tolist()).any(axis=1)]
print (df)
  Data1 Data2  Score
0   ABC   AB1    1.0
1   AB1   ABC    4.0
2   AB2   AB2    6.0
3   ABC   ABD    0.7
4   GDH   ABD    0.9
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252