I am searching something the equivalent of a SQL merge using where "t1.A = t2.A OR t1.B = t2.A" OR t1.C = t2.A. I have two data frames say D1 with A, B, C, D, E as columns and D2, where few records of D2 can be pulled by A column of D1, few are from its alias B, C, D and E columns.
I tried as below but it was giving me the wrong output.
sample = D1.merge(D2,left_on=[ 'A' or'B' or'C'or 'D' or E],
right_on=['A'], how='left')
Then I tried
sample = pd.concat([D1.merge(D2,left_on='A', right_on= 'A', how='left'),
D1.merge(D2,left_on='B', right_on='A', how='left'), D1.merge(D2,
left_on='C',right_on='A', how='left'),D1.merge(D2,left_on='D',
right_on='A', how='left'),D1.merge(D2,left_on='E', right_on='A',
how='left')])
This is giving me a lot of duplicates I tried to remove duplicate but unfortunately, it didn't work out.
dupes = (sample['A'] == sample['B']) == (sample['C'] == sample['D']) ==
sample['E']
sample=sample.loc[~dupes]
ValueError: The truth value of a Series is ambiguous. Use a.empty,
a.bool(), a.item(), a.any() or a.all().
I need the output or 'sample' records to be same as records of data frame D1.