Merging DF with OR condition three common columns

Question

Read this older post updated recently How to merge dataframes based on an "OR" condition. In that example there were 2 possible columns to merge on, in this case there are three A, B and C, with C being a string.

The two df could match on any combination of A, B OR C. And each produces unique merge records on their own. Using the above example as a guide without luck.

Appreciate any help!

DF2

A	B	C	D
123	321	abc	company1
	456	rss	company2
321	432	xyz	company3
		klm	company4

DF1

A	B	C	F	G
123		abc	active	650
	456		active	754
321	432		active	820
		klm	active	840

Working Code

suff_A = ['_on_A_match_1', '_on_A_match_2', '_on_A_Match_3']
suff_B = ['_on_B_match_1', '_on_B_match_2', '_on_B_match_3']
suff_C = ['_on_C_match_1', '_on_C_match_2', '_on_C_match_3']

Get an error message on this piece too many values to unpack

DF3 = pd.concat([DF1.merge(DF2, on='A', suffixes=suff_A),
DF1.merge(DF2, on='B', suffixes=suff_B),
DF1.merge(DF2, on='C', suffixes=suff_C)])

duplicates = (DF3.A_on_B_match_1 == DF3.A_on_B_match_2 == DF3.A_on_C_match_3,
DF3.B_on_A_match_1 == DF3.B_on_A_match_2 == DF3.B_on_C_match_3,
kbd>DF3.C_on_A_match_1 == DF3.C_on_B_match_2)

DF3.loc[~duplicates]

Expected Outcome

A	B	C	D	F	G
123	321	abc	company1	active	650
	456	rss	company2	active	754
321	432	xyz	company3	active	820
		klm	company4	active	840

Show the full traceback of the error as properly formatted text (formatted as code) in the question. — Michael Butscher, Aug 07 '23 at 20:44

score 0 · Accepted Answer · answered Aug 07 '23 at 21:23

DF3 = pd.concat([
    DF1.merge(DF2, on='A'),
    DF1.merge(DF2, on='B'),
    DF1.merge(DF2, on='C')
])
print(DF3.drop_duplicates().reset_index(drop=True))

prints

      A  B_x  C_x       F    G  B_y  C_y         D  A_x    B  A_y    C
0   123       abc  active  650  321  abc  company1  NaN  NaN  NaN  NaN
1        456       active  754  456  rss  company2  NaN  NaN  NaN  NaN
2        456       active  754       klm  company4  NaN  NaN  NaN  NaN
3             klm  active  840  456  rss  company2  NaN  NaN  NaN  NaN
4             klm  active  840       klm  company4  NaN  NaN  NaN  NaN
5   321  432       active  820  432  xyz  company3  NaN  NaN  NaN  NaN
6   NaN  NaN  abc  active  650  NaN  klm  company4  123            NaN
7   NaN  NaN  klm  active  840  NaN  klm  company4                 NaN
8   NaN  NaN       active  754  NaN  rss  company2       456       NaN
9   NaN  NaN       active  820  NaN  xyz  company3  321  432  321  NaN
10  NaN       NaN  active  650  321  NaN  company1  123  NaN  123  abc
11  NaN       NaN  active  840       NaN  company4       NaN       klm

Merging DF with OR condition three common columns

1 Answers1