I have two dataframes, df1
:
group value
g1 A
g1 B
g1 C
g1 D
g2 B
g2 C
g2 E
g3 A
g3 D
g3 E
g4 B
g4 D
...
and df2
:
value1 value2
A B
A C
B C
B D
...
I would like to merge df1
and df2
by merging columns value
with both value1
and value2
, namely:
group value1 value2
g1 A B
g1 A C
g1 B C
g2 B C
g1 B D
g4 B D
...
As shown, group
illustrates two columns value1
and value2
only if the two values simultaneously corresponds to the same group in df1
.
df = df1.merge(df2, left_on='value', right_on='value1', how='inner')
df = df.merge(df2, left_on='value', right_on='value2', how='inner')
df = df.drop(['value'], axis=1)
The problem is, my actual dataframes df1
and df2
are quite large and every time I try to run the code, the session crashes (Not to mention, not quite sure if my code is correct to begin with).
Any idea how to optimize this process (and to confirm whether my approach is correct)?