Pandas: Add one column of a data frame to a second data frame without merge

Asked Nov 08 '18 at 21:52

Active Nov 08 '18 at 21:52


My main issue is that I get a memory error every time I try to merge two of my data frames like this:

result = df1.merge(df2[['col1','col2','col3']], on=['col1','col2'], how='left')

So I need another way to add col3 to df1 (without getting a memory error).

I found solutions using map(), but the examples always use a single column as the mapping key:

df1['col3'] = df1['col1'].map(df2.set_index('col1')['col3'])

In my case, however, a row is identified by the combination of two columns (col1 and col2), not by a single column.

My questions:


MaMo

There are some methods here: https://stackoverflow.com/a/53215754/3279716 – Alex Nov 08 '18 at 21:55
@Alex - not really. My point is that I have two columns per data frame as the key (col1 and col2), not one. The post you suggested has one key column per data frame, each with a different name. – MaMo Nov 08 '18 at 22:02
To map, you'll need to turn `col1` and `col2` into a tuple, and use those same tuples as keys for your dictionary. – ALollz Nov 08 '18 at 22:32
Something like: `df1['tup'] = [tuple(x) for x in df1[['col1', 'col2']].values]`, and your dictionary as `dict((tuple(x[0:2]), x[2]) for x in df2[['col1', 'col2', 'col3']].values)` – ALollz Nov 08 '18 at 22:37
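
A minimal sketch of the tuple-key lookup described in the comments above, assuming small toy frames in place of the real `df1` and `df2` (the sample values and the name `lookup` are made up for illustration). It pairs the columns with `zip` instead of going through `.values`, and does a plain dictionary lookup per row. Whether this actually avoids the memory error depends on the size of the data, so treat it as something to try rather than a guaranteed fix.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the frames in the question.
df1 = pd.DataFrame({"col1": [1, 1, 2, 3], "col2": ["a", "b", "a", "c"]})
df2 = pd.DataFrame(
    {"col1": [1, 2, 1], "col2": ["a", "a", "b"], "col3": [10.0, 20.0, 30.0]}
)

# Dictionary keyed by (col1, col2) tuples, with col3 as the value.
# If a key pair occurs more than once in df2, the last occurrence wins.
lookup = dict(zip(zip(df2["col1"], df2["col2"]), df2["col3"]))

# Look up every (col1, col2) pair of df1; pairs missing from df2 become NaN,
# which mirrors what how='left' would give for unmatched rows.
df1["col3"] = [
    lookup.get(key, np.nan) for key in zip(df1["col1"], df1["col2"])
]

print(df1)
```

One difference from the merge: a left merge duplicates rows of `df1` when `df2` contains the same key pair several times, while the dictionary keeps only one value per key, so `df1` keeps its original length.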

0 Answers