pandas iterrows() two data frame

Question

I'm doing something that I know that I shouldn't be doing. I'm doing a for loop within a for loop (it sounds even more horrible, as I write it down.) Basically, what I want to do, theoretically, using two dataframes is something like this:

for index, row in df_2.iterrows():
    for index_1, row_1 in df_1.iterrows():
        if row['column_1'] == row_1['column_1'] and row['column_2'] == row_1['column_2'] and row['column_3'] == row_1['column_2']:
            row['column_4'] = row_1['column_4']

There has got to be a (better) way to do something like this. Please help!

You can do a inner join on those columns, then you get a new df with all rows you want. Iterate over than if you like. — greedy52, Feb 28 '17 at 20:06
Another complication is that I may need to do some fuzzy string matching (the columns have typos). In that case, inner join is not an option for me. — kfp_ny, Feb 28 '17 at 20:41

score 1 · Accepted Answer · edited May 23 '17 at 12:09

1

As pointed out by @Andy Hayden in is it possible to do fuzzy match merge with python pandas?, you can use difflib : get_closest_matches function to create new join columns.

import difflib
df_2['fuzzy_column_1'] = df_2['column_1'].apply(lambda x: difflib.get_close_matches(x, df_1['column_1'])[0])
# Do same for all other columns

Now you can apply inner join using pandas merge function.

result_df = df_1.merge(df_2,left_on=['column_1', 'column_2','column_3'], and right_on=['fuzzy_column_1','fuzzy_column_2','fuzzy_column_3] )

You can use drop function to remove unwanted columns.

edited May 23 '17 at 12:09

Community

1
1

answered Mar 01 '17 at 01:53

Spandan Brahmbhatt

3,774
6
24
36

Thanks! I was thinking about using `fuzzwuzz` module, but this seems better. – kfp_ny Mar 03 '17 at 01:00
Sorry for the late response. I think it is working, but `get_closest_matches` forces a choice, and sometimes that isn't appropriate in my case. So I'm back to using `fuzzywuzzy`. – kfp_ny Mar 14 '17 at 18:40

pandas iterrows() two data frame

1 Answers1