Assume that your 3 DataFrames have the following content:
df1:
       Aa    Bb
0  123.15  12.6
1  137.53  28.3

df2:
  col_1 col_2
0    Aa    Cc
1    Bb    Dd
2    Bb    Ee

df3:
  col_1 col_2
0    Cc    Gg
1    Dd    Hh
2    Ee    Jj
3    Ff    Kk
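To follow along, the sample data can be rebuilt with a few lines. This is only a sketch: the values are copied from the tables above, and the dtypes are whatever Pandas infers from them.

```python
import pandas as pd

# Sample data copied from the tables above.
df1 = pd.DataFrame({'Aa': [123.15, 137.53], 'Bb': [12.6, 28.3]})
df2 = pd.DataFrame({'col_1': ['Aa', 'Bb', 'Bb'],
                    'col_2': ['Cc', 'Dd', 'Ee']})
df3 = pd.DataFrame({'col_1': ['Cc', 'Dd', 'Ee', 'Ff'],
                    'col_2': ['Gg', 'Hh', 'Jj', 'Kk']})

# Iterating over a DataFrame yields its column names,
# which is what "for i in df1" relies on.
print(list(df1))  # ['Aa', 'Bb']
```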
On the first iteration of your loop, i holds the name of the first column
of df1, i.e. 'Aa'.
When you execute x = df2.loc[df2['col_1'] == i, 'col_2'], the result is
a Series:
0    Cc
Name: col_2, dtype: object
Now even the comparison df3['col_1'] == x, executed on its own, raises
your error.
Note that in this case both df3['col_1'] and x are of Series type.
When two Series are compared with the == operator:
- Pandas first checks that both Series carry identical index labels
  (unlike arithmetic operators, == does not align them for you),
- and only then compares each pair of corresponding elements.
But in this case:
- df3['col_1'] contains indices 0 thru 3,
- the index in x contains only one position - 0.
The two indexes are not identical, which causes this exception.
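The exception can be reproduced in isolation. The sketch below uses stand-ins for df3['col_1'] and x with the same index labels as described above; the names col and message are only for illustration.

```python
import pandas as pd

# Stand-in for df3['col_1']: index labels 0..3.
col = pd.Series(['Cc', 'Dd', 'Ee', 'Ff'], name='col_1')
# Stand-in for x on the first iteration: a single index label, 0.
x = pd.Series(['Cc'], index=[0], name='col_2')

message = None
try:
    col == x  # the two indexes differ, so == raises
except ValueError as exc:
    message = str(exc)
    print(f'ValueError: {message}')
```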
To cope with this issue, change the offending line to:
y = df3.loc[df3['col_1'].isin(x), 'col_2']
Now Pandas operates just as you intended:
- iterates over df3['col_1'],
- for the current element checks whether its value is among values
present in x,
- if it is, value from col_2 in the current row is added to
the result.
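A small sketch of what isin does here, assuming stand-in data shaped like the second iteration. The index of x is deliberately set to labels that do not occur in col, to show that isin looks only at the values of x and ignores its index. The names col and mask are only for illustration.

```python
import pandas as pd

col = pd.Series(['Cc', 'Dd', 'Ee', 'Ff'])       # index 0..3
x = pd.Series(['Dd', 'Ee'], index=[10, 11])     # unrelated index labels

# isin tests membership by value, so the differing index does not matter.
mask = col.isin(x)
print(mask.tolist())  # [False, True, True, False]
```

The boolean mask has the same index as col, so it can be passed straight to df3.loc to pick the matching rows.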
To demonstrate how this code works, complete it with some printouts:
for i in df1:
    print(f'\ni: {i}')
    x = df2.loc[df2['col_1'] == i, 'col_2']
    print(f'\nx:\n{x}')
    y = df3.loc[df3['col_1'].isin(x), 'col_2']
    print(f'\ny:\n{y}')
When you run the above code on my data, it prints:

i: Aa

x:
0    Cc
Name: col_2, dtype: object

y:
0    Gg
Name: col_2, dtype: object

i: Bb

x:
1    Dd
2    Ee
Name: col_2, dtype: object

y:
1    Hh
2    Jj
Name: col_2, dtype: object
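If you need the values afterwards rather than just printed, one option is to collect them per column name. This is only a sketch on the sample data above; the dictionary name results is hypothetical and not part of your original code.

```python
import pandas as pd

# Sample data as shown at the top of this answer.
df1 = pd.DataFrame({'Aa': [123.15, 137.53], 'Bb': [12.6, 28.3]})
df2 = pd.DataFrame({'col_1': ['Aa', 'Bb', 'Bb'],
                    'col_2': ['Cc', 'Dd', 'Ee']})
df3 = pd.DataFrame({'col_1': ['Cc', 'Dd', 'Ee', 'Ff'],
                    'col_2': ['Gg', 'Hh', 'Jj', 'Kk']})

results = {}  # hypothetical container: column name -> matched col_2 values
for i in df1:
    x = df2.loc[df2['col_1'] == i, 'col_2']
    y = df3.loc[df3['col_1'].isin(x), 'col_2']
    results[i] = y.tolist()

print(results)  # {'Aa': ['Gg'], 'Bb': ['Hh', 'Jj']}
```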