I have three DataFrames. My code raises a ValueError:
raise ValueError("Can only compare identically-labeled Series objects")

ValueError: Can only compare identically-labeled Series objects.

Here is my code:

df1
df2
df3
for i in df1:                                  # i iterates over df1
    x = df2.loc[df2['col_1'] == i, 'col_2']    # look up i in col_1 of df2 and take the corresponding col_2 value as x
    y = df3.loc[df3['col_1'] == x, 'col_2']    # look up x in col_1 of df3 and take the corresponding col_2 value as y

The first statement in the for loop runs correctly, but the second statement raises the ValueError.

derloopkat

1 Answer

Assume that your 3 DataFrames have the following content:

df1:                df2:              df3:
       Aa    Bb       col_1 col_2       col_1 col_2
0  123.15  12.6     0    Aa    Cc     0    Cc    Gg
1  137.53  28.3     1    Bb    Dd     1    Dd    Hh
                    2    Bb    Ee     2    Ee    Jj
                                      3    Ff    Kk
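
For anyone who wants to follow along, here is a minimal sketch that builds the three sample DataFrames shown above. The values are just the illustrative ones from this answer, not the asker's real data.

```python
import pandas as pd

# Sample data matching the tables above (illustrative values only).
df1 = pd.DataFrame({'Aa': [123.15, 137.53], 'Bb': [12.6, 28.3]})
df2 = pd.DataFrame({'col_1': ['Aa', 'Bb', 'Bb'],
                    'col_2': ['Cc', 'Dd', 'Ee']})
df3 = pd.DataFrame({'col_1': ['Cc', 'Dd', 'Ee', 'Ff'],
                    'col_2': ['Gg', 'Hh', 'Jj', 'Kk']})

# Iterating over a DataFrame yields its column names, not its rows.
column_names = list(df1)
print(column_names)
```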

Iterating over a DataFrame yields its column names, so on the first iteration of your loop i holds the name of the first column in df1, i.e. 'Aa'.

When you execute x = df2.loc[df2['col_1'] == i, 'col_2'], the result is a Series:

0    Cc
Name: col_2, dtype: object

If you now try to evaluate just df3['col_1'] == x on its own, the error already occurs.

Note that both df3['col_1'] and x are of Series type here. Unlike arithmetic operators, which first align their operands on the index, the == operator between two Series requires both to carry identical index labels, and raises this exception otherwise.

But in this case:

  • df3['col_1'] contains indices 0 thru 3,
  • the index in x contains only one position - 0.

So the two indices differ, which causes this exception.
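
The index mismatch described above can be reproduced in isolation. This is a small sketch using two hand-built Series (the names col and x are stand-ins for df3['col_1'] and the result of the first lookup):

```python
import pandas as pd

col = pd.Series(['Cc', 'Dd', 'Ee', 'Ff'])   # index 0..3, like df3['col_1']
x = pd.Series(['Cc'])                       # index 0 only, like the first lookup result

try:
    col == x                # Series == Series with different indices
    raised = False
except ValueError as exc:
    raised = True
    message = str(exc)
    print(message)
```

Comparing col against a scalar (for example col == 'Cc') works fine, because no second index is involved.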

To cope with this issue, change the offending line to:

y = df3.loc[df3['col_1'].isin(x), 'col_2']

Now Pandas operates just as you intended:

  • iterates over df3['col_1'],
  • for the current element checks whether its value is among values present in x,
  • if it is, value from col_2 in the current row is added to the result.
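
To see why isin sidesteps the problem, here is a short sketch with hand-built data (same illustrative values as in this answer). isin only looks at the values of x, so the fact that x has a different index does not matter:

```python
import pandas as pd

df3 = pd.DataFrame({'col_1': ['Cc', 'Dd', 'Ee', 'Ff'],
                    'col_2': ['Gg', 'Hh', 'Jj', 'Kk']})
# x as it looks on the second loop iteration: index 1 and 2, values Dd and Ee.
x = pd.Series(['Dd', 'Ee'], index=[1, 2], name='col_2')

mask = df3['col_1'].isin(x)     # boolean Series with the same index as df3
y = df3.loc[mask, 'col_2']      # keep col_2 only for the matching rows
print(mask.tolist())
print(y.tolist())
```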

To demonstrate how this code works, add some printouts to it:

for i in df1:
    print(f'\ni: {i}')
    x = df2.loc[df2['col_1'] == i, 'col_2']
    print(f'\nx:\n{x}')
    y = df3.loc[df3['col_1'].isin(x), 'col_2']
    print(f'\ny:\n{y}')

When you run the above code on my sample data, the result is:

i: Aa

x:
0    Cc
Name: col_2, dtype: object

y:
0    Gg
Name: col_2, dtype: object

i: Bb

x:
1    Dd
2    Ee
Name: col_2, dtype: object

y:
1    Hh
2    Jj
Name: col_2, dtype: object
Valdi_Bo