ValueError: Can only compare identically-labeled Series objects python

Question

df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY']]

When I execute the above command, I get the following error:

ValueError: Can only compare identically-labeled Series objects

What am I doing wrong?

The dtypes of both columns are int64.

The indexes of df1 and df2 were both reset before performing the action – Sumukh, Jun 27 '18 at 16:25

score 6 · Answer 1 · edited Sep 24 '18 at 06:41

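Pandas performs almost all of its operations with intrinsic data alignment, meaning it uses index labels to match values when comparing them or performing other operations. Comparison operators between two Series require both to carry identical labels, which is why this error is raised.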
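A minimal sketch of what alignment means in practice, using two made-up Series (s1 and s2 are illustrative names, not from the question). Arithmetic matches values by label, while a comparison between differently labeled Series raises the error from the question:

```python
import pandas as pd

# Hypothetical data: the labels overlap only on "b" and "c".
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Arithmetic aligns on labels; labels present on only one side become NaN.
total = s1 + s2
print(total)  # b -> 12.0, c -> 23.0, a and d -> NaN

# Comparison operators do not align; they raise when the labels differ.
try:
    s1 != s2
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```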

You can avoid this error by converting one of the Series to a NumPy array using .values:

df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY'].values]

However, this compares the two columns row by row, by position, with no index alignment, so both columns must have the same number of rows.

MCVE:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=['A'])

df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=['B'])

df1['A'] != df2['B']

Output:

ValueError: Can only compare identically-labeled Series objects

Convert one side to a NumPy array:

df1['A'] != df2['B'].values

Output:

1    True
2    True
3    True
4    True
5    True
6    True
7    True
8    True
9    True
Name: A, dtype: bool
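The positional comparison above only makes sense when both frames have the same length and row order. If the actual goal is to keep the rows of df1 whose key appears nowhere in df2 (an assumption, since the question does not say), a membership test with isin may be a better fit. A sketch with made-up data; only the column name CUST_ACCT_KEY comes from the question:

```python
import pandas as pd

# Hypothetical sample data with different lengths and different indexes.
df1 = pd.DataFrame({"CUST_ACCT_KEY": [101, 102, 103, 104]})
df2 = pd.DataFrame({"CUST_ACCT_KEY": [104, 102]}, index=[7, 8])

# isin tests membership by value, so neither index nor length has to match.
mask = ~df1["CUST_ACCT_KEY"].isin(df2["CUST_ACCT_KEY"])
only_in_df1 = df1.loc[mask]
print(only_in_df1)
```

Here only_in_df1 holds the rows with keys 101 and 103, the two values absent from df2.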

edited Sep 24 '18 at 06:41

Dadep

answered Jun 27 '18 at 16:32

Scott Boston

The indexes of both df1 and df2 were reset before I performed the task. I also tried numpy.where, but it yielded the same result – Sumukh Jun 27 '18 at 16:40
Can you post some data along with the code in the question? – Scott Boston Jun 27 '18 at 17:01
I made use of my unique indexes to filter the data. Didn't need to reset the data - df.drop(df_subset.index) – Sumukh Jun 27 '18 at 17:03
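A rough sketch of the approach described in the last comment, assuming df has a unique index and df_subset is a selection of its rows. The data and the name remaining are made up for illustration:

```python
import pandas as pd

# Hypothetical frame with a unique index; values are illustrative only.
df = pd.DataFrame(
    {"CUST_ACCT_KEY": [101, 102, 103, 104]},
    index=[10, 11, 12, 13],
)

# Hypothetical subset of rows to exclude, selected by index label.
df_subset = df.loc[[11, 13]]

# Drop the subset's labels from the original frame; no comparison needed.
remaining = df.drop(df_subset.index)
print(remaining)
```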
