df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY']]

When I execute the above command, I get the following error:

ValueError: Can only compare identically-labeled Series objects

What am I doing wrong?

The dtype of both columns is int64.

Ivan
Sumukh

1 Answer

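Pandas performs almost all of its operations with intrinsic data alignment, meaning it uses index labels to match up values before operating on them. Comparison operators such as != require both Series to carry identical index labels, which is why you get this error.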
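To make the alignment concrete, here is a minimal sketch with two made-up Series (`s1` and `s2` are illustrative names, not from the question). It shows arithmetic aligning on index labels, while `!=` refuses to compare Series whose labels differ.

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=[0, 1, 2])
s2 = pd.Series([10, 20, 30], index=[1, 2, 3])

# Arithmetic aligns on labels: only labels 1 and 2 exist in both Series,
# so labels 0 and 3 come out as NaN in the result.
total = s1 + s2

# Comparison operators do not align; differing labels raise ValueError.
try:
    s1 != s2
    raised = False
except ValueError:
    raised = True
```
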

You can avoid this error by converting one of the Series to a NumPy array using .values:

df = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY'].values]

However, you are then comparing row to row by position, with no index alignment, so both columns must have the same length.
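A sketch of what "row to row" means in practice, using small hypothetical frames in place of the asker's data: with `.values`, the first row of `df1` is compared with the first row of `df2`, and so on, whatever the index labels are. If the goal is instead to keep rows of `df1` whose key appears nowhere in `df2` (an assumption about the intent, which the question does not state), `Series.isin` is one possible alternative.

```python
import pandas as pd

# Hypothetical data; only the column name comes from the question.
df1 = pd.DataFrame({'CUST_ACCT_KEY': [1, 2, 3]}, index=[0, 1, 2])
df2 = pd.DataFrame({'CUST_ACCT_KEY': [3, 2, 1]}, index=[10, 11, 12])

# Positional comparison: 1 vs 3, 2 vs 2, 3 vs 1.
positional = df1.loc[df1['CUST_ACCT_KEY'] != df2['CUST_ACCT_KEY'].values]

# Membership test: keep keys from df1 that never occur anywhere in df2.
not_in_df2 = df1.loc[~df1['CUST_ACCT_KEY'].isin(df2['CUST_ACCT_KEY'])]
```

Here `positional` keeps the rows with keys 1 and 3, while `not_in_df2` is empty because every key in `df1` occurs somewhere in `df2`. The two approaches answer different questions, so which one fits depends on what the filter is meant to do.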

MCVE:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=['A'])

df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=['B'])

df1['A'] != df2['B']

Output:

ValueError: Can only compare identically-labeled Series objects

Change to numpy array:

df1['A'] != df2['B'].values

Output:

1    True
2    True
3    True
4    True
5    True
6    True
7    True
8    True
9    True
Name: A, dtype: bool
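As a side note, recent pandas documentation recommends `Series.to_numpy()` over `.values` for getting the underlying array. A sketch of the same comparison written that way, rebuilding the MCVE frames so the snippet runs on its own:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(1, 10), index=np.arange(1, 10), columns=['A'])
df2 = pd.DataFrame(np.arange(11, 20), index=np.arange(11, 20), columns=['B'])

# to_numpy() drops the index, so the comparison is purely positional
# and the result keeps the index and name of df1['A'].
result = df1['A'] != df2['B'].to_numpy()
```
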
Dadep
Scott Boston