
I have one dataframe.

Dataframe :

  Symbol1  BB Symbol2  CC
0     ABC   1     ABC   1
1     PQR   1     PQR   1
2     CPC   2     CPC   0
3     CPC   2     CPC   1
4     CPC   2     CPC   2

I want to compare Symbol1 with Symbol2 and BB with CC, and keep only the rows where both pairs are the same. All other rows should be removed from the dataframe.

Expected Result :

  Symbol1  BB Symbol2  CC
0     ABC   1     ABC   1
1     PQR   1     PQR   1
2     CPC   2     CPC   2

To filter on a single column against a fixed value, I'm using:

df = df[df['BB'] == '2'].copy()

This works fine. However, when I try to combine two column comparisons:

df = df[df['BB'] == df['offset'] and df['Symbol1'] == df['Symbol2']].copy()

it gives me an error.

Error :

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I compare the columns and get the expected result?

ketan
    Please show your error, but you should use `&` (with parentheses) instead of `and`. – IanS Sep 19 '16 at 12:49

2 Answers


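You can use boolean indexing and combine the conditions with `&` instead of `and`. Python's `and` tries to reduce each Series to a single True/False, which is what raises the ValueError, while `&` compares element by element. Each comparison needs its own parentheses because `&` binds more tightly than `==`: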

print((df.Symbol1 == df.Symbol2) & (df.BB == df.CC))
0     True
1     True
2    False
3    False
4     True
dtype: bool

print(df[(df.Symbol1 == df.Symbol2) & (df.BB == df.CC)])
  Symbol1  BB Symbol2  CC
0     ABC   1     ABC   1
1     PQR   1     PQR   1
4     CPC   2     CPC   2
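
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the dataframe from the question with integer BB and CC columns; the names `mask` and `result` are only illustrative. It also adds `reset_index(drop=True)`, which is not part of the answer above, in case you want the index renumbered 0, 1, 2 as in the question's expected result:

```python
import pandas as pd

# Sample data rebuilt from the question (BB and CC assumed to be integers).
df = pd.DataFrame({
    "Symbol1": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "BB": [1, 1, 2, 2, 2],
    "Symbol2": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "CC": [1, 1, 0, 1, 2],
})

# Element-wise comparison; each condition is wrapped in parentheses
# because & binds more tightly than ==.
mask = (df["Symbol1"] == df["Symbol2"]) & (df["BB"] == df["CC"])

# Keep the matching rows and renumber the index from 0.
result = df[mask].reset_index(drop=True)
print(result)
```

If you would rather keep the original row labels (0, 1, 4), just drop the `reset_index` call.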
jezrael

Here is an alternative that reads a bit more nicely, though it is also a bit slower:

In [65]: df.query('Symbol1 == Symbol2 and BB == CC')
Out[65]:
  Symbol1  BB Symbol2  CC
0     ABC   1     ABC   1
1     PQR   1     PQR   1
4     CPC   2     CPC   2
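
As a rough check that the two approaches agree, the sketch below rebuilds the question's dataframe (again assuming integer BB and CC) and compares the `query` result with the boolean-mask result. The names `by_query` and `by_mask` are just illustrative. Note that inside the `query` string the word `and` is fine, because pandas parses the expression itself rather than handing it to Python's `and`:

```python
import pandas as pd

# Sample data rebuilt from the question (BB and CC assumed to be integers).
df = pd.DataFrame({
    "Symbol1": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "BB": [1, 1, 2, 2, 2],
    "Symbol2": ["ABC", "PQR", "CPC", "CPC", "CPC"],
    "CC": [1, 1, 0, 1, 2],
})

# String expression evaluated by pandas; column names are used directly.
by_query = df.query("Symbol1 == Symbol2 and BB == CC")

# Equivalent boolean-indexing form for comparison.
by_mask = df[(df["Symbol1"] == df["Symbol2"]) & (df["BB"] == df["CC"])]

# Prints whether both approaches selected exactly the same rows.
print(by_query.equals(by_mask))
```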
MaxU - stand with Ukraine