I am trying to compare the DataFrame column id with a string value, the contract number.

JSON file:

[
    {
        "id": "ee123-abc"
    }
]

Code:

import pandas as pd

# `data` is the list parsed from the JSON file above
df = pd.DataFrame.from_dict(data)
contract_num = 'ee123-abc'

if (df.id == contract_num).any():
    print("Matching")
else:
    print("Not matching")

I tried converting the column with df.c_w_id.astype(str), but the if condition still goes to "Not matching". Please advise how to compare a DataFrame column with a string value.

vvazza
  • Can you add sample data to your question? – mmdanziger Sep 06 '22 at 16:47
  • please provide fully reproducible python objects, it will be impossible to debug your code otherwise – mozway Sep 06 '22 at 16:48
  • See [How to make good pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) for help on how to create a [mcve] so that we can better understand your question – G. Anderson Sep 06 '22 at 16:51
  • Thanks for the suggestion. I have updated the question with sample code. – vvazza Sep 06 '22 at 16:52
  • @G.Anderson incorrect, Series == scalar will broadcast and produce a Series – mozway Sep 06 '22 at 16:55
  • @vvazza your code already works. I get 'Matching' – mozway Sep 06 '22 at 16:56
  • I also get a matching value, but you can also try `df.iloc[0,0]` to pull the exact string. – Stu Sztukowski Sep 06 '22 at 16:56
  • I tried comparing them, it resulted in False ```>>> df.id == contract_num 0 False Name: id, dtype: bool``` – vvazza Sep 06 '22 at 16:57
  • @G.Anderson - The code actually failed with the error message -```ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()``` – vvazza Sep 06 '22 at 17:01
  • In the sample you provided, the result is a match because, as @mozway very correctly pointed out, the contract number produces a true when it is encountered in the first row and `any()` is therefore true. However, if the actual df values don't match the string exactly (e.g., leading or trailing whitespace, it's part of a larger string, etc) then you would need `df['id'].str.contains(contract_num)` instead – G. Anderson Sep 06 '22 at 17:02
  • Thank you ! ```df.id.str.contains(contract_num).any()``` did work. But in the next step, I am trying to fetch the matching row using ```df[df.id == contract_num]``` which results in empty value. – vvazza Sep 06 '22 at 17:08

1 Answer

So you have the following:

import pandas as pd

data = [
    {
        "id": "ee123-abc"
    }
]

df = pd.DataFrame(data)

contract_num = 'ee123-abc'

print(df)
>>>           id
>>> 0  ee123-abc

This is the condition you need:

print(not df[df['id'] == contract_num].empty)
>>> True

Therefore:

if not df[df['id'] == contract_num].empty:
    print("Matching")
else:
    print("Not matching")
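
If your real data still prints "Not matching", the stored values may differ slightly from the string, as suggested in the comments under the question (for example, surrounding whitespace). Below is a minimal sketch of that case. The sample rows, including the trailing space in the first id, are made up for illustration and are not taken from your file:

```python
import pandas as pd

# Hypothetical data: the first id carries a trailing space, which is one
# possible reason an exact comparison fails on real data.
df = pd.DataFrame([{"id": "ee123-abc "}, {"id": "zz999-xyz"}])
contract_num = "ee123-abc"

# tolist() prints the raw strings with quotes, so stray whitespace is visible.
print(df["id"].tolist())

# Exact comparison: no row matches because of the trailing space.
exact_mask = df["id"] == contract_num

# Substring match; regex=False treats contract_num as a literal string.
contains_mask = df["id"].str.contains(contract_num, regex=False)

# Strip surrounding whitespace first, then compare exactly.
stripped_mask = df["id"].astype(str).str.strip() == contract_num

# Use the mask that matches to fetch the rows, not the exact comparison.
matching_rows = df[stripped_mask]
print(matching_rows)
```

This would also explain why `str.contains(...).any()` can be true while `df[df.id == contract_num]` comes back empty: the two expressions use different masks. Whether whitespace is the actual cause in your data is an assumption, so check the raw values first.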

I hope it helps :)

Janikas