
I have a dataframe with a column named 'Values'.

The code below creates the dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Person_id': [1, 2, 3, 4, 5],
                   'Values': [np.nan, np.nan, '1.Yes', '2.No', np.nan],
                   'Ethnicity': ['1.Chinese', '2.Indian', '3.Malay', np.nan, np.nan]})

Running the code above produces a five-row dataframe in which only rows 2 and 3 have a non-null 'Values' entry ('1.Yes' and '2.No').

This is only sample data; the dataframe is part of a larger program.

From the above dataframe, I would like to check, using a regex, whether the 'Values' column of a specific row contains 'Yes' or 'No'.

For example, I would like to know whether df['Values'][2] contains the keyword 'Yes'.

To do that, I wrote the code below, but it does not produce the expected output:

df['Values'] = df['Values'].astype(str) 
df['Values'][2].contains('Yes|No',regex=True)

Despite trying several variations of the above code and searching SO, I have not been able to resolve this. I get the following error:

AttributeError: 'str' object has no attribute 'contains'

How can I check whether 'Yes' or 'No' is present in a specific cell of a dataframe column?

Please note that this is part of a larger program in which I use a for loop and indices, so I need to perform the check at the cell level and get the output there. Using df.isin will not help.

The Great

1 Answer

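Use str.contains('Yes|No', regex=True) through the .str accessor of the column (a Series). A single cell such as df['Values'][2] is a plain Python str, which has no contains method; that is why the AttributeError is raised.

Example: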

import pandas as pd
import numpy as np

df = pd.DataFrame({'Person_id': [1, 2, 3, 4, 5],
                   'Values': [np.nan, np.nan, '1.Yes', '2.No', np.nan],
                   'Ethnicity': ['1.Chinese', '2.Indian', '3.Malay', np.nan, np.nan]})

print(df["Values"].str.contains('Yes|No',regex=True))

Output:

0     NaN
1     NaN
2    True
3    True
4     NaN
Name: Values, dtype: object
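
Since the question asks for a check on one cell inside a loop, a minimal sketch of that variant may help. It assumes the column is left as-is (no astype(str)), so missing cells are NaN floats rather than strings. The helper name cell_has_yes_no is hypothetical and only for illustration; re.search and the na parameter of str.contains are standard.

```python
import re

import numpy as np
import pandas as pd

df = pd.DataFrame({'Person_id': [1, 2, 3, 4, 5],
                   'Values': [np.nan, np.nan, '1.Yes', '2.No', np.nan],
                   'Ethnicity': ['1.Chinese', '2.Indian', '3.Malay', np.nan, np.nan]})


def cell_has_yes_no(value):
    """Hypothetical helper: True if one cell is a string containing 'Yes' or 'No'."""
    # NaN cells are floats, not strings, so guard before applying the regex.
    return isinstance(value, str) and re.search(r'Yes|No', value) is not None


# Cell-level check inside a loop over the row labels.
for idx in df.index:
    print(idx, cell_has_yes_no(df.at[idx, 'Values']))

# Series-level alternative: na=False reports missing cells as False instead of NaN.
mask = df['Values'].str.contains('Yes|No', regex=True, na=False)
print(mask)
```

With this guard, a missing cell is simply reported as not matching; if missing values should be treated differently, the isinstance check is the place to change that.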
Rakesh