I have a dataframe like so:

id      familyHistoryDiabetes
0       YES - Father Diabetic
1       NO FAMILY HISTORY OF DIABETES
2       Yes-Mother & Father have Type 2 Diabetes
3       NO FAMILY HISTORY OF DIABETES

I would like to replace the column values with a simple 'yes' if the string contains 'yes' and 'no' if the string contains 'no'.

To do this I ran the following code:

df['familyHistoryDiabetes'] = df['familyHistoryDiabetes'].apply(lambda x: 'Yes' if 'Yes' in x else 'No')

After running this I realised that this would miss cases where 'yes' was all uppercase:

id      familyHistoryDiabetes
0       No
1       No
2       Yes
3       No

So I want to run similar code but to ignore case of 'yes' when searching for it.

To do this I tried a solution like the one mentioned here using casefold() like so:

df['familyHistoryDiabetes'] = df['familyHistoryDiabetes'].apply(lambda x: 'Yes' if 'YES'.casefold() in map(str.casefold, x) else 'No')

But this did not work as it resulted in my dataframe becoming:

id      familyHistoryDiabetes
0       No
1       No
2       No
3       No

I can imagine this is an easy fix but I am out of ideas!

Thanks.

sums22

3 Answers

Try np.where together with str.contains and case=False:

import numpy as np

# case=False makes the substring test case-insensitive
df['new'] = np.where(df['familyHistoryDiabetes'].str.contains('Yes', case=False),
                     'Yes',
                     'No')
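
As for why the casefold() attempt in the question gave 'No' for every row: iterating over a string yields single characters, so map(str.casefold, x) never produces the three-letter string 'yes'. A minimal sketch in plain Python, using one value from the question (the helper name to_yes_no is made up for illustration):

```python
value = "YES - Father Diabetic"

# Iterating a string yields single characters ('y', 'e', 's', ' ', ...),
# so the three-letter string 'yes' is never an element of this list.
chars = list(map(str.casefold, value))
found_in_chars = "yes" in chars

# Casefolding the whole string first gives an ordinary substring test.
found_in_text = "yes" in value.casefold()


def to_yes_no(text):
    """Hypothetical helper: 'Yes' if text contains 'yes' in any case, else 'No'."""
    return "Yes" if "yes" in text.casefold() else "No"


labels = [to_yes_no(v) for v in (value, "NO FAMILY HISTORY OF DIABETES")]
```

If you prefer to stay with apply, something like df['familyHistoryDiabetes'].apply(to_yes_no) should behave the same way as the np.where version, assuming the column has no missing values.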
BENY

With str.extract:

df["familyHistoryDiabetes"] = df["familyHistoryDiabetes"].str.lower().str.extract("(yes|no)", expand=False)

>>> df
   id familyHistoryDiabetes
0   0                   yes
1   1                    no
2   2                   yes
3   3                    no
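
One caveat, sketched below under assumptions: a bare "(yes|no)" pattern also matches "no" inside longer words, and rows with no match come back as NaN. The "Unknown" row here is hypothetical and not part of the question's data; the word-boundary pattern is one possible tightening, not something the original answer uses.

```python
import pandas as pd

s = pd.Series([
    "YES - Father Diabetic",
    "NO FAMILY HISTORY OF DIABETES",
    "Yes-Mother & Father have Type 2 Diabetes",
    "NO FAMILY HISTORY OF DIABETES",
    "Unknown",  # hypothetical extra row, not in the question's data
])

lowered = s.str.lower()

# Without word boundaries, the "no" inside "unknown" is picked up.
loose = lowered.str.extract(r"(yes|no)", expand=False)

# \b restricts the match to whole words; rows with no match become NaN.
strict = lowered.str.extract(r"\b(yes|no)\b", expand=False)
```

Whether a NaN or a default such as 'no' is the right result for unmatched rows depends on your data, so that choice is left open here.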
not_speshal

You can use str.extract with the re.IGNORECASE flag:

import re

# the extracted text keeps the casing found in the source string
df['new'] = df.familyHistoryDiabetes.str.extract('(Yes)', flags=re.IGNORECASE, expand=False).fillna('No')

Output:

   id                     familyHistoryDiabetes  new
0   0                     YES - Father Diabetic  YES
1   1             NO FAMILY HISTORY OF DIABETES   No
2   2  Yes-Mother & Father have Type 2 Diabetes  Yes
3   3             NO FAMILY HISTORY OF DIABETES   No
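
Note that the extracted value keeps whatever casing was in the source ('YES' vs 'Yes'). If you want uniform labels, one option is to normalise the match before filling the gaps. A rough sketch on the question's sample data; the str.capitalize() step is an addition for illustration, not part of the answer above:

```python
import re

import pandas as pd

df = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "familyHistoryDiabetes": [
        "YES - Father Diabetic",
        "NO FAMILY HISTORY OF DIABETES",
        "Yes-Mother & Father have Type 2 Diabetes",
        "NO FAMILY HISTORY OF DIABETES",
    ],
})

# Case-insensitive extract; unmatched rows are NaN, matches keep source casing.
matched = df["familyHistoryDiabetes"].str.extract(
    "(yes)", flags=re.IGNORECASE, expand=False
)

# Normalise 'YES' / 'yes' / 'Yes' to 'Yes', then label the unmatched rows 'No'.
df["new"] = matched.str.capitalize().fillna("No")
```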
Quang Hoang