I have a dataframe like so:
id familyHistoryDiabetes
0 YES - Father Diabetic
1 NO FAMILY HISTORY OF DIABETES
2 Yes-Mother & Father have Type 2 Diabetes
3 NO FAMILY HISTORY OF DIABETES
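To make this reproducible, here is a minimal sketch that rebuilds the dataframe above. It assumes `id` is an ordinary column rather than the index, and that every value is a plain string:

```python
import pandas as pd

# Rebuild the example dataframe from the question.
# Assumption: 'id' is a regular column, not the index.
df = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "familyHistoryDiabetes": [
        "YES - Father Diabetic",
        "NO FAMILY HISTORY OF DIABETES",
        "Yes-Mother & Father have Type 2 Diabetes",
        "NO FAMILY HISTORY OF DIABETES",
    ],
})
print(df)
```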
I would like to replace each value in the column with a simple 'Yes' if the string contains 'yes' (in any case) and 'No' otherwise.
To do this I ran the following code:
df['familyHistoryDiabetes'] = df['familyHistoryDiabetes'].apply(lambda x: 'Yes' if 'Yes' in x else 'No')
After running this I realised that it misses cases where 'YES' is all uppercase, so row 0 is wrongly set to 'No':
id familyHistoryDiabetes
0 No
1 No
2 Yes
3 No
So I want to run similar code but ignore the case of 'yes' when searching for it.
To do this I tried a solution like the one mentioned here, using casefold(), like so:
df['familyHistoryDiabetes'] = df['familyHistoryDiabetes'].apply(lambda x: 'Yes' if 'YES'.casefold() in map(str.casefold, x) else 'No')
But this did not work, as every row in my dataframe became 'No':
id familyHistoryDiabetes
0 No
1 No
2 No
3 No
I can imagine this is an easy fix but I am out of ideas!
Thanks.
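For reference, a sketch of what may be going wrong in the casefold() attempt, plus one possible fix. It assumes every value in the column is a string (no NaN), and `yes_or_no` is a hypothetical helper name, not part of pandas:

```python
def yes_or_no(text: str) -> str:
    """Return 'Yes' if text contains 'yes' in any casing, else 'No'."""
    # Casefold the whole string once, then do a plain substring search.
    return "Yes" if "yes" in text.casefold() else "No"


# Mapping str.casefold over a string yields single characters,
# so the membership test compares 'yes' against one character at a time.
sample = "YES - Father Diabetic"
chars = list(map(str.casefold, sample))
print(chars[:3])       # ['y', 'e', 's']
print("yes" in chars)  # False
print(yes_or_no(sample))  # Yes
```

If that reading is right, something like df['familyHistoryDiabetes'].apply(yes_or_no) might give the result I am after. Note that a plain substring search would also match 'yes' inside longer words such as 'eyes'.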