I have a column "methods_discussed" in a CSV file (link: https://github.com/pandas-dev/pandas/files/3496001/multiple_responses.zip) whose values are the names of family planning methods, for example:
methods_discussed
emergency
female_sterilization
male_sterilization
iud
NaN
injectables male_condoms
male_condoms
female_sterilization male_sterilization
injectables
iud male_condoms
I used df1["methods_discussed"].str.contains(pat = method), but the output does not match what I expected. Probably this is because male_sterilization is a substring of female_sterilization, so the male_sterilization column shows TRUE. This can be seen below in the Actual output, in the row with id 2. It must show FALSE there, because the methods_discussed column in that row contains only female_sterilization.
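The suspected cause can be checked with a minimal sketch like the one below. The Series `s` is a hand-built sample for illustration (it is not read from the CSV), and `result` is just an illustrative name.

```python
import pandas as pd

# Hand-built sample with three of the values shown above (illustrative only).
s = pd.Series(["emergency", "female_sterilization", "male_sterilization"])

# str.contains searches for the pattern anywhere in each string,
# so a match inside a longer word still counts as a match.
result = s.str.contains("male_sterilization").tolist()
print(result)

# Plain Python shows the same substring relationship.
is_substring = "male_sterilization" in "female_sterilization"
print(is_substring)
```

If the second element of `result` is True, the match on the female_sterilization row comes from substring matching rather than from the data itself.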
I created a list of the 8 family planning methods and looped over it:
method_names = ['female_condoms', 'emergency', 'male_condoms', 'pill', 'injectables', 'iud', 'male_sterilization', 'female_sterilization']
for method in method_names:
    df1[method] = df1["methods_discussed"].str.contains(pat=method)
df1.head(2)
Expected output
id | methods_discussed | female_condoms | emergency | male_condoms | pill | injectables | iud | male_sterilization | female_sterilization
1 | emergency | FALSE | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE
2 | female_sterilization | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | ***FALSE*** | TRUE
Actual output
id | methods_discussed | female_condoms | emergency | male_condoms | pill | injectables | iud | male_sterilization | female_sterilization
1 | emergency | FALSE | TRUE | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE
2 | female_sterilization | FALSE | FALSE | FALSE | FALSE | FALSE | FALSE | ***TRUE*** | TRUE
The code raises no error; only the output is wrong.
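One possible workaround, sketched here under the assumption that method names in the column are always separated by whitespace, is to wrap each name in regex word boundaries (`\b`) so that a name only matches as a whole word. The small `df1` below is a hand-built stand-in for the CSV data, and `na=False` is an assumption about how missing rows should be reported.

```python
import re
import pandas as pd

# Hand-built stand-in for the CSV column (illustrative only).
df1 = pd.DataFrame({
    "methods_discussed": [
        "emergency",
        "female_sterilization",
        None,  # missing response
        "female_sterilization male_sterilization",
        "injectables male_condoms",
    ]
})

method_names = ['female_condoms', 'emergency', 'male_condoms', 'pill',
                'injectables', 'iud', 'male_sterilization', 'female_sterilization']

for method in method_names:
    # \b only matches between a word character and a non-word character.
    # Letters and underscores are both word characters, so
    # "male_sterilization" no longer matches inside "female_sterilization".
    pattern = rf"\b{re.escape(method)}\b"
    # na=False reports missing rows as False instead of NaN (an assumption
    # about the desired output).
    df1[method] = df1["methods_discussed"].str.contains(pattern, regex=True, na=False)

print(df1[["methods_discussed", "male_sterilization", "female_sterilization"]])
```

With this pattern, the row containing only female_sterilization should give FALSE for male_sterilization, while a row listing both methods should give TRUE for both. If the separator in the real data is not whitespace, the pattern would need adjusting.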