I am trying to find whether a string exists in any of several columns. I would like to return 1 if the string exists and 0 if it doesn't, as a new column in the dataframe.
After searching the forums, I understand that str.contains could be used, but I'm searching over 100+ columns, so it isn't efficient for me to work with one series at a time.
There are some NAs within the columns if this is relevant.
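For context, here is a rough sketch of what applying str.contains over every column at once might look like. The frame and the helper name contains_any are made up for illustration, and it assumes every column holds only strings or NaN. Passing na=False makes missing values count as "no match", and regex=False treats the search term as a literal substring.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing value, for illustration only
df = pd.DataFrame({
    'col_a': ['AA', 'AB', np.nan],
    'col_b': ['BB', 'XAAX', 'AG'],
})

def contains_any(frame, needle):
    """Return 1 for rows where any column contains `needle` as a substring, else 0."""
    # Run str.contains on each column; na=False turns NaN into False
    hits = frame.apply(lambda col: col.str.contains(needle, regex=False, na=False))
    # Collapse the boolean frame row-wise, then convert True/False to 1/0
    return hits.any(axis=1).astype(int)

flags = contains_any(df, 'AA')
print(flags.tolist())
```

Note that this matches substrings, so 'XAAX' counts as a hit for 'AA'. If only whole-cell matches are wanted, an equality comparison is probably the better fit.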
Example simplified dataframe:
import pandas as pd

d = {'strings_1': ['AA', 'AB', 'AV'], 'strings_2': ['BB', 'BA', 'AG'],
'strings_3': ['AE', 'AC', 'AI'], 'strings_4': ['AA', 'DD', 'PP'],
'strings_5': ['AV', 'AB', 'BV']}
simple_df = pd.DataFrame(data=d)
If I am interested in finding 'AA', for example, I would like to return the following dataframe.
Example target dataframe:
d = {'strings_1': ['AA', 'AB', 'AV'], 'strings_2': ['BB', 'BA', 'AG'],
'strings_3': ['AE', 'AC', 'AI'], 'strings_4': ['AA', 'DD', 'PP'],
'strings_5': ['AV', 'AB', 'BV'], 'AA_TRUE': [1, 0, 0]}
target_df = pd.DataFrame(data=d)
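If an exact whole-cell match is enough, a minimal sketch of one way to build that flag column without looping over the columns is below. It assumes the column names are unique (the names used here are placeholders) and that 'AA' must equal the entire cell value. Comparing NaN with a string gives False, so missing values should simply not count as matches.

```python
import pandas as pd

# Same values as the example above; column names are assumed to be unique
d = {'strings_1': ['AA', 'AB', 'AV'], 'strings_2': ['BB', 'BA', 'AG'],
     'strings_3': ['AE', 'AC', 'AI'], 'strings_4': ['AA', 'DD', 'PP'],
     'strings_5': ['AV', 'AB', 'BV']}
simple_df = pd.DataFrame(data=d)

result = simple_df.copy()
# eq() compares every cell to 'AA'; any(axis=1) checks each row for at least one match
result['AA_TRUE'] = simple_df.eq('AA').any(axis=1).astype(int)
print(result)
```

The comparison runs on the original frame rather than on result, so the new flag column is never included in the search.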
Many thanks for any help.