Detecting Keys in a Column of Strings

Question

I have a dictionary of key-value pairs. I also have a data frame with a column of strings that contain various keys. If a key appears in a string in that column, I'd like to add the corresponding value to the adjacent column.

my_dict = {'elon' : 'is awesome', 'jeff' : 'is not so awesome, but hes ok, ig', 'mustard' : 'is gross', 'pigs' : 'can fly'}
my_dict

import pandas as pd
import numpy as np
pd.DataFrame({'Name (Key)' : ['elon musk', 'jeff bezos and elon musk', 'jeff bezos', 'she bought mustard for elon'], 'Corresponding Value(s)' : [np.nan, np.nan, np.nan, np.nan]})

Desired output:

pd.DataFrame({'Name (Key)' : ['elon musk', 'jeff bezos and elon musk', 'jeff bezos', 'she bought mustard for elon'], 
              'Corresponding Value(s)' : [['is awesome'], ['is not so awesome, but hes ok, ig', 'is awesome'], ['is not so awesome, but hes ok, ig'], ['is gross', 'is awesome']]})

I am new to Python, but I assume the apply function will be used for this. Or perhaps map()? Would an if statement be plausible, or is there a better way to approach this?

The pandas documentation is vast, but some web searching would have answered this. `df['New Column'] = df.replace( {'Name (Key)': my_dict} )`. — Tim Roberts, Oct 24 '22 at 23:19
@TimRoberts: I mean that your proposal won't work in the case described in the question, because the values in the 'Name (Key)' column of the DataFrame are not themselves keys in the dictionary. So it is necessary to split each string and loop over its words, checking whether each one is a dictionary key; apply with 'if word in dictionary keys:' is probably the appropriate way to create the new column. — Claudio, Oct 24 '22 at 23:41
Yeah @TimRoberts, your condescending remarks turned out to be incorrect. — user20234548, Oct 24 '22 at 23:45
@TimRoberts perhaps you should read and comprehend the words in the post before responding next time. It could help. Try it. — user20234548, Oct 24 '22 at 23:45
@user20234548: please avoid non-factual remarks. Tim Roberts (probably in a hurry) tried to help, so don't reply to a condescending remark with one of your own. I suggest deleting your two comments, just as I will delete THIS one a bit later. — Claudio, Oct 25 '22 at 00:22

Claudio · Accepted Answer · 2022-10-25T00:02:52.190

Below is an approach that uses .apply() to create the additional column. Besides the if check, it is also necessary to loop over the words of each 'Name (Key)' value, so that the lists in the new column can hold multiple items.

import pandas as pd
df = pd.DataFrame({'Name (Key)' : ['elon musk', 'jeff bezos and elon musk', 'jeff bezos', 'she bought mustard for elon']})

my_dict = {'elon' : 'is awesome', 
           'jeff' : 'is not so awesome, but hes ok, ig', 
           'mustard' : 'is gross', 
           'pigs' : 'can fly'}

def create_corr_vals_column(row_value):
    cvc_row = []
    for word in row_value.split():
        if word in my_dict:
            cvc_row.append(my_dict[word])
    return cvc_row

df['Corresponding Value(s)'] = df['Name (Key)'].apply( create_corr_vals_column )
print(df)

gives:

                    Name (Key)                           Corresponding Value(s)
0                    elon musk                                     [is awesome]
1     jeff bezos and elon musk  [is not so awesome, but hes ok, ig, is awesome]
2                   jeff bezos              [is not so awesome, but hes ok, ig]
3  she bought mustard for elon                           [is gross, is awesome]
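
Note that split() only finds whole, whitespace-separated words and compares them case-sensitively, so a value such as 'Elon,' would not match the key 'elon'. The question does not say whether such input occurs, so the following is only a minimal sketch under that assumption. The helper name lookup_values and the sample rows are hypothetical and not part of the original answer.

```python
import re
import pandas as pd

my_dict = {'elon': 'is awesome',
           'jeff': 'is not so awesome, but hes ok, ig',
           'mustard': 'is gross',
           'pigs': 'can fly'}

def lookup_values(text, mapping=my_dict):
    # Hypothetical helper: lowercase the text and pull out runs of word
    # characters, so punctuation and capitalization do not block a match.
    words = re.findall(r"\w+", text.lower())
    # Keep the values in the order the keys appear in the text.
    return [mapping[w] for w in words if w in mapping]

# Assumed sample data with capitals and punctuation (not from the question).
df = pd.DataFrame({'Name (Key)': ['Elon Musk', 'jeff bezos, and elon musk!']})
df['Corresponding Value(s)'] = df['Name (Key)'].apply(lookup_values)
print(df)
```

This still matches whole words only, so 'elons' would not count as a hit for 'elon'.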

is there a way to do this if the data frame was a data frame column was a pandas series? — user20234548, Oct 26 '22 at 03:27
Sorry, I don't understand ... data frame was a column was a series??? I suggest you try it to see yourself or ask another question about it. — Claudio, Oct 26 '22 at 11:00
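
If the comment above is asking whether this works when the strings live in a standalone pandas Series rather than in a DataFrame column, it should: df['Name (Key)'] is already a Series, and Series.apply calls the function once per element. A minimal sketch, assuming that reading of the comment; the names s and find_values and the sample values are illustrative only.

```python
import pandas as pd

my_dict = {'elon': 'is awesome',
           'jeff': 'is not so awesome, but hes ok, ig',
           'mustard': 'is gross',
           'pigs': 'can fly'}

def find_values(text):
    # Same logic as the answer's function, written as a list comprehension.
    return [my_dict[word] for word in text.split() if word in my_dict]

# A standalone Series instead of a DataFrame column (assumed sample data).
s = pd.Series(['elon musk', 'jeff bezos and elon musk', 'pigs'])
result = s.apply(find_values)  # a Series whose elements are lists
print(result)
```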
