I have an address column in a dataframe and need to extract the state from each address. Since the addresses do not follow a consistent format, I have created a list containing all the states. If an address contains any string from that list, I want to retrieve that string.
import pandas as pd

state_list = ['Punjab', 'Kerala', 'Orissa']
location_list = ['adr1, Orissa', 'adr2, Punjab', 'ad3, ppp', 'adr4: Kerala']
df = pd.DataFrame(location_list, columns=['location'])
Expected output:
location        state
adr1, Orissa    Orissa
adr2, Punjab    Punjab
ad3, ppp        NaN
adr4: Kerala    Kerala
Code tried:
any(t in x for x in location_list for t in state_list)  # returns a single True/False, not the matched state
df[df['location'].str.contains('Punjab')]  # this works for a single state
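One possible approach, sketched here under the assumption that each state name appears verbatim (same spelling and case) somewhere in the address, is to join the list into a single regex alternation and pass it to `Series.str.extract`. The `pattern` variable is introduced only for illustration; rows with no match should come back as NaN.

```python
import re
import pandas as pd

state_list = ['Punjab', 'Kerala', 'Orissa']
location_list = ['adr1, Orissa', 'adr2, Punjab', 'ad3, ppp', 'adr4: Kerala']
df = pd.DataFrame(location_list, columns=['location'])

# Build one alternation pattern with a single capture group.
# re.escape protects against regex metacharacters in state names.
pattern = '(' + '|'.join(re.escape(s) for s in state_list) + ')'

# expand=False returns a Series; the first match in each row is kept,
# and rows without any match are left as NaN.
df['state'] = df['location'].str.extract(pattern, expand=False)
print(df)
```

If the addresses may differ in case from the list, passing `flags=re.IGNORECASE` to `str.extract` is one option, though the extracted value would then keep the casing found in the address rather than the casing in `state_list`.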