I am trying to create a new column based on another column, specifically on whether it contains a certain value.
I have done the following:
df['region'] = np.where(df['location'].str.contains("AK| AZ | CA | CO | HI |ID | MT | NM | NV | OR | UT | WA | WY", na=False), "west",
    np.where(df['location'].str.contains("PA | NJ | NY | VT | NH | MA | RI | CT | ME", na=False), "northwest",
    np.where(df['location'].str.contains("AR | AL | DC | DE | FL | GA | KY | LA | MD | MS | NC | OK | SC | VA | WV", na=False), "south",
    np.where(df['location'].str.contains("IA | IL | IN | KS |MI | MN |MO | ND |NE | OH | SD | WI", na=False), "midwest", "international"))))
I am getting this:
location         region
Columbia, MO     international
Maplewood, NJ    international
Expected:
location         region
Columbia, MO     midwest
Maplewood, NJ    northwest
Basically, I have a column 'location', and I want to check whether it contains one of the abbreviations and then create a new column for the region.
Thank you!
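
A likely cause: the spaces inside each pattern are literal characters in the regular expression, so " NJ " only matches when a space sits on both sides of the abbreviation, and "MO " only matches when a space follows it. In "Columbia, MO" and "Maplewood, NJ" the abbreviation is the last thing in the string, so neither pattern matches and both rows fall through to "international". Below is a minimal sketch of one possible fix, assuming the goal is to match each abbreviation as a whole word. The sample DataFrame and the `regions` dictionary are made up for illustration, and `np.select` is swapped in for the nested `np.where` calls. The region labels, including "northwest", are kept exactly as in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the first two rows mirror the question.
df = pd.DataFrame(
    {"location": ["Columbia, MO", "Maplewood, NJ", "Denver, CO", "Paris, France", None]}
)

# Region -> abbreviations, copied from the patterns in the question
# with the stray spaces removed (this dict is an illustrative helper).
regions = {
    "west": "AK AZ CA CO HI ID MT NM NV OR UT WA WY",
    "northwest": "PA NJ NY VT NH MA RI CT ME",
    "south": "AR AL DC DE FL GA KY LA MD MS NC OK SC VA WV",
    "midwest": "IA IL IN KS MI MN MO ND NE OH SD WI",
}

# One boolean mask per region. \b is a word boundary, so "MO" matches at
# the end of "Columbia, MO" without needing a space after it. The group
# is non-capturing, (?:...), so pandas does not warn about match groups.
conditions = [
    df["location"]
    .str.contains(r"\b(?:" + "|".join(codes.split()) + r")\b", na=False)
    .to_numpy()
    for codes in regions.values()
]

# The first matching condition wins; rows with no match get the default.
df["region"] = np.select(conditions, list(regions), default="international")
print(df)
```

If the abbreviation always comes last in the string, ending the pattern with `$` instead of the second `\b` would be stricter. The match is case-sensitive, so lowercase text such as "in" inside a city name is not picked up.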