Without dropna()
Although this first solution does not use the dropna()
method, it accomplishes the same goal cleanly.
An example dataframe based on your description:
import numpy as np
import pandas as pd
In [1]: state_names = ["Alaska", "Alabama"] # substrings to find
df = pd.DataFrame(data={'place': ["10km WNW of Progreso, Mexico","26km S of Redoubt Volcano, Alaska"]})
df
Out [1]:
place
0 10km WNW of Progreso, Mexico
1 26km S of Redoubt Volcano, Alaska
The idea is to slice the dataframe using pandas' string methods and their regex support: joining the list state_names with '|' produces a pattern that in regex terms means "Alaska" or "Alabama". This lets us
select only the rows whose place column contains a state as a substring:
In [2]: df = df[df.place.str.contains('|'.join(state_names))]
df
Out [2]:
place
1 26km S of Redoubt Volcano, Alaska
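To make the intermediate steps explicit, here is a minimal self-contained sketch (using the same example data) showing the joined pattern and the boolean mask that str.contains() produces before it is used to index the dataframe:

```python
import pandas as pd

state_names = ["Alaska", "Alabama"]            # substrings to find
pattern = '|'.join(state_names)                # "Alaska|Alabama" -- regex alternation

df = pd.DataFrame({'place': ["10km WNW of Progreso, Mexico",
                             "26km S of Redoubt Volcano, Alaska"]})

# str.contains() returns a boolean Series; indexing df with it keeps
# only the rows where the mask is True.
mask = df.place.str.contains(pattern)
print(mask.tolist())   # [False, True]
print(df[mask])
```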
With dropna()
This is one way to use dropna()
to accomplish the same thing, starting again from the original dataframe. I find the first method simpler, since this one involves the extra (and unneeded) step of assigning null values to the cells that do not contain a state and then dropping rows based on the place column's null values:
In [3]: df.loc[~df.place.str.contains('|'.join(state_names), na=False), 'place'] = np.nan
df.dropna(subset=['place'])
Out [3]:
place
1 26km S of Redoubt Volcano, Alaska
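One caveat worth noting, since str.contains() treats the joined string as a regular expression by default: if any of your substrings contain regex metacharacters (e.g. a "." or parentheses), they should be escaped with re.escape() before joining. This is a precaution on top of the original approach, not something the plain state names above require:

```python
import re
import pandas as pd

# Hypothetical substrings containing a regex metacharacter (".")
names = ["St. Louis", "Alaska"]
pattern = '|'.join(re.escape(n) for n in names)  # "." is escaped to "\."

df = pd.DataFrame({'place': ["4km E of St. Louis, Missouri",
                             "10km WNW of Progreso, Mexico"]})
print(df[df.place.str.contains(pattern)])
```

Without the escaping, "St. Louis" would also match strings like "Stg Louis", because an unescaped "." matches any character.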