The DataFrame column text, which has string dtype, contains sentences. I want to extract the rows that contain certain words, regardless of where in the sentence they occur.

For example:

Column
Cat and mouse are the born enemies
Cat is a furry pet


df = df[df['cleantext'].str.contains('cat' & 'mouse')].reset_index()
df.shape

The above throws an error.

I know that for an OR condition we can write:

df = df[df['cleantext'].str.contains('cat | mouse')].reset_index()

But I want to extract only the rows where both cat and mouse are present.

Expected output:

Column
Cat and mouse are the born enemies
Dr.Chuck

1 Answer

Here's one approach, which also works for multiple words:

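import pandas as pd

words = ['cat', 'mouse']
# One boolean column per word; keep the rows where every word matches (case-insensitive)
m = pd.concat([df.Column.str.lower().str.contains(w) for w in words], axis=1).all(axis=1)
df.loc[m, :]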

      Column
0  Cat and mouse are the born enemies
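
For reference, the same idea can be written as a self-contained sketch. The helper name `rows_with_all_words` and the sample frame are made up for illustration, and the sketch assumes plain substring matching (`regex=False`) and that missing values count as non-matches (`na=False`). Neither assumption is stated in the question.

```python
import pandas as pd


def rows_with_all_words(frame, column, words):
    """Return the rows of `frame` whose `column` contains every word in `words`.

    Hypothetical helper: matching is case-insensitive and substring-based.
    """
    text = frame[column].str.lower()
    # One boolean Series per word; NaN entries are treated as "no match".
    masks = [text.str.contains(w.lower(), regex=False, na=False) for w in words]
    # A row is kept only if all of its per-word masks are True.
    mask = pd.concat(masks, axis=1).all(axis=1)
    return frame.loc[mask]


# Sample data taken from the question.
df = pd.DataFrame(
    {"Column": ["Cat and mouse are the born enemies", "Cat is a furry pet"]}
)
result = rows_with_all_words(df, "Column", ["cat", "mouse"])
print(result)
```

Because this is substring matching, 'cat' would also match a word such as 'category'. If whole-word matching is needed, a regular expression with word boundaries is one option.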
yatu