filter rows on column values with string methods

Question

Input df:

title                        desc
movie A                  It is a awesome movie with action
movie B                  Slow but intense movie.

I want to keep only the rows whose desc contains any of the following keywords:

keys =  ["awesome", "action"]

Output DF:

title                        desc
movie A                  It is a awesome movie with action

Code:

index_list = []
for index, row in df.iterrows():
    # keep the row if any keyword appears as a whole space-separated word
    if any(x in row["desc"].split(" ") for x in keys):
        index_list.append(index)

df = df.loc[index_list]

Approach:

For each row, I split desc on spaces and check whether any of the keywords is among the resulting words.

This approach works fine, but I would like to know whether there is a one-liner in pandas that achieves the same thing.

Example of the kind of one-liner I mean:

df.loc[df['column_name'].isin(some_values)]

Use `df.desc.str.contains('|'.join(keys))` – user3483203, Oct 09 '18 at 20:56

score 3 · Answer 1 · answered Oct 09 '18 at 20:56

Why yes, there is: pandas.Series.str.contains. Joining the keywords with "|" builds a regular expression that matches any one of them.

idx = df['column_name'].str.contains("|".join(keys), regex=True)
df[idx]
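
One caveat worth sketching: the joined string is treated as a regular expression, so a keyword containing characters such as "." or "+" would not be matched literally, and missing values in the column produce NaN in the mask. The snippet below is a minimal sketch of one way to handle both, assuming the column is named desc as in the question. The third row ("movie C" with a missing description) is hypothetical and was added only to illustrate na=False.

```python
import re
import pandas as pd

# Sample data from the question, plus one hypothetical row with a missing desc.
df = pd.DataFrame({
    "title": ["movie A", "movie B", "movie C"],
    "desc": ["It is a awesome movie with action", "Slow but intense movie.", None],
})
keys = ["awesome", "action"]

# Escape each keyword so regex metacharacters are matched literally.
pattern = "|".join(re.escape(k) for k in keys)

# na=False treats missing descriptions as non-matches, so the mask stays boolean.
mask = df["desc"].str.contains(pattern, regex=True, na=False)
result = df[mask]
print(result)
```

Without na=False, the row with the missing description would yield a missing value in the mask, and indexing with that mask would typically raise an error.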

answered Oct 09 '18 at 20:56 by CJR

score 1 · Answer 2 · answered Oct 09 '18 at 21:01

The following should do the trick for you:

>>> import pandas as pd
>>> d = {'title':['movie A', 'movie B'], 'desc':['It is a awesome movie with action', 'Slow but intense movie.']}
>>> df = pd.DataFrame(data=d)
>>> df
                                desc    title
0  It is a awesome movie with action  movie A
1            Slow but intense movie.  movie B
>>> keys =  ["awesome", "action"]
>>> df[df['desc'].str.contains('|'.join(keys))]
                                desc    title
0  It is a awesome movie with action  movie A
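
Note that str.contains matches substrings, whereas the loop in the question compares whole space-separated words, so the two can disagree (for example, "action" is a substring of "reaction"). If whole-word matching matters, one possible sketch is to wrap the alternation in \b word boundaries. The "movie D" row is hypothetical and exists only to show the difference. This is also not an exact equivalent of split(" "), since \b still matches a keyword that is followed by punctuation.

```python
import re
import pandas as pd

# Sample data from the question, plus one hypothetical row ("movie D").
df = pd.DataFrame({
    "title": ["movie A", "movie B", "movie D"],
    "desc": [
        "It is a awesome movie with action",
        "Slow but intense movie.",
        "A chain reaction of events",
    ],
})
keys = ["awesome", "action"]

# Plain alternation: matches the keywords anywhere, even inside longer words.
substring_pattern = "|".join(re.escape(k) for k in keys)

# Word-bounded alternation; the non-capturing group (?:...) avoids the
# pandas warning about patterns that contain match groups.
word_pattern = r"\b(?:" + substring_pattern + r")\b"

substring_hits = df[df["desc"].str.contains(substring_pattern)]
word_hits = df[df["desc"].str.contains(word_pattern)]

print(substring_hits["title"].tolist())
print(word_hits["title"].tolist())
```

Here the substring version also picks up "movie D" because of "reaction", while the word-bounded version keeps only "movie A".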
