How to conditionally select DataFrame rows by comparing column values with a list

Question

I have the following list and dataframe in Python:

import pandas as pd

my_list = ["a", "b", "d"]

d = {'col1': [1, 2, 3, 4], 'col2': ["a", "b", "c", "d"]}
df = pd.DataFrame(data=d)
df

Output:

    col1    col2
0   1       a
1   2       b
2   3       c
3   4       d

But I only want to keep the rows where the value in col2 also exists in my_list.

The final output is supposed to look like the following:

    col1    col2
0   1       a
1   2       b
2   4       d

How can I achieve this without using a for loop?

score 1 · Accepted Answer · answered Apr 19 '22 at 15:34


You can use apply on the column (Series.apply) to build a boolean mask and index the dataframe with it:

df[df['col2'].apply(lambda x: x in my_list)]
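The lambda above runs once per row in Python. A vectorized alternative, not part of the original answer, is Series.isin, which builds the same boolean mask in one call. The sketch below reuses the question's data; the names mask and result are only for illustration, and reset_index(drop=True) is added on the assumption that you want the index renumbered 0, 1, 2 as in the desired output.

```python
import pandas as pd

my_list = ["a", "b", "d"]
df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": ["a", "b", "c", "d"]})

# Boolean mask: True where the col2 value is one of the items in my_list
mask = df["col2"].isin(my_list)

# Keep only the matching rows; drop=True discards the old index
# so the remaining rows are renumbered starting from 0
result = df[mask].reset_index(drop=True)
print(result)
```

Without the reset_index call, the filtered rows keep their original labels (0, 1, 3).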


Akash garg

score 1 · Answer 2 · answered Apr 19 '22 at 15:37


df.query(f"col2 in {my_list}")
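The f-string version works here because the list's repr happens to be valid query syntax. As a hedged variant, query can also reference a Python variable directly with the @ prefix, which avoids pasting the list into the expression string. A minimal sketch with the question's data (result is an illustrative name):

```python
import pandas as pd

my_list = ["a", "b", "d"]
df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": ["a", "b", "c", "d"]})

# "@my_list" makes query() look up the variable my_list
# in the calling scope instead of parsing a literal list
result = df.query("col2 in @my_list")
print(result)
```

The rows keep their original index labels (0, 1, 3); chain .reset_index(drop=True) if you need them renumbered.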


Reuben
