I have the following list and dataframe in Python:

import pandas as pd

my_list = ["a", "b", "d"]

d = {'col1': [1, 2, 3, 4], 'col2': ["a", "b", "c", "d"]}
df = pd.DataFrame(data=d)
df

Output:

    col1    col2
0   1       a
1   2       b
2   3       c
3   4       d

However, I only want to keep the rows whose col2 value also appears in my_list.

The final output is supposed to look like the following:

    col1    col2
0   1       a
1   2       b
2   4       d

How can I achieve this without using a for loop?

edn

2 Answers

You can use apply on the column (Series.apply) to build a boolean mask for this:

df[df['col2'].apply(lambda x: x in my_list)] 
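For reference, here is a minimal self-contained sketch of that approach, using the data from the question. The Series.isin line is an alternative that is not part of the answer above; it should build the same mask without a Python-level lambda. The reset_index call is an assumption about what the asker wants, since the desired output shows the index renumbered as 0, 1, 2.

```python
import pandas as pd

my_list = ["a", "b", "d"]
df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": ["a", "b", "c", "d"]})

# Boolean mask: True where the col2 value is found in my_list.
mask = df["col2"].apply(lambda x: x in my_list)
filtered = df[mask]

# Alternative (not from the answer above): Series.isin builds the same mask.
filtered_isin = df[df["col2"].isin(my_list)]

# Filtering keeps the original index labels (0, 1, 3);
# reset_index(drop=True) renumbers the rows from 0.
result = filtered.reset_index(drop=True)
print(result)
```

Without reset_index, the last remaining row keeps its original label 3 rather than 2.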
Akash garg
df.query(f"col2 in {my_list}")
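As a rough sketch of how this runs end to end (data taken from the question): the f-string pastes the list's repr into the expression, so query sees col2 in ['a', 'b', 'd']. The @my_list form is a variation not shown in the answer; it lets query look up the local variable itself, which avoids relying on the repr being a valid expression.

```python
import pandas as pd

my_list = ["a", "b", "d"]
df = pd.DataFrame({"col1": [1, 2, 3, 4], "col2": ["a", "b", "c", "d"]})

# The answer's form: the list literal is embedded in the query string.
via_fstring = df.query(f"col2 in {my_list}")

# Variation (an assumption, not from the answer): reference the variable with @.
via_at = df.query("col2 in @my_list")

print(via_at)
```

Both calls return a new dataframe and leave df unchanged.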
Reuben