5

It's a generic question about filtering a pandas dataframe using a list. The problem is the following:

  • I have a pandas dataframe df with a column field
  • I have a list of banned fields, for example ban_field=['field1','field2','field3']
  • All elements of ban_field appear in df.field

For the moment, to retrieve the dataframe without the banned field, I proceed as follows:

for f in ban_field:
    df = df[df.field!=f]

Is there a more pythonic way to proceed (in one line?)?

Assem
  • 11,574
  • 5
  • 59
  • 97
Colonel Beauvel
  • 30,423
  • 11
  • 47
  • 87

2 Answers2

9

Method #1: use isin and a boolean array selector:

In [47]: df = pd.DataFrame({"a": [2]*10, "field": range(10)})

In [48]: ban_field = [3,4,6,7,8]

In [49]: df[~df.field.isin(ban_field)]
Out[49]: 
   a  field
0  2      0
1  2      1
2  2      2
5  2      5
9  2      9

[5 rows x 2 columns]

Method #2: use query:

In [51]: df.query("field not in @ban_field")
Out[51]: 
   a  field
0  2      0
1  2      1
2  2      2
5  2      5
9  2      9

[5 rows x 2 columns]
DSM
  • 342,061
  • 65
  • 592
  • 494
1

You can remove it by using the isin function and the negation (~) operator.

df[~df.field.isin(ban_field)]
Shashank Agarwal
  • 2,769
  • 1
  • 22
  • 24