I can do something like:

data = df[ df['Proposal'] != 'C000' ]

to remove all rows whose Proposal is the string C000, but how can I do something like:

data = df[ df['Proposal'] not in ['C000','C0001'] ]

to remove all rows whose Proposal matches either C000 or C0001 (and so on)?

rafaelc
yee379

2 Answers

You can drop the matching rows by index:

df = df.drop(df[df['Proposal'].isin(['C000','C0001'])].index)

Or, to select only the rows you want to keep:

df = df[~df['Proposal'].isin(['C000','C0001'])]
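As a rough, self-contained sketch of the two approaches above: the sample DataFrame, the `Value` column, and the variable names `excluded`, `dropped`, and `kept` are made up for illustration; only the `Proposal` column and the two codes come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the 'Proposal' column name is from the question.
df = pd.DataFrame({
    "Proposal": ["C000", "C0001", "C0002", "X123"],
    "Value": [1, 2, 3, 4],
})
excluded = ["C000", "C0001"]

# Approach 1: look up the index labels of the unwanted rows and drop them.
dropped = df.drop(df[df["Proposal"].isin(excluded)].index)

# Approach 2: build a boolean mask, negate it with ~, and keep the rest.
kept = df[~df["Proposal"].isin(excluded)]

print(kept)
```

With a unique index both forms give the same result. If the index contains duplicate labels, `drop` removes every row sharing a dropped label, so the boolean-mask form is probably the safer default.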
E. Zeytinci
import numpy as np

# Keep only the rows whose 'Proposal' is NOT in the given set
data = df.loc[np.logical_not(df['Proposal'].isin({'C000', 'C0001'})), :]
# or, equivalently, using the ~ operator
data = df.loc[~df['Proposal'].isin({'C000', 'C0001'}), :]
S.V
  • Can you explain how your answer works? – rassar Dec 07 '18 at 22:07
  • `isin` checks whether each value of the Series is in some set (i.e. 'in'), `np.logical_not` or `~` negates the result (i.e. 'not in'), and `loc` selects rows of a DataFrame using the resulting boolean array. – S.V Dec 07 '18 at 22:15
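
The three steps described in that comment can be sketched one at a time. This is only an illustration: the sample data and the intermediate names `in_set`, `not_in_set`, and `same` are assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"Proposal": ["C000", "C0001", "C0002"], "Value": [10, 20, 30]})

# Step 1: isin returns a boolean Series, True where the value is in the set.
in_set = df["Proposal"].isin({"C000", "C0001"})

# Step 2: negate the mask; ~ and np.logical_not are interchangeable here.
not_in_set = ~in_set
same = np.logical_not(in_set)

# Step 3: loc selects the rows where the mask is True (and all columns).
data = df.loc[not_in_set, :]

print(data)
```

Naming the intermediate masks like this can make it easier to inspect each step when a filter does not return the rows you expect.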