
I am trying to drop rows from my dataframe that meet a set of conditions. However, it doesn't seem to be working.

Below are the two versions I have tried so far without success:

Attempt 1

df = df.drop(df[(df['Factorization'] != 0.5) & (df['Value'] != 30) & (df['Total'] == None)].index)

Attempt 2

df.drop(df[(df['Factorization'] != 0.5) & (df['Value'] != 20) & (df['Total'] == None)].index, inplace = True)

Can someone please point out where I am going wrong?

zabop
windwalker
  • Please provide an example which is reproducible. – zabop Aug 27 '20 at 14:28
  • Does this answer your question? [Drop rows on multiple conditions in pandas dataframe](https://stackoverflow.com/questions/52456874/drop-rows-on-multiple-conditions-in-pandas-dataframe) – zabop Aug 27 '20 at 14:29
  • What is the error you get? – drops Aug 27 '20 at 14:30
  • I don't actually get an error. I run print(len(df.index)) to check the number of rows after the code, and that is where I notice that the row count is the same as before running df.drop – windwalker Aug 27 '20 at 14:33
  • [This](https://stackoverflow.com/help/minimal-reproducible-example) is useful. – zabop Aug 27 '20 at 14:34
  • https://stackoverflow.com/questions/52456874/drop-rows-on-multiple-conditions-in-pandas-dataframe This confirmed that my attempts should work in theory. However, I have found that the problem lies in (df['Total'] == None). When I remove it, the row count changes. I now need to figure out why (df['Total'] == None) prevents any rows from being dropped – windwalker Aug 27 '20 at 14:44

1 Answer

One way around it is to skip drop altogether and instead reassign df to the rows that do not match the conditions, by putting ~ in front of the combined condition:

df = df[~((df['Factorization'] != 0.5) & (df['Value'] != 30) & (df['Total'].isna()))]

I may or may not be using one too many () in the code. Please give it a try
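As for why nothing was dropped in the question: pandas evaluates df['Total'] == None element by element, and that comparison comes out False for every row, including rows where the value is missing, so the combined mask never selects anything. Missing values are detected with isna() instead. The following is a minimal sketch on made-up data (only the column names come from the question; the values are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: column names are from the question, values are invented.
df = pd.DataFrame({
    "Factorization": [0.5, 0.7, 0.9, 0.7],
    "Value": [30, 10, 30, 15],
    "Total": [None, np.nan, 5.0, 12.0],
})

# Elementwise comparison with None is False for every row,
# even the rows where Total is missing.
eq_none = df["Total"] == None  # noqa: E711

# isna() flags missing values (both None and NaN).
is_missing = df["Total"].isna()

# Rows to remove: all three conditions hold at the same time.
to_drop = (df["Factorization"] != 0.5) & (df["Value"] != 30) & is_missing

# Two equivalent ways to remove them.
filtered = df[~to_drop]
dropped = df.drop(df[to_drop].index)

print(eq_none.any())
print(len(df), len(filtered))
```

Both filtered and dropped keep the same rows here, so the choice between boolean indexing and drop is a matter of taste.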

emiljoj
  • df = df[~(df['Factorization'] != 0.5) & (df['Value'] > 10) & (df['Total'] != None)] Great help thanks. Adding ~ ahead of the df seems to have done the trick – windwalker Aug 27 '20 at 15:00
  • I'm glad. I would appreciate it if you accepted it as the solution – emiljoj Aug 27 '20 at 15:06
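One thing to watch with the version in the comment above: ~ binds more tightly than &, so ~(a) & (b) negates only the first condition rather than the whole expression. To negate everything, the combined condition needs its own parentheses. A small sketch with two made-up boolean Series (a and b are hypothetical stand-ins for the individual conditions):

```python
import pandas as pd

# Hypothetical stand-ins for two row conditions.
a = pd.Series([True, True, False, False])
b = pd.Series([True, False, True, False])

# Parsed as (~a) & b: only the first condition is negated.
negate_first_only = ~a & b

# The whole combined condition is negated.
negate_whole = ~(a & b)

# De Morgan's law gives an equivalent form of the full negation.
de_morgan = ~a | ~b

print(negate_first_only.tolist())
print(negate_whole.tolist())
```

The two results differ, so it is worth checking which of them the filter is meant to express.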