DataFrame.drop not dropping expected rows in Pandas

Question

I have a Pandas DataFrame that includes rows that I want to drop based on values in a column "population":

data['population'].value_counts()

general population                          21
developmental delay                         20
sibling                                      2
general population + developmental delay     1
dtype: int64

Here, I want to drop the two rows that have "sibling" as the value. I believed the following would do the trick:

data = data.drop(data.population=='sibling', axis=0)

It does drop 2 rows, as you can see in the resulting value counts, but they are not the rows with the specified value.

data.population.value_counts()

developmental delay                         20
general population                          19
sibling                                      2
general population + developmental delay     1
dtype: int64

Any idea what is going on here?

This is kind of a duplicate of http://stackoverflow.com/questions/18172851 — naught101, Jun 10 '14 at 05:09

joaquin · Answer 1 · 2013-11-02T13:28:01.187

7

DataFrame.drop accepts index labels (a single label or a list of labels) as its parameter, not a boolean mask.
To use drop, convert the mask to index labels first:

data = data.drop(data.index[data.population == 'sibling'])

However, it is much simpler to filter with the boolean mask directly:

data = data[data.population != 'sibling']
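To check that the two approaches above give the same result, here is a minimal sketch on a small made-up frame. The `score` column and the row values are assumptions for illustration only; only the `population` column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (values invented for illustration)
data = pd.DataFrame({
    "population": ["general population", "sibling", "developmental delay",
                   "sibling", "general population"],
    "score": [1, 2, 3, 4, 5],
})

# Boolean mask: True for the rows we want to get rid of
mask = data["population"] == "sibling"

# Option 1: turn the mask into index labels, then pass those labels to drop
dropped = data.drop(data.index[mask])

# Option 2: keep only the rows where the mask is False
filtered = data[~mask]

print(dropped["population"].value_counts())
print(dropped.equals(filtered))
```

Both variants leave the surviving rows with their original index labels.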

edited Nov 02 '13 at 13:28

answered Nov 02 '13 at 13:22


You can do a similar thing with a condition over multiple columns: `data = data[(data[['col1','col2','col3']] != 0).all(axis=1)]` - to drop all rows with zeros in at least one of those columns. – naught101 Jun 10 '14 at 05:07
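A small sketch of the multi-column condition from the comment above, run on invented data. The column names `col1`, `col2`, `col3` come from the comment; the `other` column and all values are assumptions added to show that columns outside the selection are ignored.

```python
import pandas as pd

# Hypothetical data: "other" is all zeros but is not part of the check
df = pd.DataFrame({
    "col1": [1, 0, 3, 4],
    "col2": [5, 6, 0, 8],
    "col3": [9, 10, 11, 12],
    "other": [0, 0, 0, 0],
})

cols = ["col1", "col2", "col3"]

# True only for rows where every selected column is non-zero
keep = (df[cols] != 0).all(axis=1)

# Rows with a zero in at least one selected column are filtered out
result = df[keep]
print(result)
```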

caution: method 1 will not re-index the data! – Subspacian Feb 04 '16 at 15:56
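To illustrate the re-indexing point, here is a minimal sketch on made-up data. It assumes a consecutive 0..n-1 index is wanted afterwards; `reset_index(drop=True)` is one way to get it.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data
data = pd.DataFrame({
    "population": ["general population", "sibling",
                   "developmental delay", "sibling"],
})

# Drop by label: the surviving rows keep their original labels (0 and 2 here)
kept = data.drop(data.index[data["population"] == "sibling"])
print(kept.index.tolist())

# Renumber the rows 0..n-1; drop=True discards the old labels
# instead of adding them back as a column
renumbered = kept.reset_index(drop=True)
print(renumbered.index.tolist())
```

In this sketch the boolean-filter form, `data[data.population != 'sibling']`, keeps the original labels in the same way, so the same `reset_index` call applies there too.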
