
Could someone please explain why loc behaves differently from what I expect?

The code is:

educated_less = df.loc[ ~df['education'].isin(['Masters', 'Bachelors', 'Doctorate'])]

I expected loc to return only the 'education' column, filtered by the isin condition. Instead, it returns the entire df DataFrame with all of its columns, with the isin condition applied to the rows.

Henry Ecker
Nikita
  • Welcome Nikita, we encourage researching your issue before posting. Please read the pandas user guide section on [indexing with isin](https://pandas.pydata.org/docs/user_guide/indexing.html#indexing-with-isin). Notice that `.loc` can select both rows and columns, as in `.loc[rows, columns]`. You are skipping the column argument by writing `.loc[rows]`, so all columns are returned by default. – RichieV Aug 10 '20 at 07:35

1 Answer


Because .loc indexing has two parts:

df.loc[left_part, right_part]

left_part <- selects which rows to keep (by index label or boolean mask)
right_part <- selects which columns to keep

You are missing the right_part, so all columns are returned. To keep only the 'education' column, you can do:

educated_less = df.loc[ ~df['education'].isin(['Masters', 'Bachelors', 'Doctorate']), ['education']]
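
For a quick check of the difference, here is a minimal sketch. The sample data and the 'age' column are made up for illustration; only the 'education' column name and the isin list come from the question. It also shows that passing a list of columns gives a DataFrame, while a single label gives a Series.

```python
import pandas as pd

# Hypothetical sample data; only the 'education' column name comes from the question.
df = pd.DataFrame({
    "age": [25, 38, 52, 41],
    "education": ["HS-grad", "Masters", "Some-college", "Bachelors"],
})

# Boolean row mask: True where education is NOT one of the listed degrees.
mask = ~df["education"].isin(["Masters", "Bachelors", "Doctorate"])

rows_only = df.loc[mask]                  # rows filtered, all columns kept
one_col_df = df.loc[mask, ["education"]]  # list of columns -> DataFrame
one_col_s = df.loc[mask, "education"]     # single label -> Series

print(rows_only.columns.tolist())
print(one_col_df.columns.tolist())
print(type(one_col_s).__name__)
```

If a Series is enough, the single-label form is usually the simpler choice; the list form is useful when the result has to stay a DataFrame.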
YOLO