0

I have a large dataframe with a column populated by NaN and integers. I identified the rows that are NOT empty (i.e. return True for notnull()):

df.loc[df.score.notnull()]

How do I remove these rows and keep the rows with missing values?

This code doesn't work:

df.drop(df.score.notnull())
cody
KRDavis

2 Answers

0

You could use df.loc[df.score.isnull()] or df.loc[~df.score.notnull()].
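As a rough illustration of the two expressions above, here is a minimal sketch on a small made-up dataframe. Only the column name `score` comes from the question; the `name` column and the values are assumptions for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name `score` is from the question.
df = pd.DataFrame(
    {"name": ["a", "b", "c", "d"], "score": [1.0, np.nan, 3.0, np.nan]}
)

# Keep only the rows where `score` is missing.
missing_a = df.loc[df.score.isnull()]

# Same result: negate the "not null" mask with ~.
missing_b = df.loc[~df.score.notnull()]

print(missing_a)
```

Both expressions return a new dataframe and leave `df` itself unchanged, so the result has to be assigned to a name if you want to keep it.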

Joe Patten
  • That only identifies the null, thus flip-flopping what I have already done. – KRDavis Dec 31 '18 at 23:35
  • I assumed that when you said "How do I remove these rows and keep the rows with missing values?" you meant you wanted to keep the rows with null values. Please provide us with an example to show what "missing values" means to you. – Joe Patten Dec 31 '18 at 23:41
  • Yep, that's the question. We're on the same page there. :-) – KRDavis Jan 01 '19 at 02:03
  • Thanks for everyone's input. So removing the loc syntax is what enabled me to keep only the null values in the df. – KRDavis Jan 01 '19 at 02:08
  • Actually, `df.loc[df.score.isnull()]` and `df[df.score.isnull()]` do the same thing. You might have misread it or typed it in wrong. – Joe Patten Jan 02 '19 at 03:20
  • See this answer: https://stackoverflow.com/a/38886211/8345749 – Joe Patten Jan 02 '19 at 03:25
0

Assuming you want the result in the same dataframe, you could reassign it:

 df = df[df.score.isnull()]
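If you would rather stay with `drop`, note that it expects index labels rather than a boolean mask, which is probably why the attempt in the question fails. A minimal sketch, using made-up values and assuming the index labels are unique, that compares the two approaches:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column name `score` is from the question.
df = pd.DataFrame({"score": [10.0, np.nan, 7.0, np.nan]})

# Option 1: boolean indexing, keeping the rows where `score` is missing.
kept = df[df.score.isnull()]

# Option 2: pass the index labels of the non-null rows to drop().
dropped = df.drop(df[df.score.notnull()].index)

print(kept.equals(dropped))
```

Either result can be assigned back to `df` to overwrite the original.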
Polkaguy6000