How to drop rows where half of the information is missing in pandas

Question

I have some data, and I want to delete the rows where half of the information is missing.

Employee_name
employee: ahmad
employee: ali
employee:
employee: abc
employee:

I want to delete every employee record whose name is missing.

score 1 · Accepted Answer · answered Apr 17 '20 at 10:56

df.loc[df.Employee_name.str.strip().str.strip('employee:').ne('')]
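Note that `str.strip('employee:')` treats its argument as a set of characters to remove from both ends, not as a literal prefix. A value such as `'employee: lee'` is therefore reduced to a single space, and the filter keeps it only by accident. A minimal sketch of a prefix-based variant, reusing the sample data from the question (the names `names` and `result` are illustrative, not part of the original answer):

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'Employee_name': ['employee: ahmad',
                                     'employee: ali',
                                     'employee:',
                                     'employee: abc',
                                     'employee:']})

# Remove the literal prefix at the start of each string, then trim
# surrounding whitespace so that only the name part is left.
names = (df['Employee_name']
         .str.replace(r'^employee:', '', regex=True)
         .str.strip())

# Keep only the rows where a name is actually present.
result = df.loc[names.ne('')]
print(result)
```

Anchoring the pattern with `^` keeps the replacement from touching anything other than the leading `employee:` label.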

Allen Qin


score 0 · Answer 2 · answered Apr 17 '20 at 10:49

Try converting the empty strings to NaN first, so that they are treated as missing data:

import numpy as np
df = df.replace('', np.nan)

as suggested by jezrael in Pandas: remove rows with missing data
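In the question's data the empty cells still contain the text `employee:` rather than an empty string, so the replacement has to match that pattern before `dropna` can remove the rows. A sketch of the complete step, assuming that a cell holding only the prefix (plus optional whitespace) counts as missing:

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({'Employee_name': ['employee: ahmad',
                                     'employee: ali',
                                     'employee:',
                                     'employee: abc',
                                     'employee:']})

# Turn cells that consist only of the prefix into NaN.
df = df.replace(r'^\s*employee:\s*$', np.nan, regex=True)

# Drop the rows whose Employee_name is now NaN.
df = df.dropna(subset=['Employee_name'])
print(df)
```

Passing `subset` limits the check to the name column, which matters once the frame has other columns that may legitimately contain NaN.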

enricw


wwnde · Answer 3 · 2020-04-17T11:01:42.830

Another way.

Data

import pandas as pd

df = pd.DataFrame({'Employee_name': ['employee: ahmad',
                                     'employee: ali',
                                     'employee:',
                                     'employee: abc',
                                     'employee:']})
df

Extract strings after : and drop NaN

df['Employee']=df['Employee_name'].str.extract(r'(?<=\:)(\s+[a-z]+)')
df.dropna()

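Regex explanation: (?<=X)(Y) captures Y only if X comes immediately before it. Here X is the colon, and Y is one or more whitespace characters (\s+) followed by one or more lowercase letters ([a-z]+). Rows with nothing after the colon do not match, so they get NaN and are removed by dropna().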
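Because `[a-z]+` matches lowercase letters only and the capture group includes the leading whitespace, a capitalised name such as `Ahmad` would come back as NaN, and matched names keep a leading space. A hedged variant, assuming that any non-blank text after the colon counts as a name (the capitalised first row and the name `cleaned` are changes made here for illustration):

```python
import pandas as pd

# Question's sample data, with the first name capitalised for illustration.
df = pd.DataFrame({'Employee_name': ['employee: Ahmad',
                                     'employee: ali',
                                     'employee:',
                                     'employee: abc',
                                     'employee:']})

# Skip the colon and any whitespace, then capture from the first
# non-space character to the end. expand=False returns a Series.
df['Employee'] = df['Employee_name'].str.extract(r':\s*(\S.*)', expand=False)

# Rows with nothing after the colon did not match and hold NaN.
cleaned = df.dropna(subset=['Employee'])
print(cleaned)
```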

score 0 · Answer 4 · answered Apr 17 '20 at 10:53

There are multiple ways to achieve that. One is to filter with a boolean mask, as suggested here:

data = data[data.employee != '']

Another approach is to look up the index of the empty rows and drop them:

import pandas as pd

data = pd.DataFrame({'employee': ['Ali', '', 'Amed', '', '', 'abc']})
# Index labels of the rows whose employee field is an empty string
delRows = data[data['employee'] == ''].index
data.drop(delRows, inplace=True)
data

The output is:

    employee
0   Ali
2   Amed
5   abc
