I have some data and I want to delete the rows where part of the information is missing.
Employee_name
employee: ahmad
employee: ali
employee:
employee: abc
employee:
I want to delete all employee records whose name is missing.
Try adding:
import numpy as np
df = df.replace('', np.nan)
as suggested by jezrael in "Pandas: remove rows with missing data". Once the empty strings are NaN, df.dropna() will remove those rows.
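A minimal sketch of that approach, assuming the missing names are stored as empty strings (the sample frame below is made up for illustration and is not the asker's actual data):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: empty strings stand in for missing names.
df = pd.DataFrame({'Employee_name': ['ahmad', 'ali', '', 'abc', '']})

# Turn empty strings into NaN so pandas treats them as missing values.
df = df.replace('', np.nan)

# Drop every row whose Employee_name is now NaN.
df = df.dropna(subset=['Employee_name'])
print(df)
```

Passing subset= limits the check to the name column, so rows are not dropped because of gaps in other columns.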
Another way
Data
import pandas as pd
df=pd.DataFrame({'Employee_name':
['employee: ahmad',
'employee: ali',
'employee:',
'employee: abc',
'employee:'] })
df
Extract strings after : and drop NaN
df['Employee']=df['Employee_name'].str.extract(r'(?<=\:)(\s+[a-z]+)')
df.dropna()
Regex explanation
(?<=X)(Y) means: get Y if X precedes it
X is :
Y is one or more whitespace characters \s+ followed by one or more lowercase letters [a-z]+
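To see how the lookbehind behaves on each row, here is a small sketch on a standalone Series (the variable names s, names and kept are illustrative only). Note that the captured group includes the leading space matched by \s+, so it is stripped afterwards:

```python
import pandas as pd

s = pd.Series(['employee: ahmad',
               'employee: ali',
               'employee:',
               'employee: abc',
               'employee:'])

# With one capture group and expand=False, str.extract returns a Series.
# Rows where nothing follows the colon do not match and become NaN.
names = s.str.extract(r'(?<=:)(\s+[a-z]+)', expand=False)

# The match still carries the whitespace captured by \s+, so remove it.
names = names.str.strip()

# Keep only the rows where a name was actually found.
kept = names.dropna()
print(kept)
```

The pattern only accepts lowercase letters, so names containing capitals or digits would need a wider character class.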
Output
There are multiple ways to achieve that. You can do it with the command below, as suggested here:
data = data[data.employee != '']
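If some of the "empty" cells might actually contain only spaces, a variation of the same boolean mask can strip whitespace before comparing. This is only a sketch on made-up data, assuming the column holds strings:

```python
import pandas as pd

# Hypothetical data: one truly empty cell and one that contains only spaces.
data = pd.DataFrame({'employee': ['Ali', '', 'Amed', '   ', 'abc']})

# Build a boolean mask that is True where a non-blank name is present.
has_name = data['employee'].str.strip() != ''

# Keep only the rows selected by the mask.
data = data[has_name]
print(data)
```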
Another approach is to find the index of the empty rows and drop them:
import pandas as pd
data = pd.DataFrame( {'employee' : ['Ali', '', 'Amed', '', '', 'abc']}, columns = ['employee'])
delRows = data[ data['employee'] == '' ].index
data.drop(delRows , inplace=True)
data
The output is:
  employee
0      Ali
2     Amed
5      abc