You can use to_numeric with errors='coerce' to turn non-numeric values into NaN, then notnull and boolean indexing to keep only the rows that parsed:
print (pd.to_numeric(df.b, errors='coerce'))
0 26190.0
1 NaN
2 580.0
Name: b, dtype: float64
print (pd.to_numeric(df.b, errors='coerce').notnull())
0 True
1 False
2 True
Name: b, dtype: bool
df = df[pd.to_numeric(df.b, errors='coerce').notnull()]
print (df)
a b
0 1 26190
2 5 580
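The steps above can be put together as a self-contained sketch. The sample DataFrame here is an assumption, reconstructed from the printed output (column b is assumed to hold strings); the names mask and numeric_rows are only illustrative.

```python
import pandas as pd

# Assumed sample data, reconstructed from the printed output above.
df = pd.DataFrame({"a": [1, 5, 5], "b": ["26190", "python", "580"]})

# Values that cannot be parsed become NaN; notnull() marks the rows that parsed.
mask = pd.to_numeric(df["b"], errors="coerce").notnull()

# Boolean indexing keeps only the rows where the mask is True.
numeric_rows = df[mask]
print(numeric_rows)
```

Note that this only filters the rows: under the assumption that b holds strings, the surviving values in b are still strings, not numbers.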
Another solution, from Boud's comment: use to_numeric with dropna, then convert the column to int with astype:
df.b = pd.to_numeric(df.b, errors='coerce')
df = df.dropna(subset=['b'])
df.b = df.b.astype(int)
print (df)
a b
0 1 26190
2 5 580
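If this cleanup is needed for more than one column, the same three steps could be wrapped in a small function. This is a minimal sketch, not part of the original answer: keep_numeric is a hypothetical helper name, and the sample data is again assumed from the output shown above.

```python
import pandas as pd

def keep_numeric(df, column):
    """Hypothetical helper: drop rows where `column` is not numeric, then cast it to int."""
    out = df.copy()  # work on a copy so the caller's DataFrame is left unchanged
    out[column] = pd.to_numeric(out[column], errors="coerce")  # bad values -> NaN
    out = out.dropna(subset=[column])                          # remove the NaN rows
    out[column] = out[column].astype(int)                      # safe now: no NaN left
    return out

# Assumed sample data, reconstructed from the printed output above.
df = pd.DataFrame({"a": [1, 5, 5], "b": ["26190", "python", "580"]})
clean = keep_numeric(df, "b")
print(clean)
print(clean.dtypes)
```

The cast has to come after dropna, because a column that still contains NaN cannot be converted with astype(int). Also keep in mind that astype(int) truncates fractional values rather than rejecting them, so this sketch assumes the valid entries are whole numbers.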
If you need to check all rows with bad data instead, use isnull to select the rows where to_numeric returns NaN (run this on the original DataFrame, before any filtering):
print (pd.to_numeric(df.b, errors='coerce').isnull())
0 False
1 True
2 False
Name: b, dtype: bool
print (df[pd.to_numeric(df.b, errors='coerce').isnull()])
a b
1 5 python
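As a rough sketch of how the same mask can serve both purposes, the example below splits the data into bad and good rows in one pass. The sample DataFrame is an assumption based on the output above, and bad_mask, bad_rows and good_rows are illustrative names.

```python
import pandas as pd

# Assumed sample data, reconstructed from the printed output above.
df = pd.DataFrame({"a": [1, 5, 5], "b": ["26190", "python", "580"]})

# True where the value in b could not be parsed as a number.
bad_mask = pd.to_numeric(df["b"], errors="coerce").isnull()

bad_rows = df[bad_mask]    # rows to inspect or report
good_rows = df[~bad_mask]  # rows that are safe to convert

print(bad_rows)
print(good_rows)
```

One caveat: a value that was already missing in the original column also ends up as NaN after to_numeric, so it is reported as bad too. If only unparseable text should be flagged, the mask can be combined with df["b"].notnull().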