What I would do is:
df['count'] = pd.to_numeric(df['count'], errors='coerce')
After that, your column will have dtype np.float64, and anything that could not be converted to a float will be np.nan.
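Before deciding what to do with the missing values, it may help to look at which entries were actually coerced. The following is only a minimal sketch: the sample Series `raw` and the name `bad` are made up for illustration and are not from your data.

```python
import pandas as pd

# Hypothetical sample values, stored as plain Python objects
raw = pd.Series(['4', 'nan', '227.0', 'abc', None], dtype=object, name='count')

converted = pd.to_numeric(raw, errors='coerce')
print(converted.dtype)  # float64

# Original entries that ended up as NaN after coercion
bad = raw[converted.isna()]
print(bad)
```

If `bad` contains values you did not expect to be missing, that is usually a sign the input needs cleaning before the conversion.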
A common way to convert such a column to int is to choose a value to replace NaN. That is application-dependent, of course, but since your column name is 'count', a value of -1 could perhaps be adequate.
Alternatively, you can use pandas' nullable integer dtype ('Int64'), which keeps missing values as <NA> instead of requiring a placeholder.
Example:
import pandas as pd
df = pd.DataFrame('4 nan nan 1 nan 227.0 1 8 None'.split(), columns=['count'])
>>> df
   count
0      4
1    nan
2    nan
3      1
4    nan
5  227.0
6      1
7      8
8   None
Method 1: convert to numeric, then to int with -1 to indicate "bad value":
newdf = df.assign(
    count=pd.to_numeric(df['count'], errors='coerce')
    .fillna(-1)
    .astype(int)
)
>>> newdf
   count
0      4
1     -1
2     -1
3      1
4     -1
5    227
6      1
7      8
8     -1
Method 2: convert to 'Int64' (nullable integer):
newdf = df.assign(
    count=pd.to_numeric(df['count'], errors='coerce')
    .astype('Int64')
)
>>> newdf
   count
0      4
1   <NA>
2   <NA>
3      1
4   <NA>
5    227
6      1
7      8
8   <NA>
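One practical difference between the two methods shows up as soon as you aggregate. The sketch below uses the same sample values as above (held as plain Python strings); the names `with_sentinel` and `nullable` are only for illustration.

```python
import pandas as pd

# Same sample values as in the example, kept as Python strings
raw = pd.Series('4 nan nan 1 nan 227.0 1 8 None'.split(), dtype=object)
numeric = pd.to_numeric(raw, errors='coerce')

with_sentinel = numeric.fillna(-1).astype(int)  # Method 1
nullable = numeric.astype('Int64')              # Method 2

# The four -1 placeholders are summed like real values ...
print(with_sentinel.sum())                      # 237
# ... unless they are filtered out first
print(with_sentinel[with_sentinel >= 0].sum())  # 241
# <NA> entries are skipped by default (skipna=True)
print(nullable.sum())                           # 241
```

So with Method 1 you have to remember to exclude the placeholder before computing sums or means, while Method 2 handles that for you. Note also that `.astype(int)` silently truncates non-integer floats such as 2.5, whereas `.astype('Int64')` raises an error for them, so the two methods behave differently if the column may contain fractional values.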