How can I effectively remove non-numeric values from a DataFrame column? Here is a code snippet that removes all non-numeric characters:
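In [1]: import pandas as pd
# the two integer rows stay ints; the two garbled rows are quoted as strings
dataset = pd.DataFrame([[653051], [653053], ['90 <––9785'], ['<–{uWÕ¨']], columns=['column'])
dataset.column = dataset.column.replace('[^0-9 ]', '', regex=True)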
Output
Out[1]:
0 653051
1 653053
2 90 9785
3 NaN <-- Expected Output (for non-numeric values only)
However, whitespace remains inside the numeric values, and when I use
dataset.column.str.replace(" ", "")
or
dataset.column.str.strip()
it leaves NaN in rows that already held valid values, e.g.
After:
0 NaN <-- Not expected
1 NaN <-- Not expected
2 909785 <-- Expected
3 NaN <-- Expected
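
One possible approach, as a minimal sketch: it assumes the column holds a mix of integers and strings, which would explain the unexpected NaN values, because the `.str` accessor returns NaN for elements that are not strings. The quoting of the two garbled rows is an assumption about how the data was actually built. Casting to `str` first, stripping everything that is not a digit, and then converting with `pd.to_numeric` should give the expected result:

```python
import pandas as pd

# Mixed-type column: two ints and two strings (the quoting is assumed)
dataset = pd.DataFrame(
    [[653051], [653053], ['90 <––9785'], ['<–{uWÕ¨']],
    columns=['column'],
)

# Cast every value to str so the .str accessor does not yield NaN for the ints,
# then drop every character that is not a digit (spaces included)
digits = dataset['column'].astype(str).str.replace(r'[^0-9]', '', regex=True)

# Rows with no digits left are now empty strings; coerce those to NaN
dataset['column'] = pd.to_numeric(digits, errors='coerce')

print(dataset)
```

The resulting column is float because of the NaN; if integers are needed, converting afterwards with a nullable dtype such as `astype('Int64')` may be an option.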