How can I effectively remove non-numeric values from a DataFrame column? Here is a code snippet that removes all non-numeric characters:

In [1]: import pandas as pd
   ...: dataset = pd.DataFrame([[653051], [653053], ['90 <––9785'], ['<–{uWÕ¨']], columns=['column'])

dataset.column = dataset.column.replace('[^0-9 ]', '', regex=True)

Output

Out[1]:
    0           653051
    1           653053
    2           90 9785
    3           NaN      <-- Expected Output (for non-numeric values only) 

However, whitespace is left inside the remaining numeric values, and when I use

dataset.column.str.replace(" ", "")

or

dataset.column.str.strip()

it leaves NaN in fields that already held valid values, e.g.

After:

0           NaN    <-- Not expected 
1           NaN    <-- Not expected 
2           909785 <-- Expected 
3           NaN    <-- Expected 
Alkari
  • what's your expected output? `df['col'] = df['col'].fillna('')` will just contain an empty string. – David Erickson Jul 23 '20 at 18:25
  • I suggest you create a reproducible example using data from `df.head().to_dict()` for instance. I can tell you from the return of `dataset.column.str.strip()` that you have an `object` column, where some of the values (like the first 3) are `int` and the last value is the string `'90 9785'`. Without data that exactly reproduces the type of each value you'll have a pain getting people to reproduce and solve the problem. – ALollz Jul 23 '20 at 18:28
  • @DavidErickson my expected output modifies only the values that contain a space and leaves all other values in the column as they are – Alkari Jul 23 '20 at 18:47
  • @ALollz I'm not sure what you are asking – Alkari Jul 23 '20 at 18:56
  • @Alkari, this will be an easy question to answer if you can provide a minimal reproducible example, as it really depends on what the data looks like: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples Depending on the data, the solutions could be `np.where`, `str.replace`, etc. – David Erickson Jul 23 '20 at 19:09
  • Okay, I understand @DavidErickson – Alkari Jul 23 '20 at 19:20
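
The point raised in the comments about a mixed-type `object` column can be illustrated with a small sketch. The data below is a simplified, ASCII-only stand-in for the question's values (an assumption, since the real column contents were never posted), and the names `stripped`, `digits` and `result` are hypothetical. It shows that the `.str` accessor returns NaN for elements that are not strings, and one possible way around that: cast everything to `str` first, drop the non-digit characters, then convert back to numbers.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: an object column
# that mixes real ints with strings containing stray characters.
dataset = pd.DataFrame({"column": [653051, 653053, "90 <--9785", "<-{uW"]})

# .str methods only operate on str elements; the int elements come back as NaN.
# This reproduces the unexpected NaN values in rows 0 and 1.
stripped = dataset["column"].str.replace(" ", "", regex=False)

# One possible cleanup: cast every element to str, remove every
# non-digit character (spaces included), then convert to numbers.
digits = dataset["column"].astype(str).str.replace(r"[^0-9]", "", regex=True)

# Rows that held no digits are now empty strings and become NaN.
result = pd.to_numeric(digits, errors="coerce")

print(stripped)
print(result)
```

This turns the whole column numeric, so a row with no digits at all ends up as NaN. If the original values must stay untouched unless they contain a space, a conditional approach such as the `np.where` mentioned above may fit better.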

0 Answers