0

I have a dataframe, df

        Size                  Total           ID
    
        110 TB                200 TB          A
        110 G                 300 G           B
        500 A                 700 A           C

Desired output:

        Size                  Total         ID
    
        110                   200           A
        110                   300           B
        500                   700           C

What I am doing:

  df[['Size','Total']] = df[['Size','Total']].apply(lambda x: 
  x.str.strip(' TB', 'G'))

This specifies the exact value, but how would I remove any value that is not a number value? (Just in case I do not know what the specific value is)

Any suggestion is appreciated

Lynn
  • 4,292
  • 5
  • 21
  • 44
  • 1
    `df.stack().str.replace('[A-z]', '').unstack()`? – It_is_Chris Oct 02 '20 at 20:48
  • 3
    Does this answer your question? [How can I remove all non-numeric characters from all the values in a particular column in pandas dataframe?](https://stackoverflow.com/questions/44117326/how-can-i-remove-all-non-numeric-characters-from-all-the-values-in-a-particular) – John Sloper Oct 02 '20 at 20:50

2 Answers2

4

Could you please try following, I have written this on mobile couldn't test it should work though.

import pandas as pd
df.replace(to_replace=r'[^0-9]+', value='', regex=True)

Simply substituting everything apart from digits with null.

RavinderSingh13
  • 130,504
  • 14
  • 57
  • 93
  • Thanks Let me try this now- some of my other columns have other data, will this remove from them too? It’s just the two specific columns size and total that I wish to remove the non numerical values – Lynn Oct 02 '20 at 20:56
  • 1
    This works perfectly fine – Lynn Oct 02 '20 at 21:09
2

Just an extra information, I did like this and got the opposite :D

import pandas as pd
df.replace(to_replace=r'[^a-zA-Z#]', value='', regex=True)

    Size    Total
  0   TB    TB
  1   G      G
  2   A      A

Since you changed your question, I did like this, maybe someone could have a better answer.

df['Size'] = df['Size'].str.replace("[^[^0-9]+", " ") 
df['Total'] = df['Total'].str.replace("[^[^0-9]+", " ") 
df

output:

   Size Total   ID
0   110  200    A
1   100  300    B
2   500  700    C
almo
  • 561
  • 2
  • 16
  • Yes I wanted to add more detail to it- this may put me on the right path- I will try. This works fine as well – Lynn Oct 02 '20 at 21:05