
I have a pandas DataFrame, and I want to reduce one particular column to its numeric characters only.

For example, the column contains rows like this:

4'> delay trip
4/
4'>book flight 'trip
34
4"> book flight delay
4"

How can I strip off all non-numeric characters and have just numeric characters like this:

4
4
4
[3,4]
4
4
JA-pythonista
    Does this answer your question? [How can I remove all non-numeric characters from all the values in a particular column in pandas dataframe?](https://stackoverflow.com/questions/44117326/how-can-i-remove-all-non-numeric-characters-from-all-the-values-in-a-particular) – dspencer Mar 10 '20 at 09:58

1 Answer


You have two different problems here:

  • first, extract the digits from each cell of the column
  • second, keep them as a list when a cell contains more than one digit
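To see the two steps separately, here is a minimal sketch on a small Series. The sample values and the names `s`, `digits`, and `result` are only for illustration and are not part of the original data.

```python
import pandas as pd

# Hypothetical sample data mirroring a few rows from the question.
s = pd.Series(["4'> delay trip", "4/", "34"])

# Step 1: collect every digit in each cell as a list of one-character strings.
digits = s.str.findall(r"\d")

# Step 2: unwrap single-element lists and keep longer lists unchanged.
result = digits.apply(lambda x: x[0] if len(x) == 1 else x)
```

After step 1 every cell holds a list, even when only one digit was found, which is why the second step is needed.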

Just chain both operations:

df[col].str.findall(r'\d').apply(lambda x: x[0] if len(x) == 1 else '' if len(x) == 0 else x)

With your example it gives:

0         4
1         4
2         4
3    [3, 4]
4         4
5         4
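If you want to try this end to end, the following sketch rebuilds the question's rows in a DataFrame. The column name `col` and the helper `unwrap` are assumptions made for the example; `unwrap` only spells out the conditional expression from the one-liner.

```python
import pandas as pd

# Rows copied from the question; the column name "col" is an assumption.
df = pd.DataFrame({"col": [
    "4'> delay trip",
    "4/",
    "4'>book flight 'trip",
    "34",
    '4"> book flight delay',
    '4"',
]})

def unwrap(found):
    """Return '' for no digits, the digit itself for one, else the list."""
    if len(found) == 0:
        return ""
    if len(found) == 1:
        return found[0]
    return found

# r"\d" matches one digit at a time, so "34" yields two separate matches.
out = df["col"].str.findall(r"\d").apply(unwrap)
```

Note that the extracted values are still strings. If you need integers, you would have to convert them in a further step.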
Serge Ballesta