Example: df['column'] has a bunch of values similar to F/4500/O or G/2/P.

The number of digits ranges from 1 to 4, as in the examples above.

How can I transform that column to keep only the digits (e.g. 4500 from F/4500/O) as an integer?

I tried the split method but I can't get it right. Thank you!

Ilyas
  • Does this answer your question? [Splitting a pandas dataframe column by delimiter](https://stackoverflow.com/questions/37333299/splitting-a-pandas-dataframe-column-by-delimiter) – Azhar Khan Oct 20 '22 at 03:34

2 Answers

You could extract the value and convert it with to_numeric:

df['number'] = pd.to_numeric(df['column'].str.extract(r'/(\d+)/', expand=False))

Example:

     column  number
0  F/4500/O    4500
1     G/2/P       2
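
For anyone who wants to run this end to end, here is a minimal sketch. The sample rows `no-slashes` and `None` are made up to show what happens to values that do not match, and the cast to the nullable `Int64` dtype is an optional extra, not part of the answer above.

```python
import pandas as pd

# Made-up sample data: two well-formed rows plus two that will not match.
df = pd.DataFrame({"column": ["F/4500/O", "G/2/P", "no-slashes", None]})

# Capture the digits between the two slashes; non-matching rows become NaN.
extracted = df["column"].str.extract(r"/(\d+)/", expand=False)

# to_numeric gives floats when NaN is present; the nullable Int64 dtype
# keeps the parsed values as integers and the failures as <NA>.
df["number"] = pd.to_numeric(extracted).astype("Int64")
print(df)
```

If every row is guaranteed to match, the `astype("Int64")` step is probably unnecessary, since `to_numeric` alone should already return integers.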
mozway
  • Hmm...what if some of the OP's data (not shown) happens to have more than one numeric value? – Tim Biegeleisen Oct 20 '22 at 03:52
  • @TimBiegeleisen this is not what is suggested in the question, but this would extract the first one, if all needed, then `extractall` + eventually aggregation (but it's a different question) – mozway Oct 20 '22 at 03:53
  • Actually, just using `(\d+)` as regex should be sufficient **if** the format is always `non-digits/digits/non-digits` – mozway Oct 20 '22 at 03:56
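
A rough sketch of the `extractall` plus aggregation idea mentioned in the comments, assuming some rows could hold more than one numeric part. The row `G/2/P/17/Q` is invented for illustration, and collecting the matches into a list is just one possible aggregation.

```python
import pandas as pd

# Made-up sample data; the second row has two numeric parts, the third has none.
s = pd.Series(["F/4500/O", "G/2/P/17/Q", "A/B"])

# extractall returns one row per match, indexed by (original index, match number).
matches = s.str.extractall(r"(\d+)")[0].astype(int)

# Aggregate per original row, here by collecting every number into a list.
numbers = matches.groupby(level=0).agg(list)
print(numbers)
```

Rows without any digits produce no matches, so they are simply absent from the result rather than showing up as NaN.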
How about:

df['column'].map(lambda x: int(x.split('/')[1]))
Igor Rivin
  • I get the 'list index out of range' error. I do have some values with only 2 digits, if that matters. I also added str() to x: map(lambda x: int(str(x).split('/')[1])). I'll edit my post about the range of the digits. – Ilyas Oct 20 '22 at 03:34
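
The error in the comment above most likely comes from rows that contain no `/` at all (including missing values, which `str()` turns into the string `nan`), so the split yields a single piece and `[1]` fails. Below is a tolerant sketch using the vectorized string accessor; the `bad` and `None` rows are made up to reproduce the problem.

```python
import pandas as pd

# Made-up sample data: two well-formed rows plus two that break the lambda.
df = pd.DataFrame({"column": ["F/4500/O", "G/2/P", "bad", None]})

# .str[1] returns NaN instead of raising when a row has fewer than two pieces.
middle = df["column"].str.split("/").str[1]

# errors="coerce" turns anything non-numeric into NaN rather than raising.
df["number"] = pd.to_numeric(middle, errors="coerce").astype("Int64")

# Rows that could not be parsed, handy for inspecting the offending values.
print(df[df["number"].isna()])
```

Printing the unparsed rows is a quick way to find out which values in the real data do not follow the letter/digits/letter pattern.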