
I have a dataframe that contains values like:

|column a|
---------
|3.5M+   |
|100,000 |
|214,123 |
|1.25M+  |

I want to convert values like 3.5M+ to 3,500,000.

I've tried:

regex1 = r'.+M+'
for i in df.a:
    b = re.match(regex1, i)
    if b is not None:
        i = int(np.double(b.string.removesuffix('M+'))*1000000)
    else:
        i = i.replace(',','')

If I add print statements throughout, it looks like it's iterating correctly. Unfortunately, the changes are not saved to the dataframe.

BMM925

1 Answer

>>> import pandas as pd
>>> df = pd.DataFrame({'column_a' : ['3.5M+', '100,000', '214,123', '1.25M+']})
>>> df

    column_a
0   3.5M+
1   100,000
2   214,123
3   1.25M+
>>> df.column_a = df.column_a.str.replace("M+", "*1000000", regex=False).str.replace(",", "", regex=False).apply(eval)
>>> df

    column_a
0   3500000.0
1   100000.0
2   214123.0
3   1250000.0
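
As for why the loop in the question has no effect: inside `for i in df.a`, the assignment `i = ...` only rebinds the loop variable, so nothing is ever written back to the dataframe. If you would rather not call `eval` on the cell contents, a rough alternative is to map a small parsing function over the column and assign the result back. This is only a sketch: `to_number` is a made-up helper name, and it assumes `M+` is the only suffix that appears in the data.

```python
import pandas as pd


def to_number(text: str) -> int:
    """Hypothetical helper: turn '3.5M+' or '100,000' into an integer."""
    cleaned = text.strip().replace(",", "")
    if cleaned.endswith("M+"):
        # Drop the two-character 'M+' suffix and scale by one million.
        # round() absorbs float error, e.g. 1.1 * 1_000_000.
        return round(float(cleaned[:-2]) * 1_000_000)
    return int(cleaned)


df = pd.DataFrame({"column_a": ["3.5M+", "100,000", "214,123", "1.25M+"]})

# Assign the mapped result back to the column; this is the step the
# original loop was missing.
df["column_a"] = df["column_a"].map(to_number)
print(df)
```

With this approach the column ends up holding integers rather than floats, and any value that is neither a plain number nor an `M+` value raises a `ValueError` instead of being evaluated as code.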
Amir saleem