I'm trying to remove the parts of a value that make it a string (the currency symbol and the unit suffix) so that it can be converted to an integer. I also need to take into account the different suffixes in the strings, since they change the scale of the number.
I've tried to put this into a function; here's what I have done:
import numpy as np

def rem(x):
    data = []
    for i in x:
        if "m" in i:
            data.append(i.replace(".00m", '000000'))
        elif "Th" in i:
            data.append(i.replace("Th.", '000'))
    return data
data_array = np.array(['£67.50m', '£63.00m', '£49.50m','£90Th.', '£720Th.'], dtype=object)
rem(data_array)
>['£67.50m', '£63000000', '£49.50m', '£90000', '£720000']
How would I take into account that the digits before the m can be anything from 0-9 (e.g. .50m), not just .00m?
I have tried this in my bigger dataframe but I get the following error:
TypeError: argument of type 'float' is not iterable
I'm assuming this is because the function does not take into account .50m, .20m, ...?
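One thing worth checking here: `"m" in i` raises this TypeError whenever `i` is a float rather than a string, independent of the decimal part. In a pandas column that is usually a missing value (NaN). The sketch below reproduces the error under the assumption that the bigger dataframe contains missing values (the actual data isn't shown, so this is a guess); `values` and `has_suffix` are hypothetical names used only for illustration.

```python
# Hypothetical sample: one entry is NaN, which Python represents as a float.
values = ['£67.50m', float('nan'), '£90Th.']

def has_suffix(value, suffix):
    """Return True only for strings that contain the suffix; non-strings give False."""
    return isinstance(value, str) and suffix in value

error_message = None
try:
    "m" in values[1]  # the same membership test as in rem(), applied to a float
except TypeError as exc:
    error_message = str(exc)  # the TypeError from the question

# Guarding with isinstance avoids the error and simply skips the NaN entry.
flags = [has_suffix(v, "m") for v in values]
```

If this is the cause, an `isinstance(i, str)` check at the top of the loop (or dropping missing values beforehand) should make the error go away.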
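Using @Ptit Xav's suggestion:
import re

def rem(x):
    data = []
    for i in x:
        if "m" in i:
            # Keep only the digits: '£67.50m' -> '6750'.
            # Assumes exactly two decimals, so x10000 scales to millions.
            xi = re.sub(r"[^\d]", "", i)
            data.append(int(xi) * 10000)
        elif "Th" in i:
            # '£90Th.' -> '90', then scale to thousands.
            hi = re.sub(r"[^\d]", "", i)
            data.append(int(hi) * 1000)
    return data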
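The digit-stripping version above relies on every "m" value having exactly two decimals: '£67.5m' would become 675 * 10000 = 6750000 instead of 67500000. A possible variant that parses the number and the suffix separately is sketched below. It assumes "m" means millions and "Th." means thousands (as the replacements in the question suggest), and `to_int`, `MULTIPLIERS` and `PATTERN` are hypothetical names, not part of any library.

```python
import re
from decimal import Decimal

# Assumed scale for each suffix seen in the question's data.
MULTIPLIERS = {"m": 1_000_000, "Th.": 1_000}

# Any leading non-digits (e.g. '£'), a number with optional decimals, then a suffix.
PATTERN = re.compile(r"^\D*(\d+(?:\.\d+)?)(m|Th\.)$")

def to_int(value):
    """Convert e.g. '£67.50m' to 67500000; return None for anything unparseable."""
    if not isinstance(value, str):  # covers NaN and other non-string entries
        return None
    match = PATTERN.match(value.strip())
    if match is None:
        return None
    number, suffix = match.groups()
    # Decimal avoids binary floating-point rounding when scaling.
    return int(Decimal(number) * MULTIPLIERS[suffix])

sample = ['£67.50m', '£63.00m', '£49.50m', '£90Th.', '£720Th.', float('nan')]
converted = [to_int(v) for v in sample]
# converted == [67500000, 63000000, 49500000, 90000, 720000, None]
```

Returning None for unparseable entries is a design choice for this sketch; raising an error or keeping the original value may suit the real dataframe better.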