I'm working with a DataFrame where I want to change entries in the Country column, e.g.:
'Bolivia (Plurinational State of)' should be 'Bolivia',
'Switzerland17' should be 'Switzerland'
I have defined the following function:
def process(w):
    for i in range(len(w)):
        if w[i] in ['(', ')', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '&', '/']:
            w = w[0:i]
            w = ''.join(w).replace(" ", "")
            break
    return w
which I then applied to the column using the pandas apply method:
energy['Country'] = energy['Country'].apply(process)
This mostly gives the desired output, but not entirely. Some entries, like
'United Kingdom of Great Britain and Northern Ireland' and 'United States of America20', have changed to 'UnitedKingdomofGreatBritainandNorthernIreland' and 'UnitedStatesofAmerica'.
What am I doing wrong? Also, what would be a more effective, concise way to achieve the result?
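For reference, here is a minimal sketch of one way the cleanup could be written. The spaces disappear because of the .replace(" ", "") call, which removes every space in the truncated name rather than only the trailing one; trimming just the ends with .strip() avoids that. The function name clean_country and the sample strings are illustrative assumptions, not part of the original code.

```python
import re

# Same cut characters as the list in process(): parentheses, digits, "&" and "/".
CUT_CHARS = re.compile(r"[()0-9&/]")


def clean_country(name):
    # Keep the text before the first cut character (or all of it if none occurs),
    # then strip whitespace from the ends only, so inner spaces survive.
    return CUT_CHARS.split(name, maxsplit=1)[0].strip()


# Hypothetical sample values, taken from the examples in the question.
samples = [
    "Bolivia (Plurinational State of)",
    "Switzerland17",
    "United States of America20",
]
cleaned = [clean_country(s) for s in samples]
print(cleaned)  # ['Bolivia', 'Switzerland', 'United States of America']
```

Assuming the same energy DataFrame, this would slot into the existing call as energy['Country'] = energy['Country'].apply(clean_country). A vectorized variant along the lines of energy['Country'].str.replace(r"[()0-9&/].*", "", regex=True).str.strip() should behave the same way for single-line names.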