Suppose I have strings such as 'Bank of China', 'Embassy of China', and 'International China'.
I want to remove every country name except where it is preceded by 'of ' or 'of the '.
Clearly this can be done by iterating through a list of countries, checking whether the name contains a country, and then checking whether 'of ' or 'of the ' appears directly before it.
If it does, we keep the country; otherwise we remove it. The examples become:
'Bank of China', 'Embassy of China', and 'International'
However, iteration can be slow, particularly with a large list of countries and a large list of texts to process.
Is there a faster, more conditional way of doing the replacement, so that I can still use a simple pattern match with the Python re library?
My function is along these lines:
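    import re

    # `countries` is a list of country names defined elsewhere.
    def removeCountry(name):
        for country in countries:
            if country in name:
                # Keep the country when it follows 'of ' or 'of the '.
                if 'of ' + country in name:
                    return name
                if 'of the ' + country in name:
                    return name
                else:
                    # Otherwise strip the country from the end of the name.
                    name = re.sub(re.escape(country) + '$', '', name).strip()
                    return name
        return name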
EDIT: I did find some info here. This does describe how to do an if, but what I really want is: if not preceded by 'of ' and not preceded by 'of the ', then replace.
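To make the "if not 'of ' and not 'of the ' then replace" idea concrete, here is a rough sketch of what I have in mind using negative lookbehinds in a single compiled pattern. The `countries` sample list and the `remove_country` name are placeholders for illustration, not my real data. Unlike my function above, this removes the country anywhere in the string, not only at the end.

```python
import re

# Hypothetical sample list; the real `countries` list is much larger.
countries = ["China", "United States"]

# Try longer names first so a short name never wins over a longer one it prefixes.
alternation = "|".join(
    re.escape(c) for c in sorted(countries, key=len, reverse=True)
)

# Python's re only accepts fixed-width lookbehinds, so 'of ' and 'of the '
# each get their own negative lookbehind instead of one with alternation.
pattern = re.compile(r"(?<!\bof )(?<!\bof the )\b(?:" + alternation + r")\b")

def remove_country(name):
    # Remove every country not preceded by 'of ' or 'of the ',
    # then collapse any leftover runs of whitespace.
    return " ".join(pattern.sub("", name).split())
```

With this, the pattern is built once and each text needs only a single `sub` call, rather than one pass per country. Whether that is actually faster for my data would need measuring.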