I have a string in Python like this:
u'\u200cHealth & Fitness'
How can I remove the
\u200c
part from the string?
You can encode it to ASCII
and ignore encoding errors:
u'\u200cHealth & Fitness'.encode('ascii', 'ignore')
Output (Python 2; in Python 3 the result is the bytes object b'Health & Fitness'):
'Health & Fitness'
If you have a string that contains a Unicode
character, like
s = "Airports Council International \u2013 North America"
then you can try:
newString = (s.encode('ascii', 'ignore')).decode("utf-8")
and the output will be (two spaces remain where the dash was removed):
Airports Council International  North America
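A minimal sketch of the same round trip in Python 3, assuming the goal is to drop every non-ASCII character. The helper name strip_non_ascii is hypothetical. Keep in mind that this also drops accented letters and any other non-ASCII text, and that the spaces on either side of the removed dash survive, so a whitespace cleanup step may be wanted afterwards.

```python
def strip_non_ascii(text):
    """Drop every character that cannot be encoded as ASCII."""
    # encode() returns bytes in Python 3, so decode back to str.
    return text.encode("ascii", "ignore").decode("ascii")


s = "Airports Council International \u2013 North America"
cleaned = strip_non_ascii(s)

# Collapse the double space left behind by the removed dash.
normalized = " ".join(cleaned.split())
print(normalized)
```

If only one known character has to go, a targeted removal is usually safer than discarding everything outside ASCII.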
I just use replace, because I don't need that character:
varstring.replace('\u200c', '')
Or in your case:
u'\u200cHealth & Fitness'.replace('\u200c', '')
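Note that replace removes every occurrence of the character, not only a leading one. If several invisible characters may show up, one possible approach (a sketch, not the only way) is str.translate with a deletion table. The set of characters in ZERO_WIDTH and the helper name remove_zero_width are assumptions made for illustration. Adjust them to your data, since the zero width non-joiner is meaningful in some scripts.

```python
# Hypothetical set of invisible characters to delete; adjust as needed.
ZERO_WIDTH = {
    ord("\u200b"): None,  # zero width space
    ord("\u200c"): None,  # zero width non-joiner
    ord("\u200d"): None,  # zero width joiner
    ord("\ufeff"): None,  # zero width no-break space (byte order mark)
}


def remove_zero_width(text):
    """Delete every character listed in ZERO_WIDTH, wherever it occurs."""
    return text.translate(ZERO_WIDTH)


print(remove_zero_width("\u200cHealth\u200b & Fitness"))
```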
For me, the following worked:
mystring.encode('ascii', 'ignore').decode('unicode_escape')
In the specific case in the question, where the string is prefixed with a single u'\u200c'
character, the solution is as simple as taking a slice that excludes the first character:
original = u'\u200cHealth & Fitness'
fixed = original[1:]
If the leading character may or may not be present, str.lstrip may be used (it strips every leading occurrence of the character):
original = u'\u200cHealth & Fitness'
fixed = original.lstrip(u'\u200c')
The same solutions will work in Python 3. From Python 3.9, str.removeprefix is also available:
original = u'\u200cHealth & Fitness'
fixed = original.removeprefix(u'\u200c')
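The two approaches differ when the character is repeated: lstrip removes all leading copies, while removeprefix removes at most one. For interpreters older than 3.9, a small fallback can be sketched as follows. The helper name remove_prefix is hypothetical and not part of the standard library.

```python
def remove_prefix(text, prefix):
    """Remove one leading copy of prefix, if present (fallback for Python < 3.9)."""
    if prefix and text.startswith(prefix):
        return text[len(prefix):]
    return text


doubled = "\u200c\u200cHealth & Fitness"

# Only the first of the two leading characters is removed.
once = remove_prefix(doubled, "\u200c")

# Every leading occurrence is removed.
stripped = doubled.lstrip("\u200c")

print(repr(once))
print(repr(stripped))
```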