I need to remove emojis from some strings using a Python script. I found that someone had already asked this question, and one of the answers was accepted, saying that the following code would do the trick:
#!/usr/bin/env python
import re
text = u'This dog \U0001f602'
print(text) # with emoji
emoji_pattern = re.compile("["
    u"\U0001F600-\U0001F64F"  # emoticons
    u"\U0001F300-\U0001F5FF"  # symbols & pictographs
    u"\U0001F680-\U0001F6FF"  # transport & map symbols
    u"\U0001F1E0-\U0001F1FF"  # flags (iOS)
    "]+", flags=re.UNICODE)
print(emoji_pattern.sub(r'', text)) # no emoji
I inserted this code into my script, changing it only so that it acts on the strings in my code rather than the sample text. When I run the script, though, I get some errors I don't understand:
Traceback (most recent call last):
File "SCRIPT.py", line 31, in get_tweets
"]+", flags=re.UNICODE)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework /Versions/2.7/lib/python2.7/re.py", line 194, in compile
return _compile(pattern, flags)
File "/usr/local/Cellar/python/2.7.12/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 251, in _compile
raise error, v # invalid expression
sre_constants.error: bad character range
I understand what the error is saying, but since I took this code from Stack Exchange, I can't figure out why it apparently worked for the people in that discussion but not for me. I'm using Python 2.7, if that helps. Thank you!
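In case the interpreter build matters (that's only a guess on my part), here is a small check I can run and add to my question:
# Sketch of a diagnostic, assuming the way my interpreter stores
# characters outside the Basic Multilingual Plane could be relevant.
import sys
print(sys.maxunicode)      # 1114111 on a "wide" Unicode build, 65535 on a "narrow" one
print(len(u'\U0001f602'))  # 1 on a wide build, 2 (a surrogate pair) on a narrow one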