How can I replace characters, such as emojis, that cannot be handled by a UTF8 MySQL DB?
The key is to remove ONLY those characters that cannot be handled. I got this code from an answer on removing emojis from a string in Python, but it removes too much. (EDIT: The code below comes from the page "remove unicode emoji using re in python".)
import re

# Matches surrogate pairs for common emoji blocks, plus U+2600-U+27BF
myre = re.compile(u'('
u'\ud83c[\udf00-\udfff]|'
u'\ud83d[\udc00-\ude4f\ude80-\udeff]|'
u'[\u2600-\u26FF\u2700-\u27BF])+',
re.UNICODE)
my_text = myre.sub(r'EMOJI', my_text)
For example, this heart symbol ♥ (U+2665) can be saved to the DB, but is caught by the above regexp because it falls in the [\u2600-\u26FF] range.
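One possible approach is sketched below. It assumes the column uses MySQL's legacy 3-byte `utf8` character set, which can only store characters in the Basic Multilingual Plane (code points up to U+FFFF), and it assumes Python 3, where a string is a sequence of code points rather than surrogate pairs. Under those assumptions, only characters above U+FFFF need replacing, so ♥ is left alone. The names `NON_BMP` and `replace_non_bmp` are hypothetical and chosen for illustration.

```python
import re

# Assumption: the DB column uses MySQL's 3-byte `utf8` charset, so only
# code points above U+FFFF (4 bytes in UTF-8) cannot be stored.
# The trailing `+` collapses a run of such characters into one replacement,
# mirroring the `+` in the original pattern.
NON_BMP = re.compile('[\U00010000-\U0010FFFF]+')

def replace_non_bmp(text, replacement='EMOJI'):
    """Replace each run of non-BMP characters with `replacement`."""
    return NON_BMP.sub(replacement, text)

my_text = 'I \u2665 you \U0001F600'  # heart (U+2665) plus a 4-byte emoji
my_text = replace_non_bmp(my_text)
```

On a narrow Python 2 build, as the `\ud83c`-style escapes in the original code suggest, the equivalent pattern would probably have to match surrogate pairs instead, for example `u'[\ud800-\udbff][\udc00-\udfff]'`. If changing the schema is an option, switching the column to `utf8mb4` would avoid the replacement altogether.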