How can I read/interpret the emoticons/Unicode characters in a string?
I am creating a CSV export of a data-grid, and would like to create a library of string representations of Twitter emoticons. I would like to replace the emoticon with its string representation.
This is an example of a string:
😂😂😂 Absa!!!!
This is what the CSV version looks like:
😂😂😂 Absa!!!!
I would like to render it something like this:
(FACE WITH TEARS OF JOY) (FACE WITH TEARS OF JOY) (FACE WITH TEARS OF JOY) Absa!!!!
I got the details of the Unicode, Bytes (UTF-8) and emoticons from this site: http://apps.timwhitlock.info/emoji/tables/unicode
😂 = U+1F602 \xF0\x9F\x98\x82 FACE WITH TEARS OF JOY
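To relate the columns in that table to what JavaScript sees, here is a small sketch showing that the code point, the UTF-8 bytes, and the \uXXXX escapes used in regexes are three views of the same character. The variable names are only illustrative, and it assumes an environment where TextEncoder is available (a modern browser or a recent Node.js):

```javascript
// Build the character from its code point (U+1F602).
const joy = String.fromCodePoint(0x1F602);

// Code point, formatted the way the table lists it.
const codePoint = 'U+' + joy.codePointAt(0).toString(16).toUpperCase();

// UTF-8 bytes, formatted as \xNN escapes.
const utf8Bytes = Array.from(
  new TextEncoder().encode(joy),
  b => '\\x' + b.toString(16).toUpperCase().padStart(2, '0')
).join('');

// UTF-16 code units: what a JavaScript string actually stores,
// and what \uXXXX-style regexes match against.
const utf16Units = Array.from(
  { length: joy.length },
  (_, i) => '\\u' + joy.charCodeAt(i).toString(16).padStart(4, '0')
).join('');

console.log(codePoint);   // U+1F602
console.log(utf8Bytes);   // \xF0\x9F\x98\x82
console.log(utf16Units);  // \ud83d\ude02
```

So a single emoji outside the Basic Multilingual Plane takes up two UTF-16 code units in a JavaScript string, which is why the removal regexes are written in terms of surrogate pairs.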
I don't even know where to start! I assume I need a regex with a bunch of if statements? If an emoticon matches the regex, it gets replaced with its text version.
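A minimal sketch of that idea, assuming a lookup table instead of if statements. The names EMOJI_NAMES and describeEmoji are hypothetical, and the table only has three entries for illustration; a real one would have to be generated from a full emoji list:

```javascript
// Hypothetical lookup table: emoji -> Unicode name (only a few entries shown).
const EMOJI_NAMES = new Map([
  ['\u{1F602}', 'FACE WITH TEARS OF JOY'],
  ['\u{1F60D}', 'SMILING FACE WITH HEART-SHAPED EYES'],
  ['\u2764', 'HEAVY BLACK HEART'],
]);

// Alternation of all known emoji. None of these keys contain regex
// metacharacters; keys that do (such as '#' or '*' keycaps) would need escaping.
const ONE_EMOJI = [...EMOJI_NAMES.keys()].join('|');

// The 'u' flag makes the regex treat a surrogate pair as one code point.
const EMOJI_RUN = new RegExp(`(?:${ONE_EMOJI})+`, 'gu');
const EMOJI_ONE = new RegExp(ONE_EMOJI, 'gu');

// Replace each run of known emoji with "(NAME) (NAME) ...".
function describeEmoji(text) {
  return text.replace(EMOJI_RUN, run =>
    run
      .match(EMOJI_ONE)
      .map(emoji => `(${EMOJI_NAMES.get(emoji)})`)
      .join(' ')
  );
}

console.log(describeEmoji('\u{1F602}\u{1F602}\u{1F602} Absa!!!!'));
// (FACE WITH TEARS OF JOY) (FACE WITH TEARS OF JOY) (FACE WITH TEARS OF JOY) Absa!!!!
```

Matching whole runs first is just one way to get a single space between consecutive names; emoji that are not in the table are left untouched.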
I have found a bunch of useful posts about removing emoticons, but none about replacing them. Here is one such example:
/(?:[\u2700-\u27bf]|(?:\ud83c[\udde6-\uddff]){2}|[\ud800-\udbff][\udc00-\udfff]|[\u0023-\u0039]\ufe0f?\u20e3|\u3299|\u3297|\u303d|\u3030|\u24c2|\ud83c[\udd70-\udd71]|\ud83c[\udd7e-\udd7f]|\ud83c\udd8e|\ud83c[\udd91-\udd9a]|\ud83c[\udde6-\uddff]|\ud83c[\ude01-\ude02]|\ud83c\ude1a|\ud83c\ude2f|\ud83c[\ude32-\ude3a]|\ud83c[\ude50-\ude51]|\u203c|\u2049|[\u25aa-\u25ab]|\u25b6|\u25c0|[\u25fb-\u25fe]|\u00a9|\u00ae|\u2122|\u2139|\ud83c\udc04|[\u2600-\u26FF]|\u2b05|\u2b06|\u2b07|\u2b1b|\u2b1c|\u2b50|\u2b55|\u231a|\u231b|\u2328|\u23cf|[\u23e9-\u23f3]|[\u23f8-\u23fa]|\ud83c\udccf|\u2934|\u2935|[\u2190-\u21ff])/g
There are a bunch of other useful answers in the same post: How to remove emoji code using javascript?
I would appreciate any feedback, input, or suggestions!
Thank you!