22

I have a txt file of a conversation exported from WhatsApp. WhatsApp supports emoticons in conversations, and to my surprise the exported conversation contains these emoticons too! That is, if I open the text file in a text editor (TextWrangler on Mac OS X 10.8) I can see the emoticons. The text file is encoded in UTF-8 and, as far as I can tell, there are no resources associated with the file.

Can anyone explain to me how these emoticons are included in the text file and how the text editor interprets them accurately? Is this related to the character encoding at all? Are extra resources included with the text file?

Sean Connolly

1 Answer

27

Unicode contains sections which specify emoji as "characters". They're regular characters; you only need a font which can display them. Also see the Unicode Emoji FAQ.

In a text file, characters are basically encoded as numbers in the form of bytes. To display those visually on a computer screen you need a font which contains the visual glyph to render this character. Since the process is always numeric identifier → font → visible glyph, it should be pretty obvious that a "character" can be anything visual, including emoji or any other image.
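To make the "numeric identifier" part concrete, here is a minimal Python 3 sketch. It assumes the emoji in the file is U+1F600 (picked purely for illustration; it's the code point mentioned in the comments) and shows the code point, its Unicode name, and the bytes UTF-8 stores for it. Nothing here is specific to WhatsApp or to any particular editor.

```python
import unicodedata

# Assumption for illustration: the emoji in question is U+1F600.
emoji = "\U0001F600"

# The "numeric identifier" of the character is its Unicode code point.
code_point = ord(emoji)

# In a UTF-8 file, that code point is stored as a sequence of bytes.
utf8_bytes = emoji.encode("utf-8")

print(f"U+{code_point:04X}")                      # U+1F600
print(unicodedata.name(emoji))                    # GRINNING FACE
print(" ".join(f"{b:02x}" for b in utf8_bytes))   # f0 9f 98 80

# Decoding the bytes yields the same character again; which picture
# appears on screen depends only on the font used to render it.
print(utf8_bytes.decode("utf-8") == emoji)        # True
```

So the text file holds nothing but those four bytes; the editor decodes them as UTF-8 and asks a font for the matching glyph.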

[Image: character viewer]

deceze
  • 1
    Nice explanation. I’d add that when emoticons are encoded using standard Unicode codepoints, like U+1F600, you can use any font that contains them. Sometimes Private Use codepoints are used, and then you need a very specific font that has the emoticons in those “privately agreed” codepoints. – Jukka K. Korpela Sep 30 '13 at 11:11
  • 2
    A few ways to improve this answer: 1) Where did you get this chart? 2) Add an example of how to use this chart to insert an emoji / symbol. – ahnbizcad Aug 25 '16 at 01:36