Someone texted me this character: ◡̈. They were using it as a smiley emoji, and depending on where it's rendered, it looks either like two dots over a curve or two dots to the right of a curve.

When I saw it, I wondered if it came from some other language, so I opened Wikipedia's list of Unicode characters and searched for it. There were zero matches. Yet I know it must exist somehow, since I'm able to use it.

Now here's the next thing: when I pasted the character into the search box in Chrome to string-search the Wikipedia page, it behaved as a single character. But when I hit backspace, only the dots were removed, leaving the semicircle, which immediately matched U+25E1, "lower half circle."

So what is this? An umlaut over a lower half circle character? How does that work? Can you add accent marks to any arbitrary Unicode character? If so, how? And how does that work with encoding?

temporary_user_name

1 Answer

The character copied from the title (◡̈) is actually two code points, U+25E1 and U+0308, which are encoded in UTF-8 as 0xE2 0x97 0xA1 and 0xCC 0x88 respectively.

  • U+25E1 is LOWER HALF CIRCLE.
  • U+0308 is COMBINING DIAERESIS.
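
As a quick check, the two code points can be inspected with Python's standard `unicodedata` module. This is a minimal sketch; the helper name `describe` is made up for illustration and is not part of any library.

```python
import unicodedata

def describe(text: str) -> list[tuple[str, str, str]]:
    """Return (code point, UTF-8 bytes, Unicode name) for each code point in text."""
    rows = []
    for ch in text:
        utf8 = " ".join(f"0x{b:02X}" for b in ch.encode("utf-8"))
        rows.append((f"U+{ord(ch):04X}", utf8, unicodedata.name(ch)))
    return rows

# The string from the question: a base character followed by a combining mark.
smiley = "\u25E1\u0308"

for code_point, utf8, name in describe(smiley):
    print(f"{code_point}  {utf8:<16}{name}")
```

The string has a length of 2 in Python, because `len` counts code points rather than what is drawn on screen.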

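The combining diaeresis attaches to the character that precedes it, so it can follow any base character, not just letters. It's not obvious why the pair is rendered differently in different places; presumably that depends on the font and the rendering engine. The Unicode Standard, Chapter 2: General Structure (p. 21), says: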

Combining characters (such as accents) are stored following the base character to which they apply, but are positioned relative to that base character and thus do not follow a simple linear progression in the final rendered text.
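
To illustrate that a combining mark can follow an arbitrary base character, here is a small sketch using Python's `unicodedata` module. The helpers `with_diaeresis` and `nfc_length` are hypothetical names introduced only for this example.

```python
import unicodedata

DIAERESIS = "\u0308"  # COMBINING DIAERESIS

def with_diaeresis(base: str) -> str:
    """Store the combining mark directly after its base character."""
    return base + DIAERESIS

def nfc_length(text: str) -> int:
    """Number of code points left after NFC (composing) normalization."""
    return len(unicodedata.normalize("NFC", text))

print(nfc_length(with_diaeresis("a")))       # 1: composes to U+00E4 (ä)
print(nfc_length(with_diaeresis("\u25E1")))  # 2: no precomposed form exists
print(unicodedata.combining(DIAERESIS))      # 230: non-zero means a combining mark
```

Because no single precomposed code point exists for a lower half circle with a diaeresis, searching a list of Unicode characters for it finds nothing. It would also be consistent with the backspace behaviour in the question if the browser deletes one code point at a time, though that is an assumption about the browser, not something the standard mandates.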

You can find plenty of examples of 'Zalgo' text on SO; they are mighty piles of combining characters:

 

@̮̘̮̜̤͓͓̓ͪ̓͆͗̑Ṷ̫̠̤̙̻͚̗ͭs̹͓̰̫͉̲̺̈̏̽̅̑ͩ̇̓̉e͖̝̦̦̿r͔̒̿̋̂̓n̹͖̥ͥͦͤ̍͊̏ä͇͖͚͖̃̎͊m̭͇̂͆͋̋͒e̫̠͇̰̱̦̹͗͋̓̿͒ ͔͖̫̬̗̪̪̳ͧ̄ͫB̜̥̣̬̮͈͒̄ͪ͊l̮͉̣̟̪̪̿̍ͫ͋͐̑a̜̦̪͗͗̈́ͣ͊ḫ̘̯͈̠̞͒ͯ ̣͕͚̗̠͖̫̆͌͒̓͛b̖̣͇̖̦̃̑ͬͭͥl͔͍͚͕̲̪̼͎ͧ̇̏ạ̖̪͚̯̊ͤͣͦͮ̌h̘͓͔̟͔͍̏ͣͦ̓̓ ̫̼̫ͮ͌̄ͤ̿̈͆b̙͍̼̜͍̹̬̬͎ͥ̓ͯ̂ḽ̜̟̲̾̅̆ͦ̃ͨa͇̰̝̺͊ͧͫ͛h̯̻͉̉̒̉̈́́ͥ̀.̖̩̭͇̭͔̹̈́̇͐ͬͦͦͨ̾̇.͍̪̣͂ͬ.̞͍̥̪̺̤̣̜͆ͫ̈́͑ͦ͂͑͑

 

The listing below shows the UTF-8 bytes and Unicode code points of the Zalgo text, 4 code points per line, reading across the page and then down. For example, the first code point, U+0040 COMMERCIAL AT, is followed by 13 combining characters, the first of which is U+032E COMBINING BREVE BELOW.

0x40 = U+0040            0xCC 0xAE = U+032E       0xCC 0x98 = U+0318       0xCC 0xAE = U+032E
0xCC 0x9C = U+031C       0xCC 0xA4 = U+0324       0xCD 0x93 = U+0353       0xCD 0x93 = U+0353
0xCC 0x93 = U+0313       0xCD 0xAA = U+036A       0xCC 0x93 = U+0313       0xCD 0x86 = U+0346
0xCD 0x97 = U+0357       0xCC 0x91 = U+0311       0xE1 0xB9 0xB6 = U+1E76  0xCC 0xAB = U+032B
0xCC 0xA0 = U+0320       0xCC 0xA4 = U+0324       0xCC 0x99 = U+0319       0xCC 0xBB = U+033B
0xCD 0x9A = U+035A       0xCC 0x97 = U+0317       0xCD 0xAD = U+036D       0x73 = U+0073
0xCC 0xB9 = U+0339       0xCD 0x93 = U+0353       0xCC 0xB0 = U+0330       0xCC 0xAB = U+032B
0xCD 0x89 = U+0349       0xCC 0xB2 = U+0332       0xCC 0xBA = U+033A       0xCC 0x88 = U+0308
0xCC 0x8F = U+030F       0xCC 0xBD = U+033D       0xCC 0x85 = U+0305       0xCC 0x91 = U+0311
0xCD 0xA9 = U+0369       0xCC 0x87 = U+0307       0xCC 0x93 = U+0313       0xCC 0x89 = U+0309
0x65 = U+0065            0xCD 0x96 = U+0356       0xCC 0x9D = U+031D       0xCC 0xA6 = U+0326
0xCC 0xA6 = U+0326       0xCC 0xBF = U+033F       0x72 = U+0072            0xCD 0x94 = U+0354
0xCC 0x92 = U+0312       0xCC 0xBF = U+033F       0xCC 0x8B = U+030B       0xCC 0x82 = U+0302
0xCC 0x93 = U+0313       0x6E = U+006E            0xCC 0xB9 = U+0339       0xCD 0x96 = U+0356
0xCC 0xA5 = U+0325       0xCD 0xA5 = U+0365       0xCD 0xA6 = U+0366       0xCD 0xA4 = U+0364
0xCC 0x8D = U+030D       0xCD 0x8A = U+034A       0xCC 0x8F = U+030F       0xC3 0xA4 = U+00E4
0xCD 0x87 = U+0347       0xCD 0x96 = U+0356       0xCD 0x9A = U+035A       0xCD 0x96 = U+0356
0xCC 0x83 = U+0303       0xCC 0x8E = U+030E       0xCD 0x8A = U+034A       0x6D = U+006D
0xCC 0xAD = U+032D       0xCD 0x87 = U+0347       0xCC 0x82 = U+0302       0xCD 0x86 = U+0346
0xCD 0x8B = U+034B       0xCC 0x8B = U+030B       0xCD 0x92 = U+0352       0x65 = U+0065
0xCC 0xAB = U+032B       0xCC 0xA0 = U+0320       0xCD 0x87 = U+0347       0xCC 0xB0 = U+0330
0xCC 0xB1 = U+0331       0xCC 0xA6 = U+0326       0xCC 0xB9 = U+0339       0xCD 0x97 = U+0357
0xCD 0x8B = U+034B       0xCC 0x93 = U+0313       0xCC 0xBF = U+033F       0xCD 0x92 = U+0352
0x20 = U+0020            0xCD 0x94 = U+0354       0xCD 0x96 = U+0356       0xCC 0xAB = U+032B
0xCC 0xAC = U+032C       0xCC 0x97 = U+0317       0xCC 0xAA = U+032A       0xCC 0xAA = U+032A
0xCC 0xB3 = U+0333       0xCD 0xA7 = U+0367       0xCC 0x84 = U+0304       0xCD 0xAB = U+036B
0x42 = U+0042            0xCC 0x9C = U+031C       0xCC 0xA5 = U+0325       0xCC 0xA3 = U+0323
0xCC 0xAC = U+032C       0xCC 0xAE = U+032E       0xCD 0x88 = U+0348       0xCD 0x92 = U+0352
0xCC 0x84 = U+0304       0xCD 0xAA = U+036A       0xCD 0x8A = U+034A       0x6C = U+006C
0xCC 0xAE = U+032E       0xCD 0x89 = U+0349       0xCC 0xA3 = U+0323       0xCC 0x9F = U+031F
0xCC 0xAA = U+032A       0xCC 0xAA = U+032A       0xCC 0xBF = U+033F       0xCC 0x8D = U+030D
0xCD 0xAB = U+036B       0xCD 0x8B = U+034B       0xCD 0x90 = U+0350       0xCC 0x91 = U+0311
0x61 = U+0061            0xCC 0x9C = U+031C       0xCC 0xA6 = U+0326       0xCC 0xAA = U+032A
0xCD 0x97 = U+0357       0xCD 0x97 = U+0357       0xCC 0x88 = U+0308       0xCC 0x81 = U+0301
0xCD 0xA3 = U+0363       0xCD 0x8A = U+034A       0xE1 0xB8 0xAB = U+1E2B  0xCC 0x98 = U+0318
0xCC 0xAF = U+032F       0xCD 0x88 = U+0348       0xCC 0xA0 = U+0320       0xCC 0x9E = U+031E
0xCD 0x92 = U+0352       0xCD 0xAF = U+036F       0x20 = U+0020            0xCC 0xA3 = U+0323
0xCD 0x95 = U+0355       0xCD 0x9A = U+035A       0xCC 0x97 = U+0317       0xCC 0xA0 = U+0320
0xCD 0x96 = U+0356       0xCC 0xAB = U+032B       0xCC 0x86 = U+0306       0xCD 0x8C = U+034C
0xCD 0x92 = U+0352       0xCC 0x93 = U+0313       0xCD 0x9B = U+035B       0x62 = U+0062
0xCC 0x96 = U+0316       0xCC 0xA3 = U+0323       0xCD 0x87 = U+0347       0xCC 0x96 = U+0316
0xCC 0xA6 = U+0326       0xCC 0x83 = U+0303       0xCC 0x91 = U+0311       0xCD 0xAC = U+036C
0xCD 0xAD = U+036D       0xCD 0xA5 = U+0365       0x6C = U+006C            0xCD 0x94 = U+0354
0xCD 0x8D = U+034D       0xCD 0x9A = U+035A       0xCD 0x95 = U+0355       0xCC 0xB2 = U+0332
0xCC 0xAA = U+032A       0xCC 0xBC = U+033C       0xCD 0x8E = U+034E       0xCD 0xA7 = U+0367
0xCC 0x87 = U+0307       0xCC 0x8F = U+030F       0xE1 0xBA 0xA1 = U+1EA1  0xCC 0x96 = U+0316
0xCC 0xAA = U+032A       0xCD 0x9A = U+035A       0xCC 0xAF = U+032F       0xCC 0x8A = U+030A
0xCD 0xA4 = U+0364       0xCD 0xA3 = U+0363       0xCD 0xA6 = U+0366       0xCD 0xAE = U+036E
0xCC 0x8C = U+030C       0x68 = U+0068            0xCC 0x98 = U+0318       0xCD 0x93 = U+0353
0xCD 0x94 = U+0354       0xCC 0x9F = U+031F       0xCD 0x94 = U+0354       0xCD 0x8D = U+034D
0xCC 0x8F = U+030F       0xCD 0xA3 = U+0363       0xCD 0xA6 = U+0366       0xCC 0x93 = U+0313
0xCC 0x93 = U+0313       0x20 = U+0020            0xCC 0xAB = U+032B       0xCC 0xBC = U+033C
0xCC 0xAB = U+032B       0xCD 0xAE = U+036E       0xCD 0x8C = U+034C       0xCC 0x84 = U+0304
0xCD 0xA4 = U+0364       0xCC 0xBF = U+033F       0xCC 0x88 = U+0308       0xCD 0x86 = U+0346
0x62 = U+0062            0xCC 0x99 = U+0319       0xCD 0x8D = U+034D       0xCC 0xBC = U+033C
0xCC 0x9C = U+031C       0xCD 0x8D = U+034D       0xCC 0xB9 = U+0339       0xCC 0xAC = U+032C
0xCC 0xAC = U+032C       0xCD 0x8E = U+034E       0xCD 0xA5 = U+0365       0xCC 0x93 = U+0313
0xCD 0xAF = U+036F       0xCC 0x82 = U+0302       0xE1 0xB8 0xBD = U+1E3D  0xCC 0x9C = U+031C
0xCC 0x9F = U+031F       0xCC 0xB2 = U+0332       0xCC 0xBE = U+033E       0xCC 0x85 = U+0305
0xCC 0x86 = U+0306       0xCD 0xA6 = U+0366       0xCC 0x83 = U+0303       0xCD 0xA8 = U+0368
0x61 = U+0061            0xCD 0x87 = U+0347       0xCC 0xB0 = U+0330       0xCC 0x9D = U+031D
0xCC 0xBA = U+033A       0xCD 0x8A = U+034A       0xCD 0xA7 = U+0367       0xCD 0xAB = U+036B
0xCD 0x9B = U+035B       0x68 = U+0068            0xCC 0xAF = U+032F       0xCC 0xBB = U+033B
0xCD 0x89 = U+0349       0xCC 0x89 = U+0309       0xCC 0x92 = U+0312       0xCC 0x89 = U+0309
0xCC 0x88 = U+0308       0xCC 0x81 = U+0301       0xCC 0x81 = U+0301       0xCD 0xA5 = U+0365
0xCC 0x80 = U+0300       0x2E = U+002E            0xCC 0x96 = U+0316       0xCC 0xA9 = U+0329
0xCC 0xAD = U+032D       0xCD 0x87 = U+0347       0xCC 0xAD = U+032D       0xCD 0x94 = U+0354
0xCC 0xB9 = U+0339       0xCC 0x88 = U+0308       0xCC 0x81 = U+0301       0xCC 0x87 = U+0307
0xCD 0x90 = U+0350       0xCD 0xAC = U+036C       0xCD 0xA6 = U+0366       0xCD 0xA6 = U+0366
0xCD 0xA8 = U+0368       0xCC 0xBE = U+033E       0xCC 0x87 = U+0307       0x2E = U+002E
0xCD 0x8D = U+034D       0xCC 0xAA = U+032A       0xCC 0xA3 = U+0323       0xCD 0x82 = U+0342
0xCD 0xAC = U+036C       0x2E = U+002E            0xCC 0x9E = U+031E       0xCD 0x8D = U+034D
0xCC 0xA5 = U+0325       0xCC 0xAA = U+032A       0xCC 0xBA = U+033A       0xCC 0xA4 = U+0324
0xCC 0xA3 = U+0323       0xCC 0x9C = U+031C       0xCD 0x86 = U+0346       0xCD 0xAB = U+036B
0xCC 0x88 = U+0308       0xCC 0x81 = U+0301       0xCD 0x91 = U+0351       0xCD 0xA6 = U+0366
0xCD 0x82 = U+0342       0xCD 0x91 = U+0351       0xCD 0x91 = U+0351       0x0A = U+000A
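
A listing in this layout can be produced with a few lines of Python. This is only a sketch of one way to do it, not necessarily the tool used for the listing above; the function names and the 25-column cell width are assumptions chosen to match the spacing shown.

```python
def format_code_point(ch: str) -> str:
    """Format one code point as '<UTF-8 bytes> = U+XXXX'."""
    utf8 = " ".join(f"0x{b:02X}" for b in ch.encode("utf-8"))
    return f"{utf8} = U+{ord(ch):04X}"

def dump_code_points(text: str, per_line: int = 4, width: int = 25) -> str:
    """Lay out the code points of text in rows of per_line padded cells."""
    cells = [format_code_point(ch) for ch in text]
    rows = []
    for i in range(0, len(cells), per_line):
        row = cells[i:i + per_line]
        # Pad each cell to a fixed width, then trim the end of the row.
        rows.append("".join(cell.ljust(width) for cell in row).rstrip())
    return "\n".join(rows)

# The first five code points of the Zalgo text.
print(dump_code_points("@\u032E\u0318\u032E\u031C"))
```

Iterating over a Python `str` yields code points, so every combining mark gets its own cell even though it is drawn on top of its base character.
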
Jonathan Leffler