
I need functions to convert between a character (e.g. 'α') and its full Unicode name (e.g. "GREEK SMALL LETTER ALPHA") in both directions.

The solution I came up with is to perform a lookup in the official Unicode Standard data file available online: http://www.unicode.org/Public/6.2.0/ucd/UnicodeData.txt (or rather in a cached local copy of it, possibly converted to a suitable collection beforehand to improve lookup performance).

Is there a simpler way to do these conversions? I would prefer a solution in C#, but solutions in other languages that can be adapted to C# / .NET are also welcome. Thanks!
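
For illustration, the lookup described above can be sketched as two dictionaries built from the semicolon-separated lines of UnicodeData.txt (field 0 is the hexadecimal code point, field 1 is the name). This is only a rough sketch in Python for brevity, and it should translate directly to a pair of `Dictionary` instances in C#. The names `SAMPLE` and `build_tables` are made up for this example, the sample holds just three lines instead of the cached file, and entries whose name field starts with `<` (such as `<control>` or range markers) are simply skipped, which is an assumption about what the lookup should cover.

```python
# Three lines in the UnicodeData.txt format, standing in for the cached local copy.
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
03B1;GREEK SMALL LETTER ALPHA;Ll;0;L;;;;;N;;;0391;;0391
0000;<control>;Cc;0;BN;;;;;N;NULL;;;;
"""


def build_tables(lines):
    """Return (name_by_char, char_by_name) built from UnicodeData.txt lines."""
    name_by_char = {}
    char_by_name = {}
    for line in lines:
        fields = line.strip().split(";")
        if len(fields) < 2:
            continue  # blank or malformed line
        code, name = fields[0], fields[1]
        if name.startswith("<"):
            continue  # assumption: skip <control> and <..., First>/<..., Last> markers
        ch = chr(int(code, 16))
        name_by_char[ch] = name
        char_by_name[name] = ch
    return name_by_char, char_by_name


name_by_char, char_by_name = build_tables(SAMPLE.splitlines())
alpha_name = name_by_char["α"]                        # character -> name
alpha_char = char_by_name["GREEK SMALL LETTER ALPHA"]  # name -> character
```

Keeping both dictionaries costs roughly twice the memory of a single one, but each direction then becomes a single hash lookup.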

Oksana Gimmel
  • The solution you've got sounds perfectly fine to me, to be honest. The file format looks reasonably simple, and I don't think there's anything else in the framework. – Jon Skeet Jun 25 '13 at 19:06
  • That link you point to is only like the first 1 million and should easily fit in a Dictionary. The character should be unique so use it as a key in a Dictionary. As for the description - if it is unique could include a second reverse dictionary for speed but that will double the memory. – paparazzo Jun 25 '13 at 19:36
  • @Blam "only like the first 1 million" (more precisely 1114109) is **all of them**. – R. Martinho Fernandes Jun 27 '13 at 09:40

1 Answer


If you do not want to keep the Unicode name table in memory, prepare a text file in which the offset for a character is its Unicode value multiplied by the maximum Unicode name length, so that the offset points directly to that character's name. For a maximum length of 4 bytes it won't be more than a few megabytes. If you want a more compact implementation, put a table of file offsets to the Unicode names at the start of the file, indexed by Unicode value, and store the names themselves compactly after it. You have to prepare such a file yourself, but that is not difficult.
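
The fixed-width variant described above could look roughly like the following sketch. It is written in Python for brevity and uses an in-memory `io.BytesIO` stream in place of a real file. `RECORD_LEN`, `write_table`, and `read_name` are hypothetical names, and the record size of 96 bytes is an assumed upper bound on the name length, not a value taken from the standard.

```python
import io

RECORD_LEN = 96  # assumed maximum name length in bytes; adjust to the real data


def write_table(stream, names):
    """Write each name at offset code_point * RECORD_LEN, padded with spaces."""
    for code_point, name in names.items():
        data = name.encode("ascii")
        if len(data) > RECORD_LEN:
            raise ValueError("name longer than RECORD_LEN: " + name)
        stream.seek(code_point * RECORD_LEN)
        stream.write(data.ljust(RECORD_LEN, b" "))


def read_name(stream, code_point):
    """Seek straight to the record for code_point; return None if it is empty."""
    stream.seek(code_point * RECORD_LEN)
    record = stream.read(RECORD_LEN)
    name = record.rstrip(b" \x00").decode("ascii")
    return name or None


table = io.BytesIO()  # stands in for the prepared file on disk
write_table(table, {0x41: "LATIN CAPITAL LETTER A",
                    0x3B1: "GREEK SMALL LETTER ALPHA"})
alpha_name = read_name(table, 0x3B1)
```

Only one seek and one fixed-size read are needed per lookup, at the cost of a file that is mostly padding. This covers the character-to-name direction only.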

Asaf Sh.
  • Requirement is both directions. – paparazzo Jun 25 '13 at 19:22
  • Yep, you are right, we can make another file with indices equal to the hash values of the Unicode names :-). Though the solution referenced in the comment to the question uses a ready-made dictionary library, which is of course better than building a bicycle from scratch. But I always enjoy the art of data structure design. – Asaf Sh. Jun 25 '13 at 19:27