
After reading the user's contacts from the device, I want to display them in a list grouped into sections. I do this by extracting the first human-readable letter from each name, removing diacritical marks, and creating a section for each distinct letter. This works well until users put emoji into their contact names, for instance:

Those three items appear on my laptop's contact list grouped in the # section. However, my algorithm creates two separate sections for them, which is not desirable. I don't want to put everything non-ASCII into the # group, because users of non-Latin scripts (Japanese, Russian, Korean, etc.) won't like that, but I don't know these languages, so I don't know what should be done for them.
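For reference, the grouping described above can be sketched roughly as follows. This is a minimal illustration, not my actual code: the class and method names (`NaiveSections`, `sectionKey`, `group`) are made up for the example, and it assumes `java.text.Normalizer` for stripping diacritics instead of the `StringSimplifier` class I really use.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NaiveSections {
    // Hypothetical helper: section key = first code point of the name,
    // with combining marks stripped, converted to upper case.
    static String sectionKey(String name) {
        // NFD splits "Á" into "A" plus a combining accent; \p{M} then removes the accent.
        String decomposed = Normalizer.normalize(name.trim(), Normalizer.Form.NFD);
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        if (stripped.isEmpty()) {
            return "#";
        }
        // Use code points, not chars, so an emoji is not cut in half.
        int first = stripped.codePointAt(0);
        return new String(Character.toChars(Character.toUpperCase(first)));
    }

    // Groups names by section key; TreeMap keeps the sections sorted.
    static Map<String, List<String>> group(List<String> names) {
        Map<String, List<String>> sections = new TreeMap<>();
        for (String name : names) {
            sections.computeIfAbsent(sectionKey(name), k -> new ArrayList<>()).add(name);
        }
        return sections;
    }
}
```

With this approach every distinct leading emoji becomes its own section key, which is exactly the problem: nothing here ever decides that a character belongs in the # section unless the name is empty.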

Is there a table I can use to tell whether a character should go into this numeric # section or should get its own human-readable section letter? And does this work universally, or is it locale-dependent, with certain countries grouping letters differently for cultural reasons?
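The closest thing to such a table that I am aware of in plain Java is the Unicode general category exposed through `Character.isLetter(int)`. The sketch below is only a possible starting point under that assumption: `SectionClassifier` and `sectionFor` are hypothetical names, the input is assumed to have had its diacritics removed already, and I have not confirmed that this matches what the OS contact list does or that it handles locale-specific grouping.

```java
public class SectionClassifier {
    static final String OTHER = "#";

    // Hypothetical helper: returns the section for a name whose diacritics
    // were already removed. Anything that does not start with a Unicode
    // letter (digits, punctuation, emoji, other symbols) goes to "#".
    static String sectionFor(String simplifiedName) {
        String trimmed = simplifiedName.trim();
        if (trimmed.isEmpty()) {
            return OTHER;
        }
        // codePointAt handles characters outside the BMP, such as most emoji.
        int first = trimmed.codePointAt(0);
        if (!Character.isLetter(first)) {
            return OTHER;
        }
        // Scripts without case (e.g. CJK ideographs) are returned unchanged.
        return new String(Character.toChars(Character.toUpperCase(first)));
    }
}
```

Under this rule a name starting with a Cyrillic or CJK character keeps its own section, while names starting with digits or emoji share the # section. It says nothing about how different locales order or merge letters.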

Grzegorz Adam Hankiewicz
  • Good question. You will probably get better help if you add some code to show what you have tried so far. – Adriaan Koster Jan 13 '16 at 10:46
  • I'm using the `StringSimplifier` filtering class from the linked question http://stackoverflow.com/questions/1453171/remove-diacritical-marks-%C5%84-%C7%B9-%C5%88-%C3%B1-%E1%B9%85-%C5%86-%E1%B9%87-%E1%B9%8B-%E1%B9%89-%CC%88-%C9%B2-%C6%9E-%E1%B6%87-%C9%B3-%C8%B5-from-unicode-chars/1453284#1453284 which works nicely for Latin accents. – Grzegorz Adam Hankiewicz Jan 13 '16 at 10:50
  • Just a thought: why don't you detect whether a character is numerical to determine if the contact should go in the `#` section? And why do these emoji get interpreted as numerical in the first place? In other words: show us your code! – Adriaan Koster Jan 13 '16 at 12:03
  • You misunderstand: the emoji are not being interpreted as numbers, *I* want to treat them like numbers and put them in the same section, as the OS does. I want to replicate the OS behaviour, but I'm not finding code to do that, and I see no obvious API. – Grzegorz Adam Hankiewicz Jan 13 '16 at 14:44
  • You will have to determine the Unicode ranges these emoji are in and handle them as special cases. – Adriaan Koster Jan 13 '16 at 15:06
  • And that's what I wrote at the end: "Is there a table I can use to know if the character should go into this numeric # section". Thanks for going full circle. – Grzegorz Adam Hankiewicz Jan 13 '16 at 15:50
  • Well, I think you will need to determine yourself which characters you wish to consider as emoji and which not. That really depends on your implementation. If you find a solution, please answer your own question here for others to benefit. – Adriaan Koster Jan 13 '16 at 16:17

0 Answers