I am trying to display Devanagari characters on a screen, but the development environment I'm programming in has no Unicode support. To draw characters, I use binary matrices that determine which of the screen's pixels are colored, and I sorted these matrices in Unicode order. For languages that use the Latin alphabet I had no issues: I only needed to draw the characters one after the other to represent a string. For Devanagari characters it's different.
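To make my current approach concrete, here is a minimal sketch in Python (not my actual environment). The `GLYPHS` table, the 3x5 bitmap size, and the `render` function are made up for illustration; my real matrices are larger.

```python
# Hypothetical 3x5 bitmaps, one per code point ("1" = colored pixel).
GLYPHS = {
    ord("H"): ["101", "101", "111", "101", "101"],
    ord("I"): ["111", "010", "010", "010", "111"],
}

GLYPH_HEIGHT = 5

def render(text):
    """Place each character's bitmap side by side, with one blank column after each."""
    rows = [""] * GLYPH_HEIGHT
    for ch in text:
        bitmap = GLYPHS[ord(ch)]  # one lookup per character, in string order
        for y in range(GLYPH_HEIGHT):
            rows[y] += bitmap[y] + "0"
    return rows

if __name__ == "__main__":
    for row in render("HI"):
        print(row.replace("1", "#").replace("0", " "))
```

This one-character-one-matrix lookup is exactly what stops working for Devanagari.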
In the Devanagari script, some characters, when placed next to others, can completely change the appearance of the word, both in the order and in the shape of the characters. The result is treated as a single character, but when read as Unicode it is actually two or more distinct code points.
This merging sometimes occurs in a simple way:
क + ् = क्
ग + ् = ग्
फ + ि = फि
But other times you get completely different characters:
क + ् + क = क्क
ग + ् + घ = ग्घ
क + ् + ष = क्ष
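To show what I mean at the code point level, here is a rough Python sketch that lists the code points of a string and groups them into units that would have to be drawn together. The function names and the grouping rule (a consonant, then any number of virama + consonant pairs, then dependent signs) are my own simplification based on the examples above, not a complete description of the script.

```python
VIRAMA = "\u094D"  # DEVANAGARI SIGN VIRAMA, the "्" in the examples

def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]

def is_consonant(ch):
    # Main consonant range KA..HA; the additional consonants at
    # U+0958..U+095F are ignored in this sketch.
    return "\u0915" <= ch <= "\u0939"

def is_dependent_sign(ch):
    # Vowel signs U+093E..U+094C (e.g. "ि" is U+093F), the nukta U+093C,
    # and the signs U+0900..U+0903 (candrabindu, anusvara, visarga).
    return ("\u093E" <= ch <= "\u094C" or ch == "\u093C"
            or "\u0900" <= ch <= "\u0903")

def split_clusters(text):
    """Group code points that visually belong together (simplified rule)."""
    clusters = []
    for ch in text:
        prev = clusters[-1] if clusters else ""
        joins_previous = prev and (
            ch == VIRAMA
            or is_dependent_sign(ch)
            or (prev.endswith(VIRAMA) and is_consonant(ch))
        )
        if joins_previous:
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

if __name__ == "__main__":
    ksha = "\u0915\u094D\u0937"        # क + ् + ष
    print(code_points(ksha))           # ['U+0915', 'U+094D', 'U+0937']
    print(split_clusters(ksha + "\u092B\u093F"))  # two clusters: क्ष and फि
```

With something like this, each cluster (rather than each code point) could be the key into my matrix table, but I would still need to know which clusters exist and what each one looks like, which is the part I don't understand.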
I found several papers describing the complex grammatical rules that determine how these characters merge (https://www.unicode.org/versions/Unicode8.0.0/UnicodeStandard-8.0.pdf), but the more I look into it, the more I realize that I would need to learn Hindi to understand those rules before I could create an algorithm.
I would like to understand the principles behind these character combinations without necessarily having to learn the Hindi language. Has anyone already solved this problem, or found an alternative solution they would be willing to share?