I have made a dictionary with about 100k words of Punjabi Language in Unicode. There is a letter ਸ਼
, whose code in unicode is ਸ਼
and there are many such letters like ਖ਼
ਜ਼
ਗ਼
ਫ਼
. In this language, however, the dot you see under the letters can also be typed separately, even though Unicode has combined letters as well. In the DB, the words are stored in the word
table and the md5 of the word in word_hash
. When I try to search the database from PHP with the statement SELECT * FROM db WHERE word_hash = md5('word');
, no records are found for words containing these dotted letters. When I investigated, I found that the MD5 stored in the DB differs from the MD5 generated by the search query. Why is that? I entered all the words through a textbox, and the stored MD5 was generated with the same MySQL syntax.
For example, the MD5 of the word ਸ਼ਰਬਤ
is 45f756f02a28b5ec48ddf369db6ad7e6
when echoed by the MySQL query, but the value stored in the DB is d6da1a44526c5ab1259dcc05404b1e8c
Two alternates for ਸ਼
are ਸ਼
and ਸ਼
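To illustrate what I think is happening, here is a minimal sketch in Python (not my actual PHP code). It assumes the two alternates are the single code point U+0A36 and the sequence U+0A38 followed by the nukta U+0A3C, and that the database hashes the UTF-8 bytes of the word. The helper names md5_hex and normalized_md5 are made up for this example.

```python
import hashlib
import unicodedata

# Assumed code points for the two spellings of the same visible letter:
precomposed = "\u0A36"        # GURMUKHI LETTER SHA (one code point)
decomposed = "\u0A38\u0A3C"   # GURMUKHI LETTER SA + GURMUKHI SIGN NUKTA


def md5_hex(text: str) -> str:
    """MD5 of the UTF-8 bytes of text (assumes the DB also hashes UTF-8)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def normalized_md5(text: str, form: str = "NFC") -> str:
    """Normalize to one Unicode form first, so both spellings hash alike."""
    return md5_hex(unicodedata.normalize(form, text))


# The strings look identical on screen but are different code point sequences.
print(precomposed == decomposed)                                  # False
print(md5_hex(precomposed) == md5_hex(decomposed))                # False
print(normalized_md5(precomposed) == normalized_md5(decomposed))  # True
```

If this is the cause, normalizing every word to the same form both when inserting and when searching should make the hashes match. In PHP the intl extension's Normalizer::normalize looks like the equivalent call, though I have not tried it against my data.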