0

I have a client who is asking me to enable Pinyin searching on an autocomplete input on web app.

The filter hits an Azure SQL database and uses the following where statement.

WHERE countryName LIKE N'%美国%'

This works when the user enters the correct chinese characters. In the above example the input shows the correct country. However in some cases Chinese users want to query using pinyin.

I.e.

WHERE countryName LIKE N'%meiguo%'

**MeiGuo is the pinyin representation for 美国 which stands for United States

I have seen some examples of SQL Server functions which would convert the Chinese characters to pinyin so I could convert both the input and the entire column to western script and do the search but it (A) seems insanely computationally expensive to do this on every record when searching (B) I can imagine some quirks where a search for specific Chinese character will return many records which don't have that character but have the same western characters.

Any ideas on how to do this in a good way?

eshirvana
  • 23,227
  • 3
  • 22
  • 38
  • The operation is already pretty expensive with that leading wild card; a full table scan is going to be required. If, however, you want to be able to search for the pinyin value to, it seems you'd be better off with a computed calculated column that's `PERSISTED` and included in relevant indexes. – Thom A Mar 13 '21 at 16:29
  • Don't do that kind of conversion in T-SQL, it's horrible. Do it in C# on the search input, then search for the exact Chinese that you want. – Charlieface Mar 13 '21 at 21:57
  • I agree with @Charlieface, do the translation in your backend code and search for several possible Chinses words translated from pinyin. For example, 'meiguo' is more likely as '美国'(United States), can also translated to '没过'(haven't pass through), you may got several translation in your code and then search for them in your database. – Tiny Wang Mar 14 '21 at 13:46

0 Answers0