
I am creating a full-text search table in SQLite using FTS3 (or FTS4):

CREATE VIRTUAL TABLE Demo1 USING fts3(content TEXT);

INSERT INTO Demo1 VALUES ('Hồ Thanh Long'), ('Nguyễn Văn A');

When I search with:

SELECT * FROM Demo1 WHERE content MATCH 'Hồ';

the result is:

'Hồ Thanh Long'

But when I search with:

SELECT * FROM Demo1 WHERE content MATCH 'Ho';

there is no result.

How can I make a search for 'Ho' also match 'Hồ'?

Terry

2 Answers

2

You must create the FTS table with a tokenizer that can handle Unicode characters, i.e., the icu or unicode61 tokenizer.

Please note that these tokenizers might not be available on all Android versions, and that the Android API does not expose any functions for adding user-defined tokenizers.
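As a rough illustration of the tokenizer approach, here is a minimal sketch using Python's standard sqlite3 module (on Android the same SQL would go through the platform's SQLite API). The function name search_unicode61 is made up for this example, and the remove_diacritics=2 option is an assumption: as far as I know it needs SQLite 3.27.0 or newer, and the default setting leaves single code points that carry two diacritics, such as 'ồ', unchanged. Because the tokenizer may be missing from a given build, the sketch returns None in that case instead of failing.

```python
import sqlite3


def search_unicode61(rows, term):
    """Index rows with the unicode61 tokenizer and return the rows matching term.

    Returns None when this SQLite build rejects FTS4, unicode61, or the option.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Assumption: remove_diacritics=2 is only understood by newer SQLite builds.
        conn.execute(
            'CREATE VIRTUAL TABLE Demo1 USING '
            'fts4(content, tokenize=unicode61 "remove_diacritics=2")'
        )
    except sqlite3.OperationalError:
        conn.close()
        return None

    conn.executemany(
        "INSERT INTO Demo1 (content) VALUES (?)", [(row,) for row in rows]
    )
    cursor = conn.execute(
        "SELECT content FROM Demo1 WHERE content MATCH ?", (term,)
    )
    found = [record[0] for record in cursor]
    conn.close()
    return found


if __name__ == "__main__":
    print(search_unicode61(["Hồ Thanh Long", "Nguyễn Văn A"], "Ho"))
```

Checking for None first is a cheap way to find out whether a given device or build supports the tokenizer before relying on it.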

CL.
  • I would be interested to hear your response to this question: http://stackoverflow.com/questions/29669342/unicode-support-for-sqlite-full-text-search-in-android – Suragch Apr 16 '15 at 08:35
2

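The default "simple" tokenizer on Android already accepts Unicode characters. The SQLite documentation describes a term as a contiguous sequence of eligible characters,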

where eligible characters are all alphanumeric characters and all characters with Unicode codepoint values greater than or equal to 128.

It just doesn't do anything beyond that: non-ASCII characters are kept exactly as they are. I'm not sure even the Unicode tokenizers would do the mapping you require (i.e., recognize 'Hồ' as both 'Hồ' and 'Ho' when queried).

Indeed, the demo recognized 'Hồ' when you queried it; it just didn't return it when you queried 'Ho' because it didn't recognize them as equivalents. If you are working with a limited set of supported Unicode characters, you could implement your own mapping, and save the "plain ASCII text" in a separate column to search on separately.
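The separate-column idea could be sketched as follows, again in Python with the standard sqlite3 and unicodedata modules. Everything named here (strip_diacritics, Demo2, content_plain, EXTRA_MAP) is hypothetical and chosen only for illustration. The mapping relies on Unicode decomposition rather than a hand-written table, which is a swapped-in technique, with an explicit entry for 'đ'/'Đ' because those letters have no decomposition.

```python
import sqlite3
import unicodedata

# Letters without a Unicode decomposition need an explicit entry
# (assumed list: U+0111 'đ' and U+0110 'Đ'; extend as needed).
EXTRA_MAP = str.maketrans({"\u0111": "d", "\u0110": "D"})


def strip_diacritics(text):
    """Return text with combining marks removed, e.g. 'Hồ' becomes 'Ho'."""
    decomposed = unicodedata.normalize("NFD", text.translate(EXTRA_MAP))
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_rows(conn, rows):
    """Store each row twice: as written, and in plain form for searching."""
    conn.execute("CREATE VIRTUAL TABLE Demo2 USING fts3(content, content_plain)")
    conn.executemany(
        "INSERT INTO Demo2 (content, content_plain) VALUES (?, ?)",
        [(row, strip_diacritics(row)) for row in rows],
    )


def search(conn, term):
    """Match the folded term against the plain column only."""
    cursor = conn.execute(
        "SELECT content FROM Demo2 WHERE content_plain MATCH ?",
        (strip_diacritics(term),),
    )
    return [record[0] for record in cursor]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    index_rows(connection, ["Hồ Thanh Long", "Nguyễn Văn A"])
    print(search(connection, "Ho"))
    print(search(connection, "Hồ"))
```

Folding the query term with the same function means both 'Ho' and 'Hồ' are looked up as 'Ho', while the original spelling is still what gets returned to the user.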

sean262