These don't seem to index, even when I explicitly add them to my charset_table:
charset_table=... U+20AC->U+20AC, U+00A3->U+00A3
I even tried mapping them to the dollar sign:
U+0024->U+0024, U+20AC->U+0024, U+00A3->U+0024
Yet in each case they are unrecognized. In other words, MATCH('£1000') will not find 'cost is £1000', and if I map them to $ as in the second example, MATCH('$1000') will not find it either.
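To make the expectation concrete, here is a rough Python model of what I assume the second mapping should do. It is only a sketch: `parse_charset_table` and `fold` are hypothetical helpers written for this post, not part of Sphinx, and they cover just the plain `a..z`, `A..Z->a..z` and `U+XXXX->U+YYYY` forms, treating every unmapped character as a separator.

```python
def parse_code(token):
    """Turn 'U+20AC' or a single literal character into a code point."""
    token = token.strip()
    if token.upper().startswith("U+"):
        return int(token[2:], 16)
    if len(token) == 1:
        return ord(token)
    raise ValueError(f"unsupported token: {token!r}")

def parse_range(text):
    """Parse 'a..z' or a single code into an inclusive (low, high) pair."""
    if ".." in text:
        low, high = text.split("..")
        return parse_code(low), parse_code(high)
    code = parse_code(text)
    return code, code

def parse_charset_table(table):
    """Build {source code point: folded code point} from a table string."""
    mapping = {}
    for entry in table.split(","):
        entry = entry.strip()
        if not entry:
            continue
        src, _, dst = entry.partition("->")
        s_low, s_high = parse_range(src)
        d_low, d_high = parse_range(dst) if dst else (s_low, s_high)
        if s_high - s_low != d_high - d_low:
            raise ValueError(f"range size mismatch in {entry!r}")
        for offset in range(s_high - s_low + 1):
            mapping[s_low + offset] = d_low + offset
    return mapping

def fold(text, mapping):
    """Fold mapped characters; anything unmapped becomes a separator."""
    folded = "".join(
        chr(mapping[ord(ch)]) if ord(ch) in mapping else " " for ch in text
    )
    return " ".join(folded.split())

# Shortened version of the table from the second example above.
table = "0..9, A..Z->a..z, _, a..z, U+0024->U+0024, U+20AC->U+0024, U+00A3->U+0024"
mapping = parse_charset_table(table)
print(fold("cost is £1000", mapping))  # cost is $1000
```

Under this model both the document text and the query fold to `$1000`, which is why I expected MATCH('$1000') to hit.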
If I do a MySQL search, however, where field like '%£%', I do get records, which leads me to believe MySQL is encoding UTF-8 correctly. That is, the pound sign and euro characters are stored correctly in MySQL, but the Sphinx index does not recognize them, even after I explicitly add their Unicode code points to my charset_table.
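As a sanity check on the code points themselves, a few lines of Python (not part of my setup, just a quick verification) show which `U+XXXX` values and UTF-8 byte sequences the three symbols correspond to. The helper name `describe` is made up for this check.

```python
# Confirm the code points used in charset_table and their UTF-8 bytes.
symbols = {"dollar": "$", "pound": "£", "euro": "€"}

def describe(ch):
    """Return (U+XXXX notation, hex of the UTF-8 bytes) for one character."""
    return f"U+{ord(ch):04X}", ch.encode("utf-8").hex()

for name, ch in symbols.items():
    code_point, utf8_hex = describe(ch)
    print(f"{name}: {code_point} utf8={utf8_hex}")
# dollar: U+0024 utf8=24
# pound: U+00A3 utf8=c2a3
# euro: U+20AC utf8=e282ac
```

So U+00A3 and U+20AC are the right values for the pound and euro signs, and both are multi-byte in UTF-8.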
Relevant portion of config:
min_stemming_len = 1
stopword_step = 0
html_strip = 1
min_word_len = 1
min_infix_len = 0
index_zones = title,description
charset_type = utf8mb4_unicode_ci
charset_table = 0..9, A..Z->a..z, _, a..z, U+0026->U+0026, U+0027->U+0027, U+002E->U+002E, U+002D->U+002D, U+2014->U+002D#, U+2019->U+0027, U+0024->U+0024, U+20AC->U+0024, U+00A3->U+0024
Confirmed that the table/column is using utf8mb4_unicode_ci
Confirmed I can do a MySQL search on the euro sign: WHERE Title LIKE '%€%'
Confirmed I cannot find the same record with SphinxQL: WHERE MATCH('€')