In PostgreSQL 12+ it's possible to use nondeterministic ICU collations in order to implement accent insensitive search (see this StackOverflow answer).
I want to implement accent insensitive search, while still using accent sensitive sorting.
Consider the following example:
I have a table named country_region with a single column: "symbol". It stores the following values:
- lubuskie
- łódzkie
- małopolskie
The values have been listed in the expected order, as the Polish alphabet order is L -> Ł -> M.
The below SQL query should return the "łódzkie" field, ignoring the diacritics:
SELECT * FROM country_region WHERE symbol = 'lodzkie'
The below SQL query should return the values ordered per the alphabet (lubuskie -> łódzkie -> małopolskie):
SELECT * FROM country_region ORDER BY symbol
What I have tried
I have tried using 2 different ICU locales.
In the below collation Unicode locale search works as expected, but rows are sorted in the wrong order:
CREATE COLLATION ignore_accent (provider = icu, locale = 'und-u-ks-level1-kc-true', deterministic = FALSE);
CREATE INDEX test_index ON country_region(symbol COLLATE ignore_accent);
In the below collation Polish locale sorting works as expected, but diacritics aren't ignored in search:
CREATE COLLATION ignore_accent (provider = icu, locale = 'pl-u-ks-level1-kc-true', deterministic = FALSE);
CREATE INDEX test_index ON country_region(symbol COLLATE ignore_accent);
Is it possible to solve this with PostgreSQL? Thanks in advance!