Original question:
Table structure:
CREATE TABLE `texts` ( `letter` VARCHAR(1) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL, `text` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL, INDEX (`letter` ASC), INDEX (`text` ASC) ) ENGINE InnoDB CHARACTER SET utf8 COLLATE utf8_general_ci;
Sample data:
INSERT INTO `texts` (`letter`, `text`) VALUES ('a', 'Apple'), ('ā', 'Ābols'), ('b', 'Bull'), ('c', 'Cell'), ('č', 'Čakste');
The query which I'm executing:
SELECT DISTINCT `letter` FROM `texts`;
Expected results:
`letter` a ā b c č
Actual results:
`letter` a b c
I've tried many utf8 collations (utf8_[bin|general_ci|unicode_ci], utf8mb4_[bin|general_ci|unicode_ci] etc), none of them work. How to fix this?
Edit for clarification: what I want is not just to get all the letters out, but also get them in the order I specified in the expected results.
utf8_bin
gets all the letters, but they are ordered in the wrong way - extended latin characters follow only after all the basic latin characters (example: a, b, c, ā, č). Also, the actual table I'm using has many texts per letter, so grouping is a must.
Edit #2: here's the full table data from the live site - http://pastebin.com/cH2DUzf3 Executing that SQL and running the following query after that:
SELECT DISTINCT BINARY `letter` FROM `texts` ORDER BY `letter` ASC
yields almost perfect results, with one exception: the letter 'ū' is before 'u', which is weird to say the least, because all other extended latin letters show up after their basic latin versions. How do I solve this one last problem?