mysql select distinct letters, including extended latin characters

Question

Original question:

Table structure:
CREATE TABLE `texts` (
  `letter` VARCHAR(1) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `text` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  INDEX (`letter` ASC),
  INDEX (`text` ASC)
)
ENGINE InnoDB
CHARACTER SET utf8
COLLATE utf8_general_ci;
Sample data:
INSERT INTO `texts`
(`letter`, `text`)
VALUES
('a', 'Apple'),
('ā', 'Ābols'),
('b', 'Bull'),
('c', 'Cell'),
('č', 'Čakste');
The query which I'm executing:
SELECT DISTINCT `letter` FROM `texts`;
Expected results:
`letter`
a
ā
b
c
č
Actual results:
`letter`
a
b
c
I've tried many utf8 collations (utf8_[bin|general_ci|unicode_ci], utf8mb4_[bin|general_ci|unicode_ci] etc), none of them work. How to fix this?

Edit for clarification: what I want is not just to get all the letters out, but also get them in the order I specified in the expected results. utf8_bin gets all the letters, but they are ordered in the wrong way - extended latin characters follow only after all the basic latin characters (example: a, b, c, ā, č). Also, the actual table I'm using has many texts per letter, so grouping is a must.

Edit #2: here's the full table data from the live site - http://pastebin.com/cH2DUzf3 Executing that SQL and running the following query after that:

SELECT DISTINCT BINARY `letter` FROM `texts` ORDER BY `letter` ASC

yields almost perfect results, with one exception: the letter 'ū' is before 'u', which is weird to say the least, because all other extended latin letters show up after their basic latin versions. How do I solve this one last problem?

Curious to know what is the result of : SELECT `letter` FROM `texts`; — ak0053792, Jun 28 '14 at 11:16
@ak0053792 - that would work for me if the all the texts were starting with a unique letter, but in the actual table, each letter has up to a hundred different texts, so I need to group them. — jurchiks, Jun 28 '14 at 11:17

score 3 · Accepted Answer · answered Jun 28 '14 at 11:16

3

Check Manual for BINARY type

SELECT DISTINCT BINARY `letter` FROM `texts`

Check SQL Fiddle

answered Jun 28 '14 at 11:16

Sadikhasan

18,365
21
80
122

That works great, thanks! phpMyAdmin displays the extended latin letters like 'c481' instead of 'ā', but when exporting the resultset, it shows up fine. – jurchiks Jun 28 '14 at 11:22
Ah, one thing though. I have the INDEX on the `letter` column and by default it orders the letters correctly, but if I add `ORDER BY letter ASC` to the query, the ordering becomes wrong, the extended latin characters go after basic latin again. Why is that? Here's an example of what I mean: http://pastebin.com/57an2pCX – jurchiks Jun 28 '14 at 11:32
Check this link http://sqlfiddle.com/#!2/4f5eb5/1 there is no any problem in Order clause – Sadikhasan Jun 28 '14 at 11:47
Yeah there is, that fiddle outputs 'ā' before 'a' and 'ū' before 'u'. – jurchiks Jun 28 '14 at 13:09

mysql select distinct letters, including extended latin characters

1 Answers1

Linked