3

Original question:

Table structure:

CREATE TABLE `texts` (
  `letter` VARCHAR(1) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `text` VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  INDEX (`letter` ASC),
  INDEX (`text` ASC)
)
ENGINE InnoDB
CHARACTER SET utf8
COLLATE utf8_general_ci;

Sample data:

INSERT INTO `texts`
(`letter`, `text`)
VALUES
('a', 'Apple'),
('ā', 'Ābols'),
('b', 'Bull'),
('c', 'Cell'),
('č', 'Čakste');

The query which I'm executing:

SELECT DISTINCT `letter` FROM `texts`;

Expected results:

`letter`
a
ā
b
c
č

Actual results:

`letter`
a
b
c

I've tried many utf8 collations (utf8_[bin|general_ci|unicode_ci], utf8mb4_[bin|general_ci|unicode_ci] etc), none of them work. How to fix this?

Edit for clarification: what I want is not just to get all the letters out, but also get them in the order I specified in the expected results. utf8_bin gets all the letters, but they are ordered in the wrong way - extended latin characters follow only after all the basic latin characters (example: a, b, c, ā, č). Also, the actual table I'm using has many texts per letter, so grouping is a must.

Edit #2: here's the full table data from the live site - http://pastebin.com/cH2DUzf3 Executing that SQL and running the following query after that:

SELECT DISTINCT BINARY `letter` FROM `texts` ORDER BY `letter` ASC

yields almost perfect results, with one exception: the letter 'ū' is before 'u', which is weird to say the least, because all other extended latin letters show up after their basic latin versions. How do I solve this one last problem?

jurchiks
  • 1,354
  • 4
  • 25
  • 55

1 Answers1

3

Check Manual for BINARY type

SELECT DISTINCT BINARY `letter` FROM `texts` 

Check SQL Fiddle

Sadikhasan
  • 18,365
  • 21
  • 80
  • 122
  • That works great, thanks! phpMyAdmin displays the extended latin letters like 'c481' instead of 'ā', but when exporting the resultset, it shows up fine. – jurchiks Jun 28 '14 at 11:22
  • Ah, one thing though. I have the INDEX on the `letter` column and by default it orders the letters correctly, but if I add `ORDER BY letter ASC` to the query, the ordering becomes wrong, the extended latin characters go after basic latin again. Why is that? Here's an example of what I mean: http://pastebin.com/57an2pCX – jurchiks Jun 28 '14 at 11:32
  • Check this link http://sqlfiddle.com/#!2/4f5eb5/1 there is no any problem in Order clause – Sadikhasan Jun 28 '14 at 11:47
  • Yeah there is, that fiddle outputs 'ā' before 'a' and 'ū' before 'u'. – jurchiks Jun 28 '14 at 13:09