0

Here is a MySQL query:

SELECT last_name FROM users ORDER BY last_name

We get all that data into PHP and then run:

$prior = NULL;
do {
    $current = array_shift($results);
    assert($current >= $prior);
    $prior = $current;
} while ($current !== NULL);

Currently this assertion fails for certain inputs. Is it possible to add a COLLATE clause to the MySQL query above to absolutely guarantee that PHP assertion?


Stuff I tried:

  • The above code, it doesn't work for certain non-ASCII inputs
  • ORDER BY email COLLATE utf8_bin resulted in COLLATION 'utf8_bin' is not valid for CHARACTER SET 'latin1'
  • Maybe this is a duplicate of What is the best collation to use for MySQL with PHP? but that seemed more subjective, I am seeking a specific answer to a specific problem

Others notes:

William Entriken
  • 37,208
  • 23
  • 149
  • 195
  • where are you setting `$next` ? – cmorrissey Jul 06 '16 at 15:55
  • you could do `ORDER BY email COLLATE 'latin1_bin'` or you could say set your collation of the column to `utf8`, but you can't collate by a different sub set in your query. – cmorrissey Jul 06 '16 at 16:09
  • 1
    Is there any reason you're using a `latin1` column in 2016? That seems overly restrictive. – tadman Jul 06 '16 at 16:41
  • @cmorrissey, thank you code corrected – William Entriken Jul 06 '16 at 19:02
  • @tadman Thank you, I reported upstream https://github.com/sequelpro/sequelpro/issues/2530 – William Entriken Jul 06 '16 at 19:07
  • You can [set the default encoding and collation at the database level](https://dev.mysql.com/doc/refman/5.7/en/charset-applications.html) if you want. This might not be the fault of Sequel Pro here if your defaults are `latin1`. As people really like to use things like emoji lately, you may want to go one further and use `utf8mb4` as a default. – tadman Jul 06 '16 at 19:14
  • What `CHARACTER SET` are you using? It sounds like latin1. Perhaps you wanted utf8 instead? (It's hard to address your question without knowing these details.) – Rick James Jul 07 '16 at 01:47
  • @RickJames it is latin1 character set – William Entriken Jul 09 '16 at 22:41

1 Answers1

1
echo ('a' == 'A') ? 'a==A ' : 'a <> A ';
echo ('a' == 'á') ? 'a==á ' : 'a <> á ';

Both came back <>, so I deduce that PHP acts like latin1_bin (or the _bin of whatever charset you have).

Rick James
  • 135,179
  • 13
  • 127
  • 222