0

I have asked another question for my problem with collation setting, which seems to be ignored/overriden by some unwanted defaults. To workaround this, I want to use COLLATE in the queries returning "illegal mix of collations" error. However, when I try to, I get "COLLATION is not valid" error instead.

The query in question (I didn't face this problem with other queries yet), this time with COLLATE on the parameter (? translates to a string of comma-separated numbers):

SELECT k.url 
FROM kml k 
WHERE (
    SELECT GROUP_CONCAT(
        kat.kategorie ORDER BY kat.kategorie 
    )
    FROM kml_kategorie kat
    WHERE kat.kml = k.id
) = ? COLLATE utf8_czech_ci
LIMIT 1;

When running it, I get following error:

COLLATION 'utf8_czech_ci' is not valid for CHARACTER SET 'binary'

When I run show variables like "%character_set_%" in Adminer, I get following results:

---------------------------------------  
| Variable_name            | Value    |  
---------------------------------------  
| character_set_client     | utf8     |  
| character_set_connection | utf8     |  
| character_set_database   | utf8     |  
| character_set_filesystem | binary   |  
| character_set_results    | utf8     |  
| character_set_server     | utf8     |  
| character_set_system     | utf8     |  
---------------------------------------

When I invoke the command from php/mysqli, the only difference is that even character_set_filesystem has value of utf8.

Well, perhaps in this case the database settings are used no matter what is shown to the mysqli. But then I tried to solve any coercibility issues by adding COLLATE to the aggregate function, GROUP_CONCAT:

SELECT k.url 
FROM kml k 
WHERE (
    SELECT GROUP_CONCAT(
        kat.kategorie ORDER BY kat.kategorie COLLATE utf8_czech_ci 
    )
    FROM kml_kategorie kat
    WHERE kat.kml = k.id
) = ?
LIMIT 1;

Now the error is following:

COLLATION 'utf8_czech_ci' is not valid for CHARACTER SET 'latin1'

You can see that there is no latin1 among the charset variable values. It used to be the value of character_set_server before I modified my.cnf file. Its relevant part now looks like this:

## UTF 8 Settings
#init-connect=\'SET NAMES utf8\'
collation_server=utf8_czech_ci
character_set_server=utf8
character-set-filesystem=utf8
#skip-character-set-client-handshake
#character_sets-dir="C:/xampp/mysql/share/charsets"

Where do the charsets shown in the error messages (and causing the problem) come from? I'm bit confused by the differences between querying the variables from mysqli/php and adminer, and totally confused by some other variables making this even bigger mess. How to clean it and change the variables so that at least this this COLLATE workaround worked?

This is basically a subquestion to my linked question, but different enough to merit splitting; the two questions combined would be as messy as my database variables. However, it's quite probable that the solution to one of them will make the other solved as well; after answering, you might consider answering the other question by a summary of your first answer + a note how it applies to the problem specified in the question.

EDIT: I found a workaround for this one particular case. However, I still didn't find why the queries made through Adminer and Mysqli returned different results and especially why the error message points to charsets not included in any variable?

Community
  • 1
  • 1
Pavel V.
  • 2,653
  • 10
  • 43
  • 74

2 Answers2

1

What is the datatype of kat.kategorie ?

GROUP_CONCAT( kat.kategorie
              ORDER BY CONVERT(kat.kategorie USING utf8) COLLATE utf8_czech_ci 
            )

You may also need a call to CAST(... AS CHAR).

In the future, do not mix CHARACTER SETs.

GROUP_CONCAT, and a couple of other functions that involve combining things, have a problem in deducing what the result type should be. In some cases, it punts and calls the result BINARY, which is even worse than what you encountered.

I believe that there is a latin1 somewhere that led to your particular problem. Dig into Adminer. Let's see SHOW CREATE TABLE kat.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • "In the future, do not mix CHARACTER SETs" - my point is how to correct mixed CHARACTER SETs when they don't correspond my variables; see the last edit of my question. Your answer is just another workaround, not as good as mine for this particular situation, but with more general usage. It's a fish, but now I need to learn fishing (making charsets the same across whole system) more. – Pavel V. Mar 07 '15 at 08:50
  • `GROUP_CONCAT`, and a couple of other functions that involve combining things, have a problem in deducing what the result is. In some cases, it punts and calls the result `BINARY`, which leads is even worse than what you encountered. I believe that there is a `latin1` somewhere that led to your particular problem. Dig into Adminer. Let's see `SHOW CREATE TABLE kat`. – Rick James Mar 07 '15 at 16:36
  • your comment explains a lot. Add it to your question and I'll upvote and probably accept it on Monday (I don't have time to consider it thoroughly earlier). – Pavel V. Mar 07 '15 at 16:40
  • After another look at the problem I found its core: kat.kategorie is integer. Mysqli can convert it to char/varchar, but there's no character set and collation stored for the integer column. Playing with defaults to set the correct charset and collation led to a dead end, so I consider your answer a proper solution now. Accepted. – Pavel V. Mar 09 '15 at 07:57
  • Based on another thread, it may be that the character set of a number may come from `@@character_set_filesystem` (or _server). The solution there was "don't collate numbers". – Rick James Mar 09 '15 at 13:14
  • Adminer is getting in the way? That's what is irritating about 3rd party software. – Rick James Mar 09 '15 at 13:15
0

I made another workaround: I created a view in my database to avoid calling GROUP_CONCAT in the mysqli parametrized query. The error no longer appears, but I still don't consider it fully resolved, because there are some strange things I don't understand (variables not correctly set etc.) and I don't want them to make any more problems.

The view:

CREATE 
  ALGORITHM=UNDEFINED DEFINER=`root`@`localhost` 
  SQL 
  SECURITY DEFINER 
VIEW `kml_kat_vw` AS SELECT 
  `kml_kategorie`.`kml` AS `_kml`,
  group_concat(
    `kml_kategorie`.`kategorie`
    ORDER BY `kml_kategorie`.`kategorie` ASC separator ','
  ) AS `_kategorie` 
FROM `kml_kategorie`
GROUP BY `kml_kategorie`.`kml`

The new query:

SELECT k.url
FROM kml k
JOIN kml_kat_vw kat ON kat._kml = k.id
WHERE kat._kategorie = ?;
Pavel V.
  • 2,653
  • 10
  • 43
  • 74