
I do:

$conn->real_connect($host,$user,$pass,$someUTF8Schema);

Then I print what I get from:

$conn->get_charset()

It is:

charset = "latin1" collation = "latin1_swedish_ci"

Even though my schema and all tables in the DB are utf8.

To prove it, I do:

SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM INFORMATION_SCHEMA.SCHEMATA WHERE SCHEMA_NAME = $someUTF8Schema;
->
# DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
'utf8', 'utf8_unicode_ci'

Why is PHP MySQLi returning the wrong collation? Thanks!

jn1kk

1 Answer


There are multiple things that need to be coordinated.

  1. What encoding are the characters you are manipulating in PHP?
  2. Have you told MySQL that encoding via the connection call or set_charset or SET NAMES?
  3. You can fetch that via `get_charset()`.
  4. But the database, table, and column charsets do not have to match it:
  5. When creating a database, you can establish the default charset for new tables in that database.
  6. When creating a table, you can override that default by explicitly specifying the default charset for columns in that table.
  7. When specifying columns, you can override that default by explicitly specifying the charset for that column.
  8. When INSERTing, the bytes being inserted are interpreted according to step 2, then converted to the column's charset (#7).
  9. Similarly, when SELECTing, the characters are re-converted (#7->#2) if necessary.
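
Steps 2 and 3 can be sketched roughly as follows. This is only a minimal sketch: the helper name `connect_with_charset` is hypothetical, the credentials are placeholders, and it assumes the connection charset was not already configured elsewhere (for example in the server or client defaults). The point is that the connection charset is set on the connection itself and is not inherited from the schema's default.

```php
<?php
// Hypothetical helper (not part of mysqli): connect, then declare the
// encoding of the bytes PHP will send and expects to receive (step 2).
function connect_with_charset(
    string $host,
    string $user,
    string $pass,
    string $schema,
    string $charset = 'utf8'
): mysqli {
    $conn = mysqli_init();
    if (!$conn->real_connect($host, $user, $pass, $schema)) {
        throw new RuntimeException('Connect failed: ' . mysqli_connect_error());
    }
    // Without this call the connection keeps its default charset,
    // regardless of the schema's DEFAULT_CHARACTER_SET_NAME.
    if (!$conn->set_charset($charset)) {
        throw new RuntimeException('set_charset failed: ' . $conn->error);
    }
    return $conn;
}

// Usage (needs a reachable MySQL server; variables are placeholders):
// $conn = connect_with_charset($host, $user, $pass, $someUTF8Schema);
// $info = $conn->get_charset();   // step 3: inspect the connection charset
// echo $info->charset, ' / ', $info->collation, PHP_EOL;
```

After `set_charset()` succeeds, `get_charset()` should report the connection charset that was requested rather than the default the connection started with.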

Note that I did not mention COLLATION, only CHARACTER SET. They are different animals. A COLLATION defines how characters in a given CHARACTER SET are compared and sorted.

See the Best Practice section of "Trouble with utf8 characters; what I see is not what I stored".

Rick James