I was recently given the task of migrating an old website from latin1 to utf8 and ran into some trouble. I managed to alter the database, tables, and columns correctly (I think), but special characters are shown on the website as '??' unless I run mysql_query("SET NAMES 'UTF8'") before every query. With that query in place, all of the data from the database is printed correctly; without it, all hell breaks loose.

The default MariaDB server charset is latin1. Can this be the issue? If so, how would changing the server charset to utf8 affect other, older tables? Perhaps I altered my database incorrectly.

I am asking for advice because I believe that running mysql_query("SET NAMES 'UTF8'") before every query may be slowing down our website's response time. Perhaps there are other issues. We are using PHP 5.6 (migrated from an older version).

I know that mysql_query is an outdated function and should never be used, but my task was migrating, not messing with the code.

Thanks for the help.

mariusslo
  • How do you open the connection to the database? Consider switching to mysqli and using a function that not only opens the connection but also sets the character set automatically, like `mysql_set_charset` – Refugnic Eternium Nov 27 '20 at 09:55
  • The site is using mysql_connect. Actually, I think you set us in the right direction: after the connection I ran the SET NAMES query, and for now the characters seem OK. I will talk to my co-worker about implementing mysqli in the future. I understand now that if we changed the server charset to utf8, all other databases that are still in latin1 would have problems. Is that correct? – mariusslo Nov 27 '20 at 11:22
  • There is always a risk of something breaking in these scenarios. Generally, however, as long as you only use this particular connection (with this particular connection encoding) for the updated database, you should be fine. – Refugnic Eternium Nov 27 '20 at 11:24
  • Does this answer your question? [UTF-8 all the way through](https://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – Dharman Nov 27 '20 at 17:23
  • Is there any reason why you are using `utf8` and not the full Unicode `utf8mb4`? – Dharman Nov 27 '20 at 17:24

1 Answer

Do not use the mysql_* interface in PHP. Switch to either mysqli_* or PDO. (The rest of what I am about to say does apply to mysql_*.) PHP 7 removed mysql_*; you are one step away from being forced to do the conversion.

The cause of `??` is discussed in "Trouble with UTF-8 characters; what I see is not what I stored".

SET NAMES sets three of the character_set_* variables (character_set_client, character_set_connection, and character_set_results). They tell the server which encoding the client is using. So, make sure the encoding in your PHP code is really UTF-8 or is really latin1/cp1252. Those encodings are different.
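To make "those encodings are different" concrete, here is a small byte-level sketch. It uses Python only because it runs without a database; the sample character is my own choice, and nothing here is specific to PHP or MariaDB.

```python
# Illustration only: one character, two different byte sequences.
text = "é"

latin1_bytes = text.encode("cp1252")  # MySQL's latin1 corresponds to cp1252
utf8_bytes = text.encode("utf-8")

print(latin1_bytes.hex())  # e9    (one byte)
print(utf8_bytes.hex())    # c3a9  (two bytes)

# Reading UTF-8 bytes as if they were cp1252 produces mojibake.
misread = utf8_bytes.decode("cp1252")
print(misread)  # Ã©
```

If the client announces one encoding but actually sends bytes in the other, the server has no way to notice, which is why the declared and the real encoding have to agree.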

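The "server" charset is probably irrelevant: it mainly serves as the default for newly created databases, and existing tables and columns keep the charset they were declared with.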

During INSERT or SELECT, MySQL will transcode between the encoding specified for the table's column and the encoding specified by SET NAMES.
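A rough way to picture that transcoding step, and one way question marks can arise, is the sketch below. The function `transcode_for_client` is a hypothetical stand-in written for this illustration, not a MySQL or PHP API; it assumes that characters with no equivalent in the target charset are replaced by `?`.

```python
def transcode_for_client(column_text: str, client_charset: str) -> bytes:
    """Hypothetical model of converting column data to the connection charset.

    Characters that the target charset cannot represent become '?'.
    """
    return column_text.encode(client_charset, errors="replace")


stored = "naïve Я"  # sample text held in a UTF-8 column (made-up data)

# Connection declared as latin1 (cp1252): 'ï' fits, the Cyrillic letter does not.
print(transcode_for_client(stored, "cp1252"))  # b'na\xefve ?'

# Connection declared as UTF-8: nothing is lost.
print(transcode_for_client(stored, "utf-8").decode("utf-8"))  # naïve Я
```

Once a character has been turned into `?`, the original is gone from that result, so the fix has to happen at the connection setting rather than afterwards in the page.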

In older versions, utf8mb4 was not available. Even in MySQL 5.6, switching from utf8 to utf8mb4 may have some hiccups -- mostly in index sizes. They can be worked around.
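The index-size hiccup is mostly arithmetic. The sketch below assumes the 767-byte index prefix limit that applies to older InnoDB row formats; your server may have a larger limit, so treat the constant as an assumption and check your own configuration.

```python
# Assumption: 767-byte index prefix limit (older InnoDB row formats).
INDEX_PREFIX_LIMIT_BYTES = 767


def max_indexable_chars(bytes_per_char: int,
                        limit: int = INDEX_PREFIX_LIMIT_BYTES) -> int:
    """Longest fully indexable string column, in characters, under the limit."""
    return limit // bytes_per_char


print(max_indexable_chars(3))  # utf8 (up to 3 bytes per character): 255
print(max_indexable_chars(4))  # utf8mb4 (up to 4 bytes per character): 191
```

Under that assumption, an indexed VARCHAR(255) column fits with utf8 but not with utf8mb4, and the usual workarounds are shortening the column to 191 characters or indexing only a prefix.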

Rick James
  • Side point. In PHP you really should set the charset via the PHP methods instead of `SET NAMES`. This is particularly important if you are using emulated prepares or if you are manually formatting strings for use in SQL. – Dharman Nov 29 '20 at 21:05