0

Currently I have this configuration: Server Windows 2008 R2 Enterprise, Apache 2.4.23, PHP 5.6.25, MySQL 5.7.14

and I'm porting my applications to the a new one with: Server Windows 2019 Datacenter, Apache 2.4.41, PHP 7.4.0, MySQL 8.0.18.

I "copied" some configurations that I know of Apache, PHP, MySQL, headers, in particular those that refer to character encoding like default-charset=iso-8859-1 on php.ini, AddDefaultCharset UTF-8 on httpd.conf, utf8mb4_general_ci collation on MySQL, content="text/html; charset=UTF-8" on headers.

On both I configured with the same parameters an ODBC DSN with iSeries driver pointing to an AS400 db.

To connect with db I use PDO without parameters for charset.

On the actual system I haven't problem with special char like à, °, €: I read from both db, display on html pages. Some applications uses only MySQL, other only ODBC, other both: I get string from AS400 and save on MySQL.

On the new system I'm not able to have the same functionality, because with default_charset=iso-8859-1 only the ODBC applications works well but MySQL's show error like ° as ° or ò as ò, and if I set default_charset=UTF-8 MySQL apps works well and with ODBC ° became �, so I'm not able to get strings from AS400 and save them on MySQL.

My applications are very simple (no framework involved) so PHP code on both systems are the same (I will improve it later).

I think I'm losing some configuration: some suggestion?

Thanks

PS: please let me know if there are other informations that can be useful to know to catch a solution.

EDIT

As punctual solution I can, having default-charset=utf-8, use utf8_encode with strings from ODBC, but before change all the code I will try to find a global solution.

EDIT 2: applications

As requested I give some additional informations on my app, if useful.

  • AS400 db is managed by an ERP, so I use it only in read mode. Some apps send query via PDO and display results (eg Sales).
  • MySQL db is used to store data not managed by ERP (eg Quality). Apps do CRUD operations via PDO using data inserted by users.
  • Mixed apps retrive some data from AS400 (eg Products), some data from users and save on MySQL.
  • "Mirror" tables are created on MySQL to contain views of AS400 tables (for better response). An app periodically get data from AS400 and store them on MySQL.

EDIT 3: on mysql driver/connection/configuration

After doing a test with the same code having an echo '°' and a print_r($mysql_data) with these results:

  • actual: echo=>° print_r=>°
  • new: echo=>° print_r=>°

I am convinced that the "problem" is on the MySQL connection, maybe something I had set and don't remember :-(

mYmage
  • 92
  • 2
  • 14
  • php should be UTF-8 - that's been the default since v5.6, – Nikkorian Aug 19 '20 at 06:39
  • iso-8859-1 is not unicode - it is a 1byte charset. That should not be present any where in your data path if you need unicode capability – Nikkorian Aug 19 '20 at 06:42
  • @Nikkorian I "know" that use of iso-8859-1 is wrong, but with this set on the old system all goes well. Probably it is a combination of "wrong" configuration that result in a worging set but I don't know how. – mYmage Aug 19 '20 at 07:36
  • That's not wrong if your data's in ISO-8859 to start with (à°€ò can all be ISO-8859). Can you be more specific re the architecture of the applications? You have a Web app in php, data from mysql via PDO. What else is going on? What app, what language,does the AS400 - Mysql data transfer? My reading atm: the old system uses all ISO-8859, but now you're passing ISO-8859 strings into an environment expecting UTF-8. Try putting everything (ODBC, MYSQL, PHP) to ISO-8859, then use mb_convert_encoding($data,'UTF-8') as you put data into a web page. The browser will expect UTF8. – Nikkorian Aug 19 '20 at 13:50
  • I just saw your first edit. Frankly that looks like a good solution. ISO-8859 from AS400 thru' ODBC to MYSQL - convert to UTF-8 before storing - then UTF-8 all the way to the web page. Perfect. Assuming your AS400 data is in ISO-8859, that conversion to UTF-8 has to happen somewhere, and these days having MYSQL and PHP in UTF-8 is standard practice. (Actually, MYSQL in UTF8mb4 is preferred because MYSQL's UTF-8 is not a true unicode char set ). – Nikkorian Aug 19 '20 at 13:58
  • @Nikkorian the "problem" is that in actual enviroment, even if not well configured, all goes well without conversion like utf8_encode, and pages (special character) look right whether it reads from AS400 or MySQL. I'm looking for a suggestion to check in the old enviroment that do the "magic" :-) – mYmage Aug 19 '20 at 14:10
  • Your old MYSQL db is probably using Latin1 (tables/columns), now you're using UTF-8. If that's true, the ISO-8859 data would have been OK in the old MYSQL, but not now? So start by comparing the MYSQL – Nikkorian Aug 19 '20 at 14:24
  • Good luck - I have to go - I'm in AUS and it's late! – Nikkorian Aug 19 '20 at 14:30
  • See this for what caused the "Mojibake" and "black diamond": https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Aug 19 '20 at 18:16
  • @Nikkorian I checked connection, db, table and field collation for both Mysql and are the same utf8mb4_unicode_ci. – mYmage Aug 20 '20 at 07:59
  • @RickJames Thanks for reference. I'm not able to check all settings (I don't know how). I used the same code (with an echo '°' and a print_r($mysql_row) ) checking the only thing that I know, default_charset of php, at the beginning and it is the same (iso-8859-1). In the actual server no Mojibake on MySQL data, in the new it appens; on both echo '°' print °. – mYmage Aug 20 '20 at 08:05
  • @mYmage - What do you get from the `SELECT HEX(col)` suggested in that link? `°` is Mojibake for `°`. Latin1 hex for `°` is `B0`; utf8/utf8mb4 is `C2B0`; "double-encoded" is `C382C2B0`. – Rick James Aug 21 '20 at 00:28
  • @RickJames with print_r I got C2B0 on both servers. – mYmage Aug 22 '20 at 07:05
  • @RickJames For the check: The bytes to be stored need to be UTF-8-encoded: when I look with PhpMyAdmin I see °, it is right? The connection when SELECTing text needs to specify utf8 or utf8mb4: how to check this? The column needs to be declared CHARACTER SET utf8 (or utf8mb4): ok. HTML should start with : ok. – mYmage Aug 22 '20 at 07:26
  • @mYmage - Run this in phpmyadmin `SELECT UNHEX('C2B0'), UNHEX('C382C2B0'), UNHEX('B0');` – Rick James Aug 22 '20 at 16:04
  • @RickJames on both system returns °, °, 0xb0 – mYmage Aug 23 '20 at 15:26

1 Answers1

0

I report here the solution in case other will face with the same problem. Thanks to a frend that linked me to a MySQL 8 vs 5.7 difference (https://dev.mysql.com/doc/refman/5.7/en/charset-connection.html#charset-connection-error-handling) on handling charset ("The default collation for utf8mb4 differs between MySQL 5.7 and 8.0"), I replaced the v.8 with v.5.7 and all went well.

Ok, this is not a "solution" in general, but only a confirm that the problem is in the "configuration" of MySQL 8 / PHP interconnection.

mYmage
  • 92
  • 2
  • 14