I'm having a problem with the webpage inconsistently not showing characters correctly for user-generated posts, such as showing ’ instead of ', for example. However, this doesn't happen all the time, some posts from users show the correct apostrophe, but others do not:
I’m having trouble and can’t figure it out
My server is running on PHP 8.1 and MySQL 8.0
Here are some of my config setups:
php.ini:
default_mimetype = "text/html"
default_charset = "UTF-8"
MySQL Variables:
character set clientDocumentation utf8mb4
character set connectionDocumentation utf8mb4
character set databaseDocumentation utf8mb4
character set resultsDocumentation utf8mb4
character set serverDocumentation utf8mb4
character set systemDocumentation utf8mb3
character sets dirDocumentation /usr/share/mysql/charsets/
collation connectionDocumentation utf8mb4_0900_ai_ci
collation databaseDocumentation utf8mb4_0900_ai_ci
collation serverDocumentation utf8mb4_0900_ai_ci
default collation for utf8mb4Documentation utf8mb4_0900_ai_ci
All databases, tables and fields have a collation set to: utf8mb4_0900_ai_ci
My HTML has the following header on all pages:
<meta http-equiv="content-type" content="text/html;charset=utf-8" />
My PHP has the following header on all pages:
header("Content-Type: text/html; charset=utf-8");
Again, these are user-generated posts. I'm not sure why it's happening, most of the users are based in the US, so I don't think it's due to international keyboards. I am hearing that it could be related to Windows or Mac devices, or users copying/pasting text from Word into the form field? Is there supposed to be some kind of PHP conversion required before saving every post to the database? Or is there something I'm supposed to be doing with PHP to display the text correctly? What is the standard practice here?
UPDATE: Added an example of PHP/MySQL code for querying and printing text to the webpage:
$conn = mysqli_connect($hostname, $username, $password, $database) or die("Database Connection Error");
$result = mysqli_query($conn, "SELECT field FROM table WHERE id = 123") or die(mysqli_error($conn));
$row = mysqli_fetch_assoc($result);
$message = htmlspecialchars($row['field'], ENT_QUOTES);
echo $message;
UPDATE: Adding more information found:
mysqli_set_charset($conn,"utf8mb4");
var_dump(mysqli_get_charset($conn));
object(stdClass)#2 (8) {
["charset"]=> string(7) "utf8mb4"
["collation"]=> string(18) "utf8mb4_general_ci"
["dir"]=> string(0) ""
["min_length"]=> int(1)
["max_length"]=> int(4) ["number"]=> int(45)
["state"]=> int(1) ["comment"]=> string(13) "UTF-8 Unicode"
}
It seems PHP is using "utf8mb4_general_ci" as the collation, which is different than the default global variables set in the MySQL config. I tried setting the collation using the following below, but that did not change this:
mysqli_query($conn, 'SET collation_connection = utf8mb4_0900_ai_ci');
mysqli_query($conn, 'SET SESSION collation_connection = utf8mb4_0900_ai_ci');
Also, I discovered another MySQL table: performance_schema.session_variables , with the following:
character_set_client = utf8mb4
character_set_connection = utf8mb4
character_set_database = utf8mb3 // different than globals variable
character_set_filesystem = binary
character_set_results = utf8mb4
character_set_server = utf8mb4
character_set_system = utf8mb3
collation_connection = utf8mb4_unicode_ci // different than globals variable
collation_database = utf8mb3_general_ci // different than globals variable
collation_server = utf8mb4_0900_ai_ci
I think this might be related to the problem. I tried to update these variables, but even the admin user is getting "permission denied". I cannot update these, nor do I know why these are being set differently than the global variables. I even rebooted MySQL and there is no change.