0

I'm building a small website using PHP and MySQL.

I have a problem reading from a database. I can do all operations on DB, but when I SELECT from DB, I get some characters invalid or in wrong representation.

I get:

Koriste?i kriminalisti?ku pri?u Dostojevski je sastavio roman kojim je predo?io svu slo�enost u odnosu pojedinca prema postoje?im zakonima.

but in a database I see:

Koristeći kriminalističku priču Dostojevski je sastavio roman kojim je predočio svu složenost u odnosu pojedinca prema postojećim zakonima.

The problem is with Croatian letters: Č Ć Ž Đ Š. I think the problem is with collation of a database.

Any idea ?

Dinko Pehar
  • 5,454
  • 4
  • 23
  • 57

1 Answers1

2

Non-English characters are a headache, but in the past few years, dealing with them has been made easier. First things first, make sure your varchar and text column collations in your database are set to utf8mb4_unicode_ci.

You'll then need to make sure that when you query your database, you're querying it in UTF8. Run the following query after connecting to the mySQL server from PHP, but before running any other query:

SET NAMES utf8mb4;

Finally, make sure the page you're displaying the text is also UTF8 ready. Most browsers default to it nowadays, but just to be on the safe side, set the following tag in your HTML header:

<meta charset="utf-8">

If you're using mysqli, perhaps the below snippet is helpful:

# Set up the mysqli driver
$driver = new mysqli_driver();

# Report all errors
$driver->report_mode = MYSQLI_REPORT_STRICT | MYSQLI_REPORT_ERROR;

# Try connecting
try {
    $mysqli = new mysqli($_ENV['db_servername'],$_ENV['db_username'], $_ENV['db_password'], $_ENV['db_database']);
}
catch(mysqli_sql_exception $e) {
    if ($mysqli->connect_error) {
        die("[Error #{$mysqli->connect_errno}] {$mysqli->connect_error}");
    } else {
        die($e->getMessage());
    }
}

# Ensure everything is UTF8mb4
$mysqli->set_charset('utf8mb4');

# Ensure PHP and mySQL time zones are in sync
$offset = (new \DateTime())->format("P");
$mysqli->query("SET time_zone='$offset';");
dearsina
  • 4,774
  • 2
  • 28
  • 34
  • It worked when adding `$conn->set_charset('utf8mb4');` below the `$conn = new mysqli($servername, $username, $password, $dbname);`. Thank you. I tried changing to different collations when creating database using MySQL workbench but it didn't help. – Dinko Pehar Jan 22 '19 at 12:40
  • 1
    The DB collation is not as important as the column collations. – dearsina Jan 22 '19 at 12:44
  • @DinkoPehar - `utf8mb4` is a "character set", not a "collation". And "utf8" should have worked equally well _for Croatian_. – Rick James Jan 26 '19 at 01:09
  • @RickJames I tried changing collation to man other types and it didn't work. When I set charset through php it worked. I didn't know how to ask a question since letters worked in plain html, but I couldn't read it properly from a database. – Dinko Pehar Jan 26 '19 at 11:18
  • @DinkoPehar There are about 4 places where you need to specify utf8mb4 -- PHP, Columns, HTML, `
    `, etc. When talking to MySQL, use `utf8mb4` (or `utf8`); when talking to other products, use `UTF-8` (with a dash).
    – Rick James Jan 26 '19 at 16:48