Since I started working with Apache, PHP/HTML/CSS/JS and databases such as MySQL, I have always had to deal with Latin charset problems.

I mean, from the database all the way to the views displayed in the browser, there are many points at which the charset can be altered, and I have never managed to work without problems, except with a WAMP/XAMPP installation on localhost under Windows.

I always try to set the charset/encoding/collation everywhere I can, but I get different results depending on the server (Linux, Windows).

I always check:

1) Database collation.

2) Table collation.

3) Column collation.

4) Apache charset configuration in httpd.conf

5) PHP charset configuration in php.ini

6) Individual php/html file encoding.

7) Addition of meta tag in every html with same encoding/charset.

8) Addition of "header(... charset...)" with same encoding/charset;

But in many circumstances I keep getting unrecognized characters instead of accents and special Latin symbols. Sometimes it displays fine on screen, but I find problems in generated CSV/Excel files, or in a JSON response from a web service, etc.

I checked other posts like this one: PHP Character encoding problems

I try to follow the tips, but I can't fix the problem in certain cases.

Tired of trying lots of things, I always end up using functions such as utf8_encode/utf8_decode or iconv(...). Sometimes I even get the desired result only by nesting utf8_encode and utf8_decode, one inside the other. Horrible.

Is there a tidy and easy way to resolve this without having to use those scary functions that leave the code untidy? Is the operating system an issue? I tend to see more charset problems when running systems on Linux servers.

  • Don't use the latin character set. Use utf8. – user3783243 May 24 '18 at 14:28
  • You keep mentioning that you have issues, but you don't really tell us exactly what the issues are. I mean, "encoding problem" can mean a lot of different things and can have equally, or even more, different reasons. You should set _all_ tables and columns as UTF-8. You should also make sure that all your files (php-files etc) are stored in UTF-8. – M. Eriksson May 24 '18 at 14:29
  • Yes, to add to @MagnusEriksson's comment: the files (PHP files etc.) need to be saved without a byte order mark (BOM). And make sure mysqli and PDO are using the utf8 charset; they default to latin1. – Raymond Nijland May 24 '18 at 14:32
  • sounds like you're doing everything correctly. can you give us a sample dodgy json you have a problem with? – delboy1978uk May 24 '18 at 14:38
  • [Handling Unicode Front To Back In A Web App](http://kunststube.net/frontback/) – deceze May 24 '18 at 14:54
  • In a nutshell: you never want to *convert* encodings unless absolutely necessary (usually it is *not* necessary if you're responsible for everything). So you never want to see `utf8_en/decode`, `mb_convert_encoding` or `iconv` anywhere in your code. If you get text in the wrong encoding from some external system (browser, database), there's a way to change that to receive text in the encoding you desire (HTTP headers, meta tags, database connection charset settings). – deceze May 24 '18 at 14:58

1 Answer

It's always a good idea to set all of your encodings to the same value; otherwise you will run into encoding issues.
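To make the mismatch concrete, here is a minimal sketch of what happens when UTF-8 text is mistaken for Latin-1 somewhere along the way and converted again. It assumes the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php

// "é" encoded as UTF-8 is the two bytes C3 A9.
$utf8 = "\xC3\xA9";

// Wrongly treating those bytes as ISO-8859-1 and converting them "to UTF-8"
// encodes each byte on its own: C3 becomes C3 83 and A9 becomes C2 A9.
// In a browser this shows up as "Ã©" instead of "é".
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo bin2hex($utf8), "\n";          // c3a9
echo bin2hex($doubleEncoded), "\n"; // c383c2a9

// Converting back undoes the damage, which is why nested encode/decode
// calls sometimes appear to "fix" the output.
$restored = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');
var_dump($restored === $utf8);      // bool(true)
```

If a conversion like this seems necessary, it usually means one layer (connection, file, or header) is declared with a different encoding than the data it actually carries.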

Database (specifically MySQL)

With MySQL, for example, you're good with utf8 or utf8mb4 (MySQL's utf8 stores at most three bytes per character, so it cannot hold characters such as emoji; utf8mb4 can). Which one fits your application better is covered here: What is the difference between utf8mb4 and utf8 charsets in mysql?. Whichever you choose, set it for every database, table, and column. You shouldn't mix them up.

Set the connection encoding of MySQL to UTF-8. You can do this manually with SET NAMES utf8.

mysqli

<?php

$mysqli = new mysqli("localhost", "my_user", "my_password", "test");
$mysqli->set_charset("utf8");

^1: mysqli_set_charset - PHP Docs

PDO

<?php

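// MYSQL_ATTR_INIT_COMMAND only takes effect when passed to the constructor;
// setting it afterwards with setAttribute() is too late, the connection is already open.
$db = new PDO("mysql:host=localhost;dbname=database", "username", "password", [
    PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8",
]);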

^2: PDO::setAttribute - PHP Docs

Editor

Further, it also depends on the encoding your editor or IDE uses, that is, the encoding the file gets saved in. Most modern IDEs use UTF-8 by default, but some editors like Notepad++ need some extra configuration.
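If you are unsure how a file was actually saved, a small check can help. The following is only a sketch: `inspect_encoding` is a hypothetical helper name, and it assumes the mbstring extension is available. It reports whether the contents are valid UTF-8 and whether they start with a byte order mark (BOM), which one of the comments on the question recommends avoiding.

```php
<?php

/**
 * Hypothetical helper: inspect a string (for example, the contents of a
 * source file) and report whether it is valid UTF-8 and starts with a BOM.
 */
function inspect_encoding(string $contents): array
{
    return [
        // True only if every byte sequence is well-formed UTF-8.
        'valid_utf8' => mb_check_encoding($contents, 'UTF-8'),
        // The UTF-8 BOM is the three bytes EF BB BF at the very start.
        'has_bom'    => strncmp($contents, "\xEF\xBB\xBF", 3) === 0,
    ];
}

// Example: "café" saved as UTF-8 with a leading BOM.
print_r(inspect_encoding("\xEF\xBB\xBFcaf\xC3\xA9"));
```

For a real file you would pass in something like `file_get_contents('index.php')`; a file saved as Latin-1 that contains accented characters will typically come back with `valid_utf8` set to false.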

Templates

Specify the character encoding for the HTML document with the charset meta tag:

<meta charset="UTF-8">
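The meta tag only covers HTML pages. For responses generated by PHP, such as the JSON output mentioned in the question, the charset can also be declared in the HTTP header. Below is a rough sketch, not a definitive implementation: `to_json_utf8` is a hypothetical helper name, and `JSON_THROW_ON_ERROR` assumes PHP 7.3 or later.

```php
<?php

/**
 * Hypothetical helper: encode data as JSON and fail loudly when a string
 * is not valid UTF-8, instead of silently returning false.
 */
function to_json_utf8(array $data): string
{
    // JSON_THROW_ON_ERROR raises JsonException on malformed UTF-8.
    // JSON_UNESCAPED_UNICODE keeps "é" readable instead of "\u00e9".
    return json_encode($data, JSON_THROW_ON_ERROR | JSON_UNESCAPED_UNICODE);
}

// Declare the charset before any output is sent (skipped on the command line).
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    header('Content-Type: application/json; charset=UTF-8');
}

echo to_json_utf8(['name' => "Jos\xC3\xA9"]), "\n";
```

An exception here is a useful signal: it means some string reached this point in an encoding other than UTF-8, and the place to fix that is where the data comes from (connection charset, file encoding), not the output step.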
Dan