I've done a few PHP/MySQL-based web projects, and sooner or later I always run into charset issues.
I live in Spain, and we have some special characters over here: ç ñ á é í ó and ú.
There are so many variables involved that the charset always gets messed up somewhere:
- MySQL database collation
- PHP/HTML headers
- Web browser encoding settings
- PHP settings
- Apache settings
What I would like is a basic guideline on how to set everything up so that I don't have issues with these Spanish characters.
There are three ways I populate my HTML output:
- I query the MySQL database with PHP and echo the output.
- I write some words directly in HTML, for example:
<p>Qué rábanos pasaría mañana</p>
- I read a labels.ini file with parse_ini_file($file). The label file looks something like this:
SORTING_ENTITY = Línea de negocio
SORTING_PLURAL = Líneas de negocio
MAIN_ENTITY = Instalación
MAIN_PLURAL = Instalaciones
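To narrow down whether the .ini file itself is the culprit, a check along these lines might help. This is only a minimal sketch: it assumes the mbstring extension is available and that the target encoding is UTF-8, and findNonUtf8Labels() is a hypothetical helper name I made up, not something from my project. The sample file is written to a temp location (with quoted values, to be on the safe side) so the snippet runs on its own.

```php
<?php
// Hypothetical helper: returns the keys of labels whose values are not valid UTF-8.
function findNonUtf8Labels(array $labels): array
{
    $bad = [];
    foreach ($labels as $key => $value) {
        if (!mb_check_encoding((string) $value, 'UTF-8')) {
            $bad[] = $key;
        }
    }
    return $bad;
}

// Write a small sample file so the sketch is self-contained.
// Assumption: this PHP source file is itself saved as UTF-8.
$file = tempnam(sys_get_temp_dir(), 'labels');
file_put_contents(
    $file,
    "MAIN_ENTITY = \"Instalación\"\nMAIN_PLURAL = \"Instalaciones\"\n"
);

$labels = parse_ini_file($file);
if ($labels !== false) {
    // An empty list means every label value is valid UTF-8.
    var_dump(findNonUtf8Labels($labels));
}
unlink($file);
```

If the real labels.ini were saved in a different encoding (for example ISO-8859-1), I would expect the accented labels to show up in the returned list.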
So when I view the website, sometimes the text generated from MySQL is messed up, other times the direct HTML is messed up, and other times everything is okay except the content coming from the .ini file.
Also, I sometimes use web forms so that users can input data that is saved in MySQL. A user types, for example, "Pájaro" in the web form, and incorrect characters end up stored in the database, like "P}jaro" or something similar.
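I can't say this is exactly what happens in my setup, but as far as I understand, one way this kind of garbling can arise is when UTF-8 bytes are treated as ISO-8859-1 somewhere along the way. The following is a minimal sketch of that mismatch, assuming the mbstring extension is available and the source file is saved as UTF-8. The variable names are just for illustration.

```php
<?php
// Assumption: this source file is saved as UTF-8, so "á" is the two bytes 0xC3 0xA1.
$word = "Pájaro";

// Pretend the UTF-8 bytes are ISO-8859-1 and convert them to UTF-8 again.
// Each byte of "á" becomes its own character, which gives the double-encoded form.
$garbled = mb_convert_encoding($word, 'UTF-8', 'ISO-8859-1');

echo $garbled, "\n";           // PÃ¡jaro
echo strlen($word), "\n";      // 7 (bytes)
echo strlen($garbled), "\n";   // 9 (bytes)
```

The garbled form here is not the same as what I see in my database, so the actual mismatch in my case may involve a different pair of encodings.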
I would like some guidelines so that everything is set up in such a way that whatever I write in direct HTML or in the .ini file is shown correctly on the website, and whatever users write in a web form is stored correctly and displayed the same way when I later read the data and echo it with PHP.
I don't want to be using stuff like:
&aacute;
&ntilde;
echo utf8_encode($dat);