I have a php file that is saved with UTF-8 encoding.

The PHP file is on a local Apache Linux server.

The html generated with the php file starts with <!DOCTYPE html> and has <meta charset="UTF-8"> inside the <head> section.

In the root directory of the server there is a .htaccess file with AddCharset UTF-8 .php in it.

In the /etc/apache2/conf-available/charset.conf the line AddDefaultCharset UTF-8 is uncommented. I did sudo service apache2 restart after the modification of charset.conf and after creating the .htaccess file.

The browser still shows the wrong characters. In the text editor the word is diámetro, but Chrome 55 shows diÃ¡metro.

What else should I try?

Greetings from Paraguay.

  • You need to save the html file itself as UTF-8 in your text editor then. Which text editor? I recommend [Notepad++](https://notepad-plus-plus.org/) (it's free). – Martin Jan 01 '17 at 13:47
  • I'm pretty sure the issue is that your file is not being saved as proper UTF-8, so regardless of what the *contents* of that file say (HTML5 meta tag), it's not going to magically turn into UTF-8 if it's not saved as UTF-8. – Martin Jan 01 '17 at 13:50
  • The file is already encoded with UTF-8. I use NetBeans. – Claudio Bogado Pompa Jan 01 '17 at 13:50
  • I uploaded the file in [link](https://www.a1ci.com/articulos/varillas.php). – Claudio Bogado Pompa Jan 01 '17 at 13:56
  • Thanks. I looked at the file (`varillas.php`) and it has the correct HTTP headers as well as the correct HTML5 meta tag. But looking at the source code myself, I see the same thing that is displayed on the page (`diÃ¡` etc), so I still think the issue is NetBeans *saving* the file. Save the file in NetBeans with `Save As` and be sure to select the correct UTF-8 charset. – Martin Jan 01 '17 at 14:03
  • I've updated my answer, with possible solutions and what I believe the problem is. Let me know if this helps you. – Martin Jan 01 '17 at 14:23
  • Err, I've just read your comment above re: UTF-8. How do you know the file is encoded with UTF-8? In my NetBeans (8.1) I can find no info at all about what the file encoding actually is. – Martin Jan 01 '17 at 14:33

2 Answers

Ok, this is actually a pretty interesting issue. From the comments with the OP and from viewing the file itself online, the issue is not with the server or the HTTP headers, which are correctly set to UTF-8.

HTTP HEADER

Content-Type: text/html; charset=UTF-8

HTML5 HEAD

<meta charset="UTF-8">

Looking at the source code of this file the issue appears more and more to be that the program used to write the file (NetBeans) is not saving it with a correct UTF-8 file encoding.

Because the source file is NOT UTF-8, it means any references to HTTP headers or HTML5 Head tags stating to use UTF-8 will not work, because they're trying to convert milk to wine, which you can only do if you start with wine [or at least, grapes]!

So, with some research I've found to my surprise that (as of NetBeans 8.1 / Windows 10) you can't set the character set of an individual code file. NetBeans can set the project character set, but that's a little tricky. There is a guide in this thread (see the answer by Danny), and another here, about how to do this.

There is also a video guide to the above solution here.

Also see some NetBeans bug reports, some of which are 6 years old, which is a far from ideal situation... It appears NetBeans uses your OS default character encoding, which is usually Windows-1252 or similar. These can be less compatible with internet standard encodings (such as UTF-8), as I think you're finding here.

So I can only really say that if this was my choice, I would stop using NetBeans and move over to something more reliable such as Notepad++.

You can open the offending file (varillas.php) in another program (Notepad++), check that the characters displayed are the correct ones (diámetro rather than diÃ¡metro), then Save As a UTF-8 encoded file, re-upload it, and hard refresh your browser (to clear the browser cache) to see if this fixes your issue.
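The "is this file really UTF-8?" check can also be done on the raw bytes rather than by trusting an editor's status bar. Below is a minimal sketch in Python (used here only as a scratchpad; `inspect_bytes` is a hypothetical helper name, not part of any tool mentioned above). It reports whether the bytes decode as UTF-8 and whether they start with a UTF-8 Byte Order Mark.

```python
import codecs


def inspect_bytes(raw: bytes) -> dict:
    """Report whether raw bytes are valid UTF-8 and whether they start with a UTF-8 BOM."""
    has_bom = raw.startswith(codecs.BOM_UTF8)
    try:
        raw.decode("utf-8")  # strict decoding raises on invalid byte sequences
        valid_utf8 = True
    except UnicodeDecodeError:
        valid_utf8 = False
    return {"has_bom": has_bom, "valid_utf8": valid_utf8}


# Sample inputs standing in for file contents (illustrative only).
as_utf8 = "diámetro".encode("utf-8")
as_cp1252 = "diámetro".encode("cp1252")  # what a Windows-1252 save would produce
with_bom = codecs.BOM_UTF8 + as_utf8

print(inspect_bytes(as_utf8))
print(inspect_bytes(as_cp1252))
print(inspect_bytes(with_bom))
```

To run it against a real file, read the file in binary mode and pass the bytes in, for example `inspect_bytes(open("varillas.php", "rb").read())`. A file that passes this check but still renders wrongly points away from the editor and towards something in the code that transforms the text before output.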

Martin
  • The file is UTF-8. I verified the encoding in Notepadqq on Linux and Notepad++ on Windows. I uploaded the PHP file and echoed the contents without the HTML headers, and it shows the characters correctly. – Claudio Bogado Pompa Jan 01 '17 at 15:36
  • Ah ok, it seems odd that the HTML5 head meta tag doesn't seem to work but glad you got it solved `:)` – Martin Jan 01 '17 at 15:56
  • The only other thing I can think of is that the file has a Byte Order Mark, which can cause issues ([read more here](http://stackoverflow.com/a/6080299/3536236)), but the HTML5 meta tag in itself shouldn't be an issue. I wonder if somehow it's conflicting with the `es` declaration in the HTML tag? – Martin Jan 01 '17 at 15:58
  • I already tested with `<html>` without `lang="es"`, with the same result. – Claudio Bogado Pompa Jan 01 '17 at 16:06

There was a utf8_encode call in the code that I had missed, and it was the culprit.
Thank you for your answers.

utf8_encode was used because the HTML body was previously loaded from an ISO-8859-1 source. Then I modified the code to load from a UTF-8 source and forgot to delete the call.
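For anyone wondering why a leftover call produces exactly this garbage: PHP's utf8_encode treats every input byte as an ISO-8859-1 character and re-encodes it as UTF-8. If the input is already UTF-8, each byte of a multi-byte character gets encoded a second time. Here is a rough sketch of that effect in Python (the `utf8_encode` function below is a stand-in written to mimic the PHP function, not the PHP function itself):

```python
def utf8_encode(data: bytes) -> bytes:
    """Mimic PHP's utf8_encode: read each byte as ISO-8859-1, re-encode as UTF-8."""
    return data.decode("latin-1").encode("utf-8")


# Bytes as they come from a source that is already UTF-8.
source = "diámetro".encode("utf-8")

# The leftover call encodes them a second time.
double_encoded = utf8_encode(source)

# What a browser that honours charset=UTF-8 ends up displaying.
shown = double_encoded.decode("utf-8")
print(shown)  # diÃ¡metro

# With the original ISO-8859-1 source the same call was correct.
legacy = "diámetro".encode("latin-1")
print(utf8_encode(legacy).decode("utf-8"))  # diámetro
```

The headers, the meta tag and the file encoding were all fine. The text was simply converted once more than necessary before being sent.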

  • Glad you found it. Could you elaborate a bit more on the `utf8_encode`? I don't recognise it. If it's the PHP function, I'm having trouble seeing where it would fit in the workflow.... – Martin Jan 03 '17 at 19:11