File encoding (UTF-8 not working properly)

Question

My webpage has a form with multiple inputs. However, the characters inside the input fields render differently from the characters in the input labels. I tried saving the file as UTF-8 and as UTF-8 + BOM (I'm using EditPlus).

Using UTF-8:

[screenshot]

Using UTF-8 + BOM:

[screenshot]

The input characters come from a MySQL database whose collation is utf8_unicode_ci (set through phpMyAdmin), so I don't know whether that is the source of the problem. Any ideas?

Please read http://stackoverflow.com/questions/279170/utf-8-all-the-way-through and see if that fixes your problem. — Danack, Jun 20 '13 at 11:48

score 1 · Answer 1 · edited May 23 '17 at 12:13

This means both pieces of data are not in the same encoding. If the file is interpreted as Latin-1 (or a similar encoding), you get the first result in which the data in the input field is valid (meaning it's Latin-1 encoded) but the label is wrong (meaning it's not Latin-1 encoded). When the file is interpreted as UTF-8, the label is correct (meaning it's UTF-8 encoded) but the data in the input field is wrong (meaning it's not UTF-8 encoded). If data shows up as the � UNICODE REPLACEMENT CHARACTER, it's a sure sign the document is being interpreted as a Unicode encoding (e.g. UTF-8), but the byte sequence is invalid.
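The mismatch described above can be reproduced in a few lines. This is only a minimal sketch: the sample word is hypothetical, and it assumes the label is stored as UTF-8 in the file while the database returns Latin-1 bytes, which is a guess about the asker's setup rather than something the question confirms.

```python
# Hypothetical sample text; any string with non-ASCII characters behaves the same way.
text = "ação"

# Assumption: the hardcoded label is saved as UTF-8,
# while the database connection hands back Latin-1 bytes.
label_bytes = text.encode("utf-8")
data_bytes = text.encode("latin-1")

# Case 1: the page is interpreted as Latin-1.
label_as_latin1 = label_bytes.decode("latin-1")  # label is garbled
data_as_latin1 = data_bytes.decode("latin-1")    # input data looks right

# Case 2: the page is interpreted as UTF-8.
label_as_utf8 = label_bytes.decode("utf-8")                  # label looks right
data_as_utf8 = data_bytes.decode("utf-8", errors="replace")  # invalid bytes become U+FFFD

print(label_as_latin1)  # aÃ§Ã£o
print(data_as_latin1)   # ação
print(label_as_utf8)    # ação
print(data_as_utf8)     # contains the U+FFFD replacement character
```

Neither interpretation can display both strings correctly, which is why the lasting fix is to make both sources use the same encoding rather than to switch how the page is read.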

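I'll guess that the label is hardcoded in the file but the data in the input field comes from a database. In this case you need to set the connection encoding for the database so that it returns UTF-8 (in MySQL, for example, by issuing SET NAMES utf8 or by using the client library's charset option).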

As to why the file is interpreted as Latin-1 without a BOM and as UTF-8 with a BOM: the browser recognizes the BOM as signifying UTF-8, and without it the browser falls back to its default, here Latin-1 (or the closely related Windows-1252). You need to set the correct HTTP header to tell the browser what encoding the file is in, and get rid of the BOM.
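To make the BOM point concrete, here is a small sketch of the byte-level check involved. The helper names `encoding_from_bom` and `strip_utf8_bom` and the sample markup are hypothetical, and the check is far simpler than what a real browser does when it sniffs an encoding.

```python
import codecs
from typing import Optional


def encoding_from_bom(raw: bytes) -> Optional[str]:
    """Return "utf-8" if the bytes start with the UTF-8 BOM, else None."""
    if raw.startswith(codecs.BOM_UTF8):  # the three bytes EF BB BF
        return "utf-8"
    return None


def strip_utf8_bom(raw: bytes) -> bytes:
    """Drop a leading UTF-8 BOM, leaving the rest of the bytes untouched."""
    if raw.startswith(codecs.BOM_UTF8):
        return raw[len(codecs.BOM_UTF8):]
    return raw


# Hypothetical page content saved as UTF-8, without and with a BOM.
page = "<label>ação</label>".encode("utf-8")
page_with_bom = codecs.BOM_UTF8 + page

print(encoding_from_bom(page_with_bom))  # utf-8
print(encoding_from_bom(page))           # None

# Without a BOM, the encoding has to be declared explicitly,
# for example through this Content-Type header value:
CONTENT_TYPE = "text/html; charset=utf-8"
```

Declaring the charset in the header works without the BOM, which is why the advice above is to send the header and remove the BOM.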

Read these resources:

score 0 · Accepted Answer · answered Jun 27 '13 at 11:12


Solved it: I just changed the file encoding to "Western European (Windows) 1252" (using EditPlus) and now every character is shown correctly.


Correia JPV

