0

My data in one of my Database columns is not displaying correctly from UTF8.

I recently converted the MySQL database to UTF8 using:

ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

When displaying data using my PHP script one field appears to not display correctly:

£25

When Im expecting:

£25

Other fields on the same page correctly display £25. The only difference I see is that they are VARCHARs and the problem field is MEDIUMTEXT.

Why does this one field not display correctly (it isn't treated any differently by my script).

Any help ?

PapaLazarou
  • 1,073
  • 2
  • 9
  • 12
  • possible duplicate of [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – Marcus Adams Mar 26 '15 at 18:44

2 Answers2

0

You can't just convert the collation like that and not expect issues. There are many problematic characters that are not converted automatically.

What's happened is you likely had a non-UTF8 friendly character (like those MS Office produces) and when you converted it got mangled. If it's just a few fields, I would open them in an editor and remove the offending data. Otherwise you may have to "scrub" your database of these strange characters using a loop and something like utf8_encode

Machavity
  • 30,841
  • 27
  • 92
  • 100
0

What was the CHARACTER SET of the column before converting?

A3 is the hex for latin1 £.

C2A3 is the hex for utf8 £, but it displays as £ if treated as latin1.

Possibly you only need to declare that your php program is using utf8:

⚈  (deprecated) mysql interface: mysql_set_charset('utf8'); (assuming PHP 5.2.3 & MySQL 5.0.7) Note: "mysql_" interface is deprecated.
⚈  mysqli interface: mysqli_set_charset('utf8') function. See mysqli_set_charset
⚈  PDO interface: set the charset attribute of the PDO dsn.

Please do SELECT HEX(col) FROM tbl WHERE .... If you get C2A3..., then you have correctly converted the table to utf8. If you get C382C2A3..., you have the dreaded "double encoding". I discuss that further in my blog.

Since you seem to have different fields showing different things, you are likely to have different encodings. Beware, things are getting harder and harder to untangle.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Before Character set was "latin1;" Now it is: "DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;" I ran the HEX on the column I get for "£25"= C382C2A33235 So its some kind of double encoding then ? How do I fix it... shall revert my databse back again to Latin 1 ? – PapaLazarou Mar 27 '15 at 00:26
  • Well, that probably means that the data in the table (when it was latin1) was wrong! Can you start over -- that is, load the data fresh? If so, let's first talk about establishing "utf8" end-to-end. If not, we'll need to dig deeper into the various clients, tables, etc. – Rick James Mar 27 '15 at 05:29
  • I solved the problem Im in UTF8 now. There was two problems one was double encoded field and second was the field was being changed with the PHP dom call $html->saveHTML(). – PapaLazarou Mar 29 '15 at 17:07