When I try to convert data from latin1_swedish_ci to utf8_unicode_ci, I lose data! The TEXT column is cut off at the first special character.

For example: [image] becomes: [image]

I have tried many ways to convert the column, and every one of them ends up deleting the data from the first special character onward!

I tried with phpMyAdmin and with this SQL query:

UPDATE `page` SET page_text = CONVERT(cast(CONVERT(page_text USING latin1) AS BINARY) USING utf8);

I also tried this PHP script:

https://github.com/nicjansma/mysql-convert-latin1-to-utf8/blob/master/mysql-convert-latin1-to-utf8.php

The result is always the same: data is lost at the first special character!

What should I do?

UPDATE

I could change the data to utf8 with

ALTER TABLE page CONVERT TO CHARACTER SET utf8mb4;

or

ALTER TABLE page CONVERT TO CHARACTER SET utf8;

without losing data, but the special characters are not displayed properly.

Using the PHP function utf8_encode($myvar); does display the special characters correctly.

London Smith

1 Answer

To convert a table, use

ALTER TABLE ... CONVERT TO ...

Or, to change individual columns, use

ALTER TABLE ... MODIFY COLUMN ...

Instead, you seem to have done something different. For further analysis, please provide SELECT col, HEX(col) ... before and after the conversion, plus the conversion used.

See "truncated" in this . The proper fix is found here, but depends on what you see from the HEX.
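The requested before/after dump could be sketched as follows. This is only an illustration, not necessarily the exact query the answer has in mind: `page` and `page_text` are taken from the question, while the `WHERE` filter and the `LIMIT` are assumptions added to keep the output short.

```sql
-- Sketch only: table and column names come from the question;
-- the WHERE filter and LIMIT are assumptions to keep the output small.

-- Show how the column is currently declared (character set and collation).
SHOW CREATE TABLE page;

-- Dump a short chunk as text and as hex, restricted to rows that
-- contain at least one non-ASCII byte (first hex digit of a byte is 8-F).
-- Note: the accent may sit beyond the first 20 characters; widen LEFT() or use MID() if so.
SELECT LEFT(page_text, 20)      AS sample,
       HEX(LEFT(page_text, 20)) AS sample_hex
FROM page
WHERE HEX(page_text) REGEXP '^(..)*[89A-F]'
LIMIT 5;
```

As a reading aid for the hex: an `é` stored as latin1 is the single byte `E9`, stored as UTF-8 it is `C3A9`, and if it has been encoded twice it shows up as `C383C2A9`. Run the same query before and after the conversion and compare the two outputs.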

Rick James
  • Strange, I could change the data to utf8 with `ALTER TABLE page CONVERT TO CHARACTER SET utf8mb4;` without losing data, but the data is not displayed properly, whereas the PHP function `utf8_encode($page["page_text"]);` does display the special characters correctly. – London Smith Jul 27 '19 at 15:48
  • Don't use any encode/decode routines; they only compound the problem. Provide the HEX so we can figure out what is wrong. – Rick James Jul 27 '19 at 18:48
  • Do you mean `select hex(column) from table`? The output of `select hex(page_text) from page` is very, very long. – London Smith Jul 29 '19 at 09:57
  • @LondonSmith - Use `LEFT()` or `MID()` to isolate a chunk that includes some accented text. For example, if `col` has some accents in the first 20 chars, do `SELECT HEX(LEFT(col, 20)) ...` – Rick James Aug 01 '19 at 04:18