I'm trying to store text data in a MySQL table and, later on, in SQL Server. The text contains special characters such as ⓜ. While the text is still in PHP everything is fine, but as soon as I call utf8_encode($teststring); the character is lost and shows up as â.

The only workaround I found was to decode the string again when reading it back from the database, but this is very inefficient when handling a lot of data. Is there a better way?

I have already tried utf8mb4 encoding and many other answers found on the internet.

Example:

$test = "some string containing ⓜ";
print_r($test."<br>");
$test = utf8_encode($test);
print_r($test."<br>");
$test = utf8_decode($test);
print_r($test."<br>");

Result:

some string containing ⓜ
some string containing â
some string containing ⓜ

If I do not use utf8_encode(), I get an error from the database: General error: 1366 Incorrect string value

Akorna
  • What character set is the table you are trying to store into? With special characters it always helped me to have EVERYTHING in the same encoding – Pavel Janicek Dec 20 '16 at 13:23
  • Did you read [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through)? – simon Dec 20 '16 at 13:25
  • I put everything in UTF-8 because I was told (by the internet) that it was the best option for storing characters like these in plain text. But the thing is, the character is already removed before it is stored in the database :/ – Akorna Dec 20 '16 at 13:25
  • @simon I've been through it and can't imagine I missed one of the required steps. As shown in the code snippet above, the problem is already present when encoding to UTF-8 in PHP – Akorna Dec 20 '16 at 13:29
  • What is the character set of your MySQL column? Is your SQL Server column an `NVARCHAR` column? When you insert into SQL Server are you doing `N'text'` and not just `'text'`? – O. Jones Dec 20 '16 at 13:38
  • As said before, the character is already lost before entering it anywhere in the database. I don't think the issue lies with anything above. But they are set to `nvarchar(MAX)` – Akorna Dec 20 '16 at 13:57
  • `utf8_encode()` encodes from ISO-8859-1, a single byte encoding that can by no means encode most Unicode characters. That doesn't have the least chance of working. – Álvaro González Dec 20 '16 at 15:23
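
The point in the last comment can be checked at the byte level. The following is a minimal sketch, assuming the mbstring extension is loaded; it uses `mb_convert_encoding()` with ISO-8859-1 as the source encoding, which performs the same conversion as `utf8_encode()` (deprecated since PHP 8.2), and the variable names are purely illustrative.

```php
<?php
// "ⓜ" is U+24DC; the escape keeps this sketch independent of the file's own encoding.
$original = "\u{24DC}";

// Same conversion utf8_encode() performs: every input byte is treated as
// one ISO-8859-1 character and re-encoded as UTF-8.
$doubleEncoded = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Reverse step, equivalent to utf8_decode().
$roundTrip = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');

echo bin2hex($original), "\n";      // e2939c
echo bin2hex($doubleEncoded), "\n"; // c3a2c293c29c
echo bin2hex($roundTrip), "\n";     // e2939c
```

The string was already UTF-8 (three bytes), so the conversion turns each byte into its own character: `c3a2` is "â", and the other two become invisible control characters. That matches the "â" seen in the question's output, and explains why decoding appears to bring the character back.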

1 Answer

Someone in the company made a mistake: one stray file wasn't encoded the way it should have been, so basically the text wasn't in UTF-8. Sorry for taking everyone's time.

Akorna
  • For whoever googles here: if your entire toolchain uses UTF-8 you don't need any manipulation at all, esp. stuff like `utf8_encode()` that everybody uses incorrectly. – Álvaro González Dec 21 '16 at 15:07
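
When one input file may not be UTF-8, as turned out to be the case here, one option is to validate the text and convert only what fails the check. This is a rough sketch, not a definitive fix: the helper name `toUtf8` and the ISO-8859-1 fallback are assumptions, and it relies on the mbstring extension.

```php
<?php
// Hypothetical helper: the name and the ISO-8859-1 default are assumptions.
function toUtf8(string $input, string $fallback = 'ISO-8859-1'): string
{
    // Valid UTF-8 is returned untouched, so it is never double-encoded.
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }
    // Otherwise assume the legacy encoding and convert exactly once.
    return mb_convert_encoding($input, 'UTF-8', $fallback);
}

$alreadyUtf8 = "\u{24DC}"; // "ⓜ" as UTF-8
$legacy = "caf\xE9";       // "café" as ISO-8859-1 bytes

echo bin2hex(toUtf8($alreadyUtf8)), "\n"; // e2939c
echo bin2hex(toUtf8($legacy)), "\n";      // 636166c3a9
```

The check is only a heuristic, since some legacy byte sequences happen to be valid UTF-8 as well. Fixing the encoding of the source file itself remains the more reliable route.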