I'm trying to convert a string from this: “Ã©” to this: “é”. It's a Latin-1 character, but I can't get it right. So far I've tried two functions, but neither of them gives me the right output.

$translation = 'CopÃ © rnico was Italian';
$translation = mb_convert_encoding($translation, 'utf-8', 'iso-8859-1'); //opt 1
$translation = iconv('utf-8', 'latin1', $translation); //opt 2

I'm getting this data from an API, so I don't know what's going on in the database. The string in Spanish is: Copérnico es italiano. The data from the API is: CopÃ © rnico is Italian. The result of $translation = bin2hex($translation); is: 436f70c38320c2a920726e69636f206973204974616c69616e

What's the right way to go? Greetings.

Diego
  • Have you tried `utf8_encode()` ? – Naveed Ramzan Apr 10 '18 at 14:16
  • Maybe it's already UTF-8 with bad symbols? – Justinas Apr 10 '18 at 14:16
  • Possible duplicate of [UTF-8 all the way through](https://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – CD001 Apr 10 '18 at 14:20
  • If you see [mojibake](https://en.wikipedia.org/wiki/Mojibake), that means you're not interpreting some bytes using the correct charset. The problem is either that you're merely mis-treating the correct bytes, or that you have messed up data which expresses literally the letters "Ã ©". It's impossible to know which it is with the given information. Start with an `echo bin2hex($translation)` to see what *bytes* the string contains, based on that figure out what charset it is, and then figure out what charset you *want*. – deceze Apr 10 '18 at 14:24
  • @NaveedRamzan Yes and it adds more characters: $translation = utf8_encode($translation); Copà © rnico was Italian – Diego Apr 10 '18 at 14:46
  • @deceze I'm getting this data from an API, so I don't know what's going on in the database. This is the string in Spanish: Copérnico es italiano. This is the data from the API: CopÃ © rnico is Italian. This is the result with $translation = bin2hex($translation); 436f70c38320c2a920726e69636f206973204974616c69616e I can't figure out what the charset is, so any guidance from here would be great. Thanks. – Diego Apr 10 '18 at 14:55
  • Put that information into your question. – deceze Apr 10 '18 at 14:58
  • I punched re-open but I don't think it's going to go all the way to getting reopened. You need to track down what wrote it wrong and why before you try to kludge-fix it here. Only on knowing the exact incorrect transform can a reversal be given. – Joshua Apr 10 '18 at 17:11

1 Answer

I had the same problem before and this option

$translation = iconv('utf-8', 'latin1', $translation); //opt 2
works very well. Your problem is that `CopÃ © rnico was Italian` is not the same as `CopÃ©rnico was Italian`, the unspaced form that converts cleanly to `Copérnico was Italian`.
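To see the difference at the byte level, here is a small sketch that compares the two strings. The hex string is the `bin2hex` output quoted in the question (which reads "is Italian" rather than "was Italian"); the variable names are only illustrative, and the `\u{...}` escapes assume PHP 7 or later.

```php
<?php
// Bytes reported in the question (bin2hex output of the API data).
$fromApi = hex2bin('436f70c38320c2a920726e69636f206973204974616c69616e');

// The same text with the usual, unspaced mojibake "Ã©" (U+00C3, U+00A9).
$unspaced = "Cop\u{00C3}\u{00A9}rnico is Italian";

// Bytes right after "Cop" in each string.
echo bin2hex(substr($fromApi, 3, 6)), "\n";  // "Ã", space, "©", space
echo bin2hex(substr($unspaced, 3, 4)), "\n"; // "Ã", "©"
```

The API data carries a `20` (space) byte after each of the two characters, so the pair never sits next to each other the way it does in the unspaced string.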

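So when you try to convert, iconv sees two separate characters, "Ã" and "©", each followed by a space, and turns them into the lone bytes 0xC3 and 0xA9, which are not valid UTF-8 on their own. "Ã © " (2 characters and 2 spaces) is not the same as "Ã©", which converts to "é" (1 valid UTF-8 character).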
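If the upstream data cannot be fixed, one possible workaround is to rejoin the split pair before converting. This is only a sketch under a strong assumption: that the API inserts exactly one space after each half of a two-character mojibake pair, as in the hex dump from the question. `repairSpacedMojibake` is a hypothetical helper, not a built-in function.

```php
<?php
// Hypothetical helper: undoes the specific damage seen in the question.
function repairSpacedMojibake(string $text): string
{
    // Assumption: each mojibake pair arrives as "X Y " (a space after each
    // half). X is in U+00C2..U+00DF and Y in U+0080..U+00BF, i.e. the two
    // bytes of a 2-byte UTF-8 sequence read as Latin-1 characters.
    $pattern = '/([\x{00C2}-\x{00DF}]) ([\x{0080}-\x{00BF}]) /u';
    $joined = preg_replace($pattern, '$1$2', $text) ?? $text;

    // With the pair rejoined, "utf-8 to latin1" yields the bytes of the
    // intended UTF-8 text (this is option 2 from the question).
    return iconv('UTF-8', 'ISO-8859-1', $joined);
}

// Bytes reported in the question.
$fromApi = hex2bin('436f70c38320c2a920726e69636f206973204974616c69616e');
echo repairSpacedMojibake($fromApi), "\n";
```

This only covers two-byte sequences such as accented Latin letters, it would also swallow a legitimate space that happens to follow such a pair, and `iconv` will fail if the text contains characters outside Latin-1. If the original mis-decoding used Windows-1252 rather than ISO-8859-1, the target charset would need to change as well. Fixing whatever writes the data incorrectly remains the better option.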