46

I don't see anything illegal - any suggestions on what might be the problem?

    if (strtolower($matches[1]) != 'utf-8') {
        var_dump($matches[1]);
        $xml = iconv($matches[1], 'utf-8', $xml);
        $xml = str_replace('encoding="'.$matches[1].'"', 'encoding="utf-8"', $xml);
    }

Below is my debug output and the error:

string(12) "windows-1252"
Notice (8): iconv() [http://php.net/function.iconv]: Detected an illegal character in input string [APP/models/sob_form.php, line 16]

I've verified that the iconv() call above is indeed on line 16.

Ben

7 Answers

54

If you use the accepted answer (`//TRANSLIT`), you will still receive the PHP notice whenever a character in your input string cannot be transliterated:

<?php
$cp1252 = '';

for ($i = 128; $i < 256; $i++) {
    $cp1252 .= chr($i);
}

echo iconv("cp1252", "utf-8//TRANSLIT", $cp1252);

PHP Notice:  iconv(): Detected an illegal character in input string in CP1252.php on line 8

Notice: iconv(): Detected an illegal character in input string in CP1252.php on line 8

So you should use `//IGNORE`, which drops whatever cannot be transliterated:

echo iconv("cp1252", "utf-8//IGNORE", $cp1252);
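
Since comments under this answer report that `//IGNORE` does not silence the notice everywhere, it may be safer to check the return value of iconv() rather than rely on the flag alone. A minimal sketch follows; the helper name `convertToUtf8OrNull` is hypothetical, and how `//IGNORE` behaves depends on the iconv implementation PHP was built against (a glibc-based build is assumed here).

```php
<?php
/**
 * Hypothetical helper (not from the original answer): convert $input to
 * UTF-8 and return null instead of false when the conversion fails.
 */
function convertToUtf8OrNull(string $input, string $fromCharset): ?string
{
    // "@" suppresses the notice; the failure is detected from the return
    // value instead. iconv() returns false when it cannot convert.
    $converted = @iconv($fromCharset, 'UTF-8//IGNORE', $input);

    return $converted === false ? null : $converted;
}

// Usage: 0xE9 is "é" in CP1252.
$result = convertToUtf8OrNull("caf\xE9", 'CP1252');
if ($result === null) {
    echo "conversion failed\n";
} else {
    echo $result, "\n";
}
```

Returning null keeps a failed conversion distinguishable from a legitimately empty string.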
NobleUplift
  • 8
    I get the same notice even when I put "//IGNORE" on both sides – Erel Segal-Halevi Dec 31 '13 at 02:59
  • 5
    What do you mean on both sides? – NobleUplift Jan 02 '14 at 14:50
  • And @ErelSegal-Halevi, I would like to see your code. – NobleUplift Apr 29 '15 at 20:36
  • @Mantas But Erel was replying to the `//IGNORE` text in my answer, which is why I was confused by your praising of him. – NobleUplift Apr 30 '15 at 15:24
  • It was a long time ago, but from what I remember, my code was something like: echo iconv("cp1252//IGNORE", "utf-8//IGNORE", $cp1252); – Erel Segal-Halevi May 03 '15 at 05:19
  • That might explain it. You can't add flags to the in_charset of iconv, but you're right; this question is pretty old lol. Good thing I love necroposts. – NobleUplift May 04 '15 at 16:46
  • In my case, using `//IGNORE` seems to delete the entire string? The hex values of the string: `4f 62 65 72 6b 72 c3 a4 6d 65 72` (= "Oberkrämer"), which becomes empty if I use `iconv(mb_detect_encoding($string), 'ISO-8859-1//TRANSLIT', ($string));` – Juha Untinen Jun 13 '16 at 09:06
  • @JuhaUntinen Can you debug/output the result of `mb_detect_encoding($string)`? And you say `//TRANSLIT` is resulting in an empty string, or `//IGNORE`? You also don't need to put the second instance of `$string` in parentheses. – NobleUplift Jun 13 '16 at 16:39
38

The illegal character is not in $matches[1], but in $xml.

Try:

iconv($matches[1], 'utf-8//TRANSLIT', $xml);

Showing us the input string would also help us give a better answer.
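
One way to find the offending bytes in $xml, as a rough sketch: Windows-1252 assigns no character to the bytes 0x81, 0x8D, 0x8F, 0x90 and 0x9D, so an iconv implementation that follows the code page strictly has nothing to map them to. That these bytes are the cause in the asker's case is an assumption; the helper name `findUndefinedCp1252Bytes` is made up for illustration.

```php
<?php
/**
 * Hypothetical helper: return the byte offsets of all bytes in $input
 * that have no character assigned in Windows-1252.
 */
function findUndefinedCp1252Bytes(string $input): array
{
    // No /u modifier: the pattern is matched byte by byte.
    preg_match_all('/[\x81\x8D\x8F\x90\x9D]/', $input, $matches, PREG_OFFSET_CAPTURE);

    // Each match is [matchedByte, offset]; keep only the offsets.
    return array_column($matches[0], 1);
}

// Usage: report the positions to inspect in the input string.
$offsets = findUndefinedCp1252Bytes("a\x81b\x9Dc");
echo implode(', ', $offsets), "\n";
```

An empty result means the input contains none of these five bytes, so the problem would have to be elsewhere (for example, the declared encoding not matching the actual one).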

Ranty
19

BE VERY CAREFUL: the problem may come from a multibyte encoding combined with PHP functions that are not multibyte-aware.

It was the case for me and it took me a while to figure it out.

For example, I get a string from MySQL using utf8mb4 (very common now, since it is needed to store emojis):

$formattedString = strtolower($stringFromMysql);
$strCleaned = iconv('UTF-8', 'utf-8//TRANSLIT', $formattedString); // WILL RETURN THE ERROR 'Detected an illegal character in input string'

In this case the problem lies not in iconv() but in strtolower().

The appropriate fix is to use the multibyte string function mb_strtolower() instead of strtolower():

$formattedString = mb_strtolower($stringFromMysql);
$strCleaned = iconv('UTF-8', 'utf-8//TRANSLIT', $formattedString); // WORKS FINE
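
To catch this kind of corruption before it reaches iconv(), the string can be validated right after the case conversion. A minimal sketch, assuming the mbstring extension is loaded; the function name `lowerUtf8Checked` is hypothetical.

```php
<?php
/**
 * Hypothetical helper: lowercase a UTF-8 string with mb_strtolower() and
 * fail loudly if the result is not valid UTF-8.
 */
function lowerUtf8Checked(string $input): string
{
    $lowered = mb_strtolower($input, 'UTF-8');

    // mb_check_encoding() returns false for malformed byte sequences,
    // which is what iconv() would later reject as an illegal character.
    if (!mb_check_encoding($lowered, 'UTF-8')) {
        throw new InvalidArgumentException('String is not valid UTF-8');
    }

    return $lowered;
}

// Usage: "\xC3\x89COLE" is "ÉCOLE" encoded as UTF-8.
echo lowerUtf8Checked("\xC3\x89COLE"), "\n";
```

Passing the encoding explicitly avoids depending on the configured internal encoding.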

MORE INFO

More examples of this issue are available at this SO answer

PHP Manual on the Multibyte String

Mike Casan Ballester
16

PHP 7.2

iconv('UTF-8', 'ASCII//TRANSLIT', 'é@ùµ$`à');
// "e@uu$`a"

iconv('UTF-8', 'ASCII//IGNORE', 'é@ùµ$`à');
// "@$`"

iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', 'é@ùµ$`à');
// "e@uu$`a"

PHP 7.4

iconv('UTF-8', 'ASCII//TRANSLIT', 'é@ùµ$`à');
// PHP Notice:  iconv(): Detected an illegal character

iconv('UTF-8', 'ASCII//IGNORE', 'é@ùµ$`à');
// "@$`"

iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', 'é@ùµ$`à');
// "e@u$`a"

iconv('UTF-8', 'ASCII//TRANSLIT//IGNORE', Transliterator::create('Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC')->transliterate('é@ùµ$`à'));
// "e@uu$`a" -> same as PHP 7.2
atmacola
2

I found one solution:

echo iconv('UTF-8', 'ASCII//TRANSLIT', utf8_encode($string));

Use utf8_encode(), which converts an ISO-8859-1 string to UTF-8.
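
On PHP versions where utf8_encode() is deprecated, the same ISO-8859-1 to UTF-8 step can be written with mb_convert_encoding(). A short sketch, assuming the mbstring extension is available and that the input really is ISO-8859-1; the wrapper name `latin1ToUtf8` is hypothetical.

```php
<?php
/**
 * Hypothetical wrapper: convert an ISO-8859-1 string to UTF-8 without
 * utf8_encode().
 */
function latin1ToUtf8(string $input): string
{
    // Argument order is (string, to_encoding, from_encoding).
    return mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
}

// Usage: 0xE9 is "é" in ISO-8859-1.
echo latin1ToUtf8("caf\xE9"), "\n";
```

Like utf8_encode(), this double-encodes a string that is already UTF-8, so it only helps when the source encoding is known.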

Irshad Khan
  • `utf8_encode` is obsolete in PHP 8.2. https://www.php.net/manual/function.utf8-encode.php – tcit Nov 09 '22 at 07:52
1

The solution below worked for me:

$result_encr="##Sƒ";

iconv("cp1252", "utf-8//IGNORE", $result_encr);
1

I had the same error with files containing "é" characters, saved in ASCII format by Notepad++:

$row_tmp[$index]=iconv("ASCII", "UTF-8//TRANSLIT",$value);

Using

$row_tmp[$index]=iconv("ISO-8859-1", "UTF-8//TRANSLIT",$value);

...seems to solve it (using a Windows computer with a Belgian locale). iconv apparently needs the exact name of the extended-ASCII code page as the source encoding. mb_detect_encoding returns "ASCII" for the same file, so be careful if the source encoding parameter comes from a variable.
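
When the source encoding comes from a variable, one option is to skip detection and only ask whether the bytes are already valid UTF-8, falling back to a fixed single-byte encoding otherwise. A rough sketch under those assumptions: the function name `toUtf8WithFallback` and the ISO-8859-1 default are illustrative choices, not something this answer prescribes.

```php
<?php
/**
 * Hypothetical helper: leave valid UTF-8 untouched, otherwise convert
 * from an assumed single-byte fallback encoding.
 */
function toUtf8WithFallback(string $value, string $fallback = 'ISO-8859-1'): string
{
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value; // already valid UTF-8 (this includes plain ASCII)
    }

    $converted = iconv($fallback, 'UTF-8', $value);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert from $fallback to UTF-8");
    }

    return $converted;
}

// Usage: the first input is ISO-8859-1, the second is already UTF-8.
echo toUtf8WithFallback("caf\xE9"), "\n";
echo toUtf8WithFallback("caf\xC3\xA9"), "\n";
```

The trade-off is that the fallback is a guess: text in another single-byte code page will convert without error but to the wrong characters.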