This has absolutely nothing to do with UTF-8 encoding. Forget about that part entirely. utf8_decode doesn't do anything in your code. iconv is entirely unrelated.
It has to do with PHP string literal interpretation. The \... in "\151\163\142\156" is a special PHP string literal escape sequence:
\[0-7]{1,3}
the sequence of characters matching the regular expression is a character in octal notation, which silently overflows to fit in a byte (e.g. "\400" === "\000")
http://php.net/manual/en/language.types.string.php#language.types.string.syntax.double
This explains why it works when written in a PHP string literal and doesn't work when the text is read from an outside source: text read through file_get_contents is data, not PHP code, so its escape sequences are never interpreted. Simply do echo "\151\163\142\156"; and you'll see "isbn" without any other conversions necessary.
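As a minimal sketch of that difference (the variable names $parsed and $raw are only illustrative), a single-quoted literal can stand in for text read from an outside source, since PHP leaves its backslashes and digits untouched:

```php
<?php
// Double quotes: PHP's parser turns each octal escape into one byte.
$parsed = "\151\163\142\156";

// Single quotes: backslashes and digits are kept verbatim, which is what
// text read from a file or any other external source looks like.
$raw = '\151\163\142\156';

var_dump($parsed === 'isbn'); // bool(true)
var_dump(strlen($parsed));    // int(4)
var_dump(strlen($raw));       // int(16)
```

The 16-byte string is what your code is actually working with when the data comes from outside.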
To manually convert the individual escape sequences in the string \151\163\142\156
to their character equivalents (really: their byte equivalents):
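$string = '\151\163\142\156'; // note: single quotes cause no interpretation
// match a literal backslash followed by one to three octal digits
echo preg_replace_callback('/\\\\([0-7]{1,3})/', function ($m) {
    // convert the octal digits to a number, then to the byte with that value
    return chr(octdec($m[1]));
}, $string);
// isbn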
stripcslashes happens to include this functionality, but it also does a whole lot of other things which may be undesired.
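A short sketch of both sides of that trade-off; the sample string in $mixed is made up for illustration:

```php
<?php
$raw = '\151\163\142\156';
var_dump(stripcslashes($raw) === 'isbn'); // bool(true)

// Side effect: other C-style escapes are converted too, here a
// literal backslash-n and a hex escape.
$mixed = 'a\nb \x41';
var_dump(stripcslashes($mixed) === "a\nb A"); // bool(true)
```

If the input may legitimately contain sequences such as \n or \x41 that should stay as they are, the preg_replace_callback approach is the safer choice.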
The other way around:
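$string = 'isbn';
// the s modifier lets . match newlines too, so every byte is converted
echo preg_replace_callback('/./s', function ($m) {
    // byte value as octal digits, prefixed with a backslash
    return '\\' . decoct(ord($m[0]));
}, $string);
// \151\163\142\156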
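To check that the two directions agree, both conversions can be wrapped in functions and round-tripped. This is only a sketch: the names octalEscape and octalUnescape are hypothetical, and the encoder pads every byte to three digits with sprintf, which is a design choice of this example and not part of the snippets above.

```php
<?php
// Hypothetical helper: decode \NNN octal escapes into bytes.
function octalUnescape(string $s): string
{
    return preg_replace_callback('/\\\\([0-7]{1,3})/', function ($m) {
        return chr(octdec($m[1]));
    }, $s);
}

// Hypothetical helper: encode every byte as a three-digit octal escape.
function octalEscape(string $s): string
{
    $out = '';
    for ($i = 0, $n = strlen($s); $i < $n; $i++) {
        $out .= sprintf('\\%03o', ord($s[$i]));
    }
    return $out;
}

$escaped = octalEscape('isbn');
echo $escaped, "\n";                             // \151\163\142\156
var_dump(octalUnescape($escaped) === 'isbn');    // bool(true)
```

Padding to three digits keeps every escape the same width, so a digit that follows an escaped byte in mixed text can't be absorbed into it.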