php curl response encoding

Question

I am trying to process data that I fetched using curl, but I have an encoding problem and cannot find the right way to handle it.

This is the text I got (in hex): '6B 64 6F 20 6D C3 A1'. It should evaluate to the string 'kdo má', but instead it shows up as 'kdo m??' (the last two characters aren't actually question marks but http://www.fileformat.info/info/unicode/char/c3/index.htm and http://www.fileformat.info/info/unicode/char/a1/index.htm).

I don't understand why some characters are 8 bits and characters with diacritics are 16 bits, or how PHP should know which is which. In any case, how should I decode it?

_“I don't understand why some chars are 8bit and diacritic chars are 16 bit”_ – because that’s how a [variable-width encoding](http://en.wikipedia.org/wiki/Variable-width_encoding) works … — CBroe, Sep 16 '13 at 21:53
You're probably getting UTF-8 text, which uses bytes with the high bit set for its multi-byte sequences (the 7-bit range of UTF-8 corresponds 1:1 with US-ASCII). But you're probably dumping that UTF-8 text into an environment that uses a different charset, e.g. ISO-8859-1, where those high-bit bytes don't have that meaning. — Marc B, Sep 16 '13 at 21:54
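
To make the two comments above concrete, here is a minimal sketch that rebuilds the seven bytes from the question and shows what a Latin-1 (ISO-8859-1) environment would make of them. It assumes the mbstring extension is loaded; the variable names are only illustrative.

```php
<?php
// The seven bytes from the question: 6B 64 6F 20 6D C3 A1
$bytes = hex2bin('6b646f206dc3a1');

// The sequence is valid UTF-8: C3 A1 together encode U+00E1 ("á").
var_dump(mb_check_encoding($bytes, 'UTF-8'));

// Simulate a Latin-1 reader: every byte is treated as its own character,
// so C3 becomes "Ã" (U+00C3) and A1 becomes "¡" (U+00A1).
// Re-encoding that misreading as UTF-8 makes the damage visible in hex.
$misread = mb_convert_encoding($bytes, 'UTF-8', 'ISO-8859-1');
echo bin2hex($misread), "\n"; // 6b646f206dc383c2a1
```

If the bytes are already correct UTF-8, as they appear to be here, the usual fix is on the output side rather than in the data: when serving a page over HTTP, for example, declare the charset before any output with `header('Content-Type: text/html; charset=utf-8')`.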

score 0 · Answer 1 · answered Sep 16 '13 at 21:53

_“don't understand why some chars are 8bit and diacritic chars are 16 bit”_

Most likely because it's UTF-8: the bytes C3 A1 are the two-byte UTF-8 encoding of 'á', while each ASCII letter takes a single byte. By default, PHP's string functions assume one character == one byte.

_“and how should PHP know which one is which, but anyway, how should I decode it?”_

It can't know. You have to tell it. Check mbstring: http://php.net/manual/de/book.mbstring.php or recode: http://php.net/manual/en/book.recode.php
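
As a rough illustration of "telling" PHP the encoding, the sketch below passes 'UTF-8' explicitly to a few mbstring functions and compares the result with the byte-oriented `strlen`. It assumes mbstring is available and that the curl response really is UTF-8, as the hex dump in the question suggests; `$text` stands in for the response body.

```php
<?php
// Stand-in for the curl response body: "kdo má" as UTF-8 bytes.
$text = hex2bin('6b646f206dc3a1');

// strlen() counts bytes; mb_strlen() counts characters once told the encoding.
echo strlen($text), "\n";             // 7
echo mb_strlen($text, 'UTF-8'), "\n"; // 6

// Character-aware slicing: take "má" (2 characters, 3 bytes).
$last = mb_substr($text, 4, 2, 'UTF-8');
echo bin2hex($last), "\n";            // 6dc3a1

// Character-aware uppercasing: "á" (C3 A1) becomes "Á" (C3 81).
$upper = mb_strtoupper($text, 'UTF-8');
echo bin2hex($upper), "\n";           // 4b444f204dc381

// Only if the consumer really needs Latin-1: convert explicitly.
// "á" then becomes the single byte E1.
$latin1 = mb_convert_encoding($text, 'ISO-8859-1', 'UTF-8');
echo bin2hex($latin1), "\n";          // 6b646f206de1
```

Converting is only needed when whatever consumes the string expects a different charset; if everything downstream is UTF-8, the bytes can be passed through unchanged.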

Marcin Orlowski


I tried everything I found. utf8_encode/decode, mbstring functions, iconv ... but nothing helped. – user10099 Sep 16 '13 at 22:30
