
I'm having trouble understanding how multibyte characters are represented in the ASCII table, in decimal and then in hexadecimal.

For instance:

char *c = "é";
printf("%d\n%d", c[0], c[1]);

It will display:

-61
-87

In the ASCII table, "é" is 130 in decimal and 82 in hexadecimal. I understand that 82 is the hexadecimal value of 130, but how can we obtain 130 from -61 and -87?

Thanks in advance and sorry for my spelling

  • what happens when you cast your chars as `unsigned int`s and use `%u` as the `printf` conversion? Also note that `c[1]` is obviously going to show your `'\0'` – im so confused Oct 18 '12 at 15:33
  • Error: cast from pointer to integer. "é" must be in a `char *`; it can't be contained in a `char`, and therefore not in an `int` either, I suppose. – inScienta Oct 18 '12 at 19:27

1 Answer


According to the UTF-8 charset (used, among others, by many GNU/Linux distributions), the character 'é' is encoded as the two-byte sequence 0xC3 0xA9, which is 11000011 10101001 in binary. From this we can understand the results, assuming a two's complement representation:

  • The byte 11000011 (0xC3) is -61 as a signed 8-bit value: 195 - 256 = -61.
  • The byte 10101001 (0xA9) is -87 as a signed 8-bit value: 169 - 256 = -87.

As for obtaining 130: the 130 (0x82) you found in an "ASCII table" comes from a legacy 8-bit code page (such as code page 437), which is a different encoding entirely; it cannot be derived from the two UTF-8 bytes.
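As a minimal sketch of the arithmetic (a standalone program; the casts and format specifiers shown are just one way to inspect the bytes, and it assumes the source file is saved as UTF-8):

#include <stdio.h>

int main(void)
{
    /* With a UTF-8 source file, "é" is the two-byte sequence
       0xC3 0xA9 followed by '\0'. */
    const char *c = "é";

    /* Plain char is signed on many platforms, so bytes >= 0x80
       print as negative numbers under two's complement. */
    printf("signed:   %d %d\n", c[0], c[1]);                   /* -61 -87 */

    /* Casting to unsigned char recovers the raw byte values:
       -61 + 256 = 195 (0xC3), -87 + 256 = 169 (0xA9). */
    printf("unsigned: %d %d\n",
           (unsigned char)c[0], (unsigned char)c[1]);          /* 195 169 */
    printf("hex:      %x %x\n",
           (unsigned)(unsigned char)c[0],
           (unsigned)(unsigned char)c[1]);                     /* c3 a9 */

    return 0;
}

Compiled with any C compiler on a platform where char is signed, this should print the values shown in the comments.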