I have a question: I have an array of unsigned chars that I want to print, but when I use cout, different characters are printed.

The array contains binary data, stored not as '0' and '1' digits but as raw byte values in chars.

Array to print:

[0] 147 unsigned char
[1] 252 'ü' unsigned char
[2] 170 'ª' unsigned char
[3] 194 'Â' unsigned char
[4] 108 'l' unsigned char

What I got:

ô
ⁿ
¬
┬
l

When I write it to a file, the representation is correct because I open the ofstream in binary mode, but I cannot find how to do the same with cout.

Best Regards

user3799835
  • Let me guess. You're on Windows. The console is using [CP-437](http://en.wikipedia.org/wiki/Code_page_437) and the program you use to read the file is using [Windows-1252](http://en.wikipedia.org/wiki/Windows-1252). Both of those characters have the same numerical representation, but are being interpreted differently. – chris Aug 05 '14 at 21:19
  • @chris There's even an [SO answer](http://stackoverflow.com/a/4882624/1413395) for this. – πάντα ῥεῖ Aug 05 '14 at 21:36
  • I very much disagree that [the linked duplicate question](http://stackoverflow.com/questions/19562103/uint8-t-cant-be-printed-with-cout) also answers this question. The linked question is about unprintable ASCII characters, while this one is about the _extended_ ASCII character set. The linked question has nothing to do with code pages, while that is exactly the problem here. Unfortunately, I don't have enough rep to vote to reopen for this reason. – Bart van Nierop Aug 06 '14 at 09:04

1 Answer

You are the victim of code pages.

Back in the olden days all we had to care about were plain English characters. For these characters we defined the ASCII character set. It defines 128 characters (0 to 127 decimal) with printable characters starting at 32 (space). Conveniently, the ASCII character set fits in 7 bits.

By using the unused 8th bit, it was possible to do something with values 128 to 255. Problem is, lots of people thought about that at the same time, so there were many different ways in which systems would display these top 128 characters, for many different languages.

Eventually all of this got standardized. The first 128 characters are the same everywhere, but the characters from 128 and up are defined in code pages.
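To make the idea of a code page concrete, here is a minimal sketch that maps the five bytes from the question to Unicode code points under two code pages. The names `CodePage`, `Entry`, `kTable`, and `decode` are hypothetical, and the table is deliberately partial: it covers only those five bytes, not the full 256-entry tables.

```cpp
// Hypothetical names for illustration; not part of any library.
enum class CodePage { Cp437, Windows1252 };

struct Entry {
    unsigned char byte;
    char32_t cp437;        // code point a CP-437 console displays
    char32_t windows1252;  // code point a Windows-1252 program displays
};

// Partial table: only the five bytes from the question.
constexpr Entry kTable[] = {
    {147, 0x00F4, 0x201C},  // o with circumflex vs. left double quotation mark
    {252, 0x207F, 0x00FC},  // superscript n     vs. u with diaeresis
    {170, 0x00AC, 0x00AA},  // not sign          vs. feminine ordinal indicator
    {194, 0x252C, 0x00C2},  // box-drawing tee   vs. A with circumflex
    {108, 0x006C, 0x006C},  // 'l' in both: below 128, so plain ASCII
};

// Returns the code point for `byte` under `page`,
// or 0 if the byte is not in the partial table above.
char32_t decode(unsigned char byte, CodePage page) {
    for (const Entry& e : kTable) {
        if (e.byte == byte) {
            return page == CodePage::Cp437 ? e.cp437 : e.windows1252;
        }
    }
    return 0;
}
```

Only the last entry agrees across the two columns, which matches the output in the question: the `l` survives, while the four bytes above 127 change appearance.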

As chris and πάντα ῥεῖ commented, the Windows console uses a different code page (code page 437) than most of the rest of Windows, which on typical US and Western European systems uses Windows-1252.
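If the goal is for the console to show the same characters as the Windows-1252 view, one option on Windows is to switch the console's output code page before printing. The sketch below assumes the Win32 function `SetConsoleOutputCP`; `use_windows_1252_console` is a hypothetical helper name, and whether the expected glyphs actually appear also depends on the console font.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: asks the Windows console to interpret output bytes as
// Windows-1252. Returns false if the call fails, and also on non-Windows
// platforms, where there is no console code page to switch.
bool use_windows_1252_console() {
#ifdef _WIN32
    return SetConsoleOutputCP(1252) != 0;
#else
    return false;
#endif
}
```

After a successful call, writing the array with `std::cout.write(reinterpret_cast<const char*>(data), size)` sends the same bytes as before; only the console's interpretation of them changes.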

In conclusion, your output is correct in the sense that the decimal values of the input and output are the same, but it looks wrong because you're viewing them in different code pages.
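One way to confirm that the bytes themselves are unchanged is to print them as numbers rather than as characters, since `operator<<` treats an `unsigned char` as a character. A minimal sketch, where `as_decimal` is a hypothetical helper name:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: renders each byte as its decimal value, separated by
// spaces, so the result does not depend on any code page.
std::string as_decimal(const unsigned char* data, std::size_t size) {
    std::string out;
    for (std::size_t i = 0; i < size; ++i) {
        if (i != 0) out += ' ';
        // Widen to unsigned int first; std::to_string has no char overload.
        out += std::to_string(static_cast<unsigned int>(data[i]));
    }
    return out;
}
```

For the array in the question, `as_decimal(data, 5)` yields `147 252 170 194 108` regardless of which code page is used to view it. The same effect is available inline with `std::cout << static_cast<int>(data[i])`.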

Bart van Nierop