A programmer should probably read up on the fundamentals of encoding, but my specific question is:

When a Visual C program on Windows writes to the Windows console, the unsigned char value 140 prints as Unicode character 0x0152 (decimal 338).

What encoding (or "code page"?) is Visual C using (e.g. UTF-7, ASCII)? How exactly does the unsigned char value 140 map to Unicode character 338?

I'm sure anyone can paste a link to one of the many Wikipedia pages on encoding or to a table on the web, but a more specific answer to the question would be nice.

T. Webster

1 Answer


That would be the Windows-1252 encoding, in which byte 0x8C (decimal 140) maps to U+0152 (Œ). It's not that Visual C is "using" it; rather, it's what the Windows console is interpreting the output bytes as.
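To make the mapping concrete, here is a minimal C sketch of what a Windows-1252 decoder does with a single byte. The names `CP1252_HIGH` and `cp1252_to_unicode` are hypothetical and exist only for illustration; this is not how the console is actually implemented, and returning U+FFFD for the five undefined bytes is an assumption of this sketch.

```c
/* Windows-1252 differs from ISO-8859-1 only in the byte range 0x80..0x9F.
   A 0 entry marks one of the five bytes that Windows-1252 leaves undefined. */
static const unsigned short CP1252_HIGH[32] = {
    0x20AC, 0,      0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0,      0x017D, 0,
    0,      0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0,      0x017E, 0x0178
};

/* Hypothetical helper: map one Windows-1252 byte to a Unicode code point.
   Undefined bytes yield U+FFFD (an assumption made for this sketch). */
unsigned cp1252_to_unicode(unsigned char byte)
{
    if (byte < 0x80 || byte > 0x9F)
        return byte;                      /* identical to ASCII / ISO-8859-1 */

    unsigned cp = CP1252_HIGH[byte - 0x80];
    return cp ? cp : 0xFFFDu;             /* replacement character if undefined */
}
```

With this table, `cp1252_to_unicode(140)` yields `0x0152` (decimal 338), the code point of Œ, which matches what the question observed.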

This other answer might be what you seek: What encoding/code page is cmd.exe using?

"Code page" is Microsoft's term for "Coded Character Set" which is more or less equivalent to what you think of as an "encoding", but see the Unicode Glossary for precise definitions.

And to quote the Wikipedia page:

Historically, the phrase "ANSI code page" (ACP) is used in Windows to refer to various code pages considered as native. The intention was that most of these would be ANSI standards such as ISO-8859-1. Even though Windows-1252 was the first and by far most popular code page named so in Microsoft Windows parlance, the code page has never been an ANSI standard. Microsoft-affiliated bloggers now state that "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."

Christoffer Hammarström