Extended ASCII question

Question

I read Wikipedia, but I do not understand whether extended ASCII is still just ASCII, and whether it is available on any computer that would run my console application. Also, if I understand it correctly, I can write an ASCII char only by using its Unicode code in VB or C#. Thank you.

What kind of characters are you planning on using in your console application? Graphics characters? Non-English characters? — bzlm, Oct 17 '10 at 12:30
This was well covered in his previous question: http://stackoverflow.com/questions/3948089/console-write-display-extended-ascii-chars — Hans Passant, Oct 17 '10 at 12:33
Basically the smiles and arrows and also old box drawing ones that are in the extended ascii. — Loj, Oct 17 '10 at 12:34
@Mojmir Read the answers you've gotten. "Extended ASCII" doesn't exist, and ASCII doesn't have smiles and arrows. Actually, it seems like the answer Hans was kind enough to give you answers this question perfectly, doesn't it? http://stackoverflow.com/questions/3948089/console-write-display-extended-ascii-chars/3948982#3948982 — bzlm, Oct 17 '10 at 12:40
@bzlm: "smile" is in the standard ASCII, I think its start of text. — Loj, Oct 17 '10 at 12:41
@Mojmir Still not reading the answers I see. :) If you're trying to make a console application with a funky old-school GUI, there are other, better ways, like using something like the .NET wrapper for new Curses: http://maureenblack.net/?p=23 — bzlm, Oct 17 '10 at 12:43
@bzlm: I do. But it only says there is no extended ASCII, ok. But in the ASCII set, there are chars displayed as graphical symbols - what about u0002? In console app it prints the "smile" — Loj, Oct 17 '10 at 12:47
The console is not using ASCII. It's using the “OEM code page”, probably [code page 437](http://en.wikipedia.org/wiki/Code_page_437), for legacy DOS reasons. Almost no other tools use this code page; you can't use character 0x02 and expect to get a smiley face in a text editor. ASCII 0x02 is an invisible control code. Instead you would need Unicode U+263A White Smiling Face, `☺`. — bobince, Oct 17 '10 at 13:28
@bobince: Seems like Console.Write with u0002 or u263a produces the same glyph, differing only in color. The first one is black. Still do not get why I can use u263a and not u0002. EDIT: I can display both in Notepad though.. — Loj, Oct 17 '10 at 13:38
[Here is character U+0002](http://www.fileformat.info/info/unicode/char/0002/index.htm). As you can see, it is an invisible control character and not a smiley face. It doesn't display as a smiley in Notepad for me, or anywhere else except in the Windows console due to that tool's re-use of control codes for non-standard graphics characters. Use [U+263B](http://www.fileformat.info/info/unicode/char/263b/index.htm) for the other smiley. — bobince, Oct 17 '10 at 15:18
@bobince Gosh, I am more confused than ever :) If I look up ASCII, the smiley face is listed as 02 (Start of text). I am not arguing, I would like to understand it. If the first 127 chars of ASCII are the same in Unicode, I think u0002 is correct — Loj, Oct 17 '10 at 16:22
Where are you “looking up ASCII”? Because according to the actual ASCII standard, character 0x02 is the rarely-used control code for ‘start of text’, not a smiley or any other visible character. The smiley in that position is purely a DOS OEM code page thing and not part of ASCII. — bobince, Oct 17 '10 at 21:44

score 3 · Accepted Answer · edited Oct 08 '14 at 17:04

ASCII only covers the characters with value 0-127, and those are the same on all computers. (Well, almost, although this is mostly a matter of glyphs rather than semantics.)
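As a rough illustration of that range, here is a minimal sketch of how one might check whether a string stays within ASCII, and within the printable part of it. The class and method names (`AsciiCheck`, `IsAscii`, `IsPrintableAscii`) are made up for this example and are not part of .NET.

```csharp
using System;
using System.Linq;

static class AsciiCheck
{
    // True if every char is in the 7-bit ASCII range (0-127).
    public static bool IsAscii(string s) => s.All(c => c <= 127);

    // True if every char is printable ASCII (space through tilde, 32-126).
    public static bool IsPrintableAscii(string s) => s.All(c => c >= 32 && c <= 126);

    static void Main()
    {
        Console.WriteLine(IsPrintableAscii("Hello, world!")); // True
        Console.WriteLine(IsAscii("\u0002"));                 // True: a control code, but still ASCII
        Console.WriteLine(IsPrintableAscii("\u0002"));        // False: not printable
        Console.WriteLine(IsAscii("\u263A"));                 // False: U+263A is outside ASCII
    }
}
```

Only the printable subset (32-126) can reasonably be expected to look the same everywhere; control codes such as 0x02 have no defined glyph.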

Extended ASCII is a term for various single-byte code pages that assign various characters to the range 128-255. There is no single "extended ASCII" set of characters.

In C# and VB.NET, all strings are Unicode, so by default there's no need to worry about this. Whether or not a character can be displayed in a console app is a matter of the fonts being used, not a limitation of any specific single-byte code page.
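For the smileys, arrows and box-drawing characters discussed in the comments, a sketch along these lines may help. It writes the real Unicode characters instead of relying on control codes such as `\u0002`. The class and constant names are invented for illustration, and it assumes the console font has glyphs for these characters and that switching the output encoding to UTF-8 is acceptable for your application.

```csharp
using System;
using System.Text;

static class ConsoleSymbols
{
    // Unicode code points for glyphs that code page 437 drew in its control-code range.
    public const char WhiteSmiley   = '\u263A'; // ☺ (shown at position 0x01 in CP437)
    public const char BlackSmiley   = '\u263B'; // ☻ (shown at position 0x02 in CP437)
    public const char RightArrow    = '\u2192'; // →
    public const char BoxHorizontal = '\u2500'; // ─

    static void Main()
    {
        // Assumption: the console font can render these characters.
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine($"{WhiteSmiley} {BlackSmiley} {RightArrow} {BoxHorizontal}");
    }
}
```

If a character shows up as `?` or an empty box, the usual culprit is the console font rather than the string itself, so it is worth trying on the machines you care about.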

answered Oct 17 '10 at 12:27

Michael Madsen

You don't write software that runs on EBCDIC systems?! :P – Oct 17 '10 at 12:29
@Roger: No, no I don't. And I don't think the OP will do that either :) (Also, thanks.) – Michael Madsen Oct 17 '10 at 12:35
Thanks, also if I use only the first 127, I can be sure they will be displayed well, right? – Loj Oct 17 '10 at 12:37
@Mojmir: Assuming you don't use any non-printable characters, and we ignore the issue about the glyph used for a backslash on a Japanese or Korean system, then yes. – Michael Madsen Oct 17 '10 at 12:41
Come on everyone, let's get him down below 10k again so he doesn't get too cocky. :) – bzlm Oct 17 '10 at 12:42
@bzlm: He'd need posts worth downvoting for that. When I saw he was just shy of 10k I looked through his answers, and didn't see any meriting that (but I was looking for ones worth upvoting instead ;). – Oct 17 '10 at 12:45
@Michael Madsen, and when using non-printable chars? I just do not know what is bad about it, if I make a simple app and want to display some of the standard ASCII non-printable chars. – Loj Oct 17 '10 at 12:53
@Mojmir It's hard to understand what your question is. Are you asking whether what you see in your console output will work for any user anywhere regardless of which ASCII character you use? – bzlm Oct 17 '10 at 12:59
Well, they're *non-printable*, therefore, you can't really count on anything sensible happening if you try to print them. If you're thinking of the glyphs you could usually show for those in the old DOS days, there are equivalent Unicode characters for those, but depending on the console font, you may not be able to display all of them - you'll have to try for yourself. See [Wikipedia's page on code page 850](http://en.wikipedia.org/wiki/Code_page_850) for an example of this mapping. – Michael Madsen Oct 17 '10 at 13:03
@bzlm, yes, basically. If I use only the first 127 chars - uc0002 etc. – Loj Oct 17 '10 at 13:05
@ Michael Madsen Thanks. So, if I am able to display say this uc0002 (smile) in the console app, then (as C# .NET uses Unicode) everyone will. I was only confused whether this char is ascii or not – Loj Oct 17 '10 at 13:31
@Mojmir: No, Unicode character 0002 is not a smile. Look at the hexadecimal number below it; *that's* the Unicode value of that character. – Michael Madsen Oct 17 '10 at 14:19
@Michael Madsen Well, so what is the reason \u0002 prints that smile? I do not get it :( – Loj Oct 17 '10 at 16:23
@Mojmir: Because that's how those really old code pages were defined, so the glyphs are in that particular font for legacy reasons - it's not something you should depend on, because it's officially a [control character with no formally defined glyph](http://www.fileformat.info/info/unicode/char/0002/index.htm). Use the proper smile glyph from Unicode if you really need it. – Michael Madsen Oct 17 '10 at 16:32

score 3 · Answer 2 · edited Oct 17 '10 at 12:29

As others have said, true ASCII is always the lower 7 bits of each byte. Before the advent (and ubiquity) of Unicode standards, various extensions to the ASCII character set that utilized the eighth bit were released. The most common in the Windows world is Windows code page 1252.

If you're looking to use this encoding in .NET, you can get it like this:

    // Requires: using System.Text;
    Encoding windows1252 = Encoding.GetEncoding("windows-1252");
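To see why the choice of code page matters, the following sketch decodes the same byte under two different code pages. The `CodePageDemo` class and its `Decode` helper are hypothetical names for this example. It assumes .NET Core or .NET 5+, where legacy code pages have to be registered through `CodePagesEncodingProvider`; on .NET Framework they are built in and the registration line can be dropped.

```csharp
using System;
using System.Text;

static class CodePageDemo
{
    // Decodes a single byte using the given code page number.
    public static string Decode(byte value, int codePage) =>
        Encoding.GetEncoding(codePage).GetString(new[] { value });

    static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // are only available after registering this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(Decode(0xE9, 1252)); // é (U+00E9) in Windows-1252
        Console.WriteLine(Decode(0xE9, 437));  // Θ (U+0398) in code page 437
        Console.WriteLine(Decode(0x41, 1252) == Decode(0x41, 437)); // True: the ASCII range agrees
    }
}
```

Bytes below 128 decode identically in both, which is the sense in which these code pages are "based on ASCII"; everything from 128 upward depends on which code page you pick.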

score 1 · Answer 3 · 2010-10-17T12:28:08.173

As Wikipedia says, ASCII is only 0-127. "Extended ASCII" is a misnomer that should be avoided; it has been used loosely to mean "some other character set based on ASCII which only uses single bytes" (meaning not multibyte like UTF-8). Sometimes the term means the 128-255 code points of that specific character set, but again, it's vague and you shouldn't count on it meaning anything specific.

The use of the term is sometimes criticized, because it can be mistakenly interpreted that the ASCII standard has been updated to include more than 128 characters or that the term unambiguously identifies a single encoding, both of which are untrue.

Source: http://en.wikipedia.org/wiki/Extended_ASCII
