If we have US7ASCII characterset why does it let us store non-ascii characters?

Question

We have an Oracle database that has the NLS_CHARACTERSET = US7ASCII.

As a test, we ran an insert into a table that contains a VARCHAR(4000) field as were able to put a CHR(176) value in that column (degrees symbol).

That character doesn't appear to be supported by US7ASCII.

Why would the database allow that value to be stored in that column?

There must be something else going on because `US7ASCII` is a 7-bit encoding that doesn't support ['DEGREE SIGN' (U+00B0)](http://www.fileformat.info/info/unicode/char/00b0/index.htm) so you should be getting a literal `?` question mark on return (or maybe another replacement character). You can google for Globalization Support Guide, which explains this pretty well. Strange. — Álvaro González, Oct 31 '17 at 16:21
[Character Set Conversion in a Monolingual Scenario](https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch2charset.htm#sthref165): "When a target character set does not contain all of the characters in the source data, replacement characters are used. If, for example, a server uses US7ASCII and a German client uses WE8ISO8859P1, then the German character `ß` is replaced with `?` and `ä` is replaced with `a`." — Álvaro González, Oct 31 '17 at 16:25
See this answer: https://stackoverflow.com/questions/36710360/difference-between-nls-nchar-characterset-abd-nls-characterset-for-oracle/36712457#36712457 — Wernfried Domscheit, Oct 31 '17 at 16:55
When viewed through Oracle SQL developer, it appears as an upright rectangle character. When viewed through Aqua Data studio, it appears as a degrees symbol. — Matthew Walk, Oct 31 '17 at 16:56

score 2 · Answer 1 · answered Nov 01 '17 at 08:47

It works because following conditions are both true:

Client character set is equal to your database character set.
The character set permits any byte values

Your database character set and your client character set are set to US7ASCII. In such case each data is written/read one by one without any conversion, i.e. the bytes you send are exactly written to database. Probably you did not set NLS_LANG at all on your client side but Oracle defaults it to AMERICAN_AMERICA.US7ASCII.

US7ASCII is a 7-bit encoding. I assume a pure ASCII application (which could be fairly difficult to find) would just ignore the 8th bit which is stored in an 8-Bit architecture. Other character sets, e.g. AL32UTF8 do not allow each byte value. In this case such characters will be replaced by a placeholder, e.g. ¿ or ?.

Note, you set your client character set to US7ASCII which is most likely not correct. Set it properly to the character set which is used by your application, then ° will get replaced.

In case you use SQL*Plus check console codepage with command chcp, resp. locale charmap. Set your NLS_LANG environment variable accordingly before you start sqlplus.

score 0 · Answer 2 · edited Oct 31 '17 at 23:02

0

The database allows that value to be stored because Oracle will handle the character set conversion for you.

More info can be found here: Special Characters in Oracle

edited Oct 31 '17 at 23:02

Remy Lebeau

555,201
31
458
770

answered Oct 31 '17 at 16:08

1991DBA

805
1
9
18

That's right but, shouldn't he be getting `?` rather than `°`? – Álvaro González Oct 31 '17 at 16:18
Doesn't that change depending on the character set of the operating system? I was under the belief that what gets displayed depends on the language/character set of the OS. – 1991DBA Oct 31 '17 at 16:37
1

Perhaps that's the last fallback layer if you don't set the connection encoding anywhere (I don't know). But if the character cannot be physically stored on the server, there's no way for Oracle to figure out what character it was meant to be in the first place. – Álvaro González Oct 31 '17 at 16:45

If we have US7ASCII characterset why does it let us store non-ascii characters?

2 Answers2

Linked