Odd C++ Character Array Reference

Question

I have inherited the following code and would like to know more about the indexing used.

...
char cpChMap[256];
memset(cpChMap, 0xff, 256);
for (i = 0; i < 10; i++)
    cpChMap['0' + i] = 0;
...

I have never seen a char used to index an array before (`cpChMap[<character expression>]`). Can someone explain how this works or point me to a definitive reference? I have been searching for a while and can't find a decent one.

Thanks for your time.

['0'+n](http://stackoverflow.com/questions/1114741/how-to-convert-int-to-char-c) and [n-'0'](http://stackoverflow.com/questions/5029840/convert-char-to-int-in-c-and-c) are often used to convert `int` to `char` and `char` to `int`. — Joe, Jul 10 '14 at 13:43
possible duplicate of [Why are C character literals ints instead of chars?](http://stackoverflow.com/questions/433895/why-are-c-character-literals-ints-instead-of-chars) — Deduplicator, Jul 10 '14 at 13:43
So the compiler performs an implicit conversion. But how does it know to use the ASCII character set to do so? I am having a binary formatting issue and I think this could be the root cause of it... — MoonKnight, Jul 10 '14 at 13:45
P.S. I don't think this is a duplicate. I would like to know how this indexing works at the implementation level... — MoonKnight, Jul 10 '14 at 13:46
The compiler is configured for a specific system. Each character literal corresponds to a specific integer number *on that system*. May well be EBCDIC. I.e. the same source code will translate -- not surprisingly -- to different binaries on different systems. — Peter - Reinstate Monica, Jul 10 '14 at 14:01
The code that this is extracted from is doing bitwise mapping. I am translating the old C code to C++11. I am getting a mismatch in some of the output and I believe it is down to some encoding issue that I don't fully understand yet. Thanks. — MoonKnight, Jul 10 '14 at 14:04
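
The two conversions Joe mentions can be sketched roughly as below. The helper names `digit_to_char` and `char_to_digit` are made up for illustration and do not appear in the inherited code. The numeric value of `'0'` depends on the execution character set (48 in ASCII, 240 in EBCDIC), but the arithmetic itself works on either, because the digits are consecutive.

```cpp
#include <cassert>

// Hypothetical helper names, not part of the original code.

// Maps 0..9 to '0'..'9'. The sum '0' + n is computed as an int
// and then narrowed back to char.
char digit_to_char(int n) {
    assert(n >= 0 && n <= 9);
    return static_cast<char>('0' + n);
}

// Maps '0'..'9' back to 0..9.
int char_to_digit(char c) {
    assert(c >= '0' && c <= '9');
    return c - '0';
}
```

Neither helper hard-codes 48, so the same source should behave the same way on a non-ASCII compiler.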

Yu Hao · Accepted Answer · 2014-07-10T13:56:04.870

4

`char` is an integral type, so it can be used in arithmetic expressions:

std::cout << '0' + 7 << std::endl;

The expression '0' + 7 has type int, so on an ASCII system this prints 55 rather than the character 7. Because '0' through '9' are contiguous, '0' + 7 has the same value as '7'. That is why, in the loop:

for (i = 0; i < 10; i++)
    cpChMap['0' + i] = 0;

cpChMap uses indices '0' (the same as 48, assuming ASCII) through '9'.
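
To make the indexing concrete, here is a minimal sketch that wraps the table from the question in a small struct. `DigitMap` and `is_digit` are hypothetical names, and the sketch assumes 8-bit bytes so that 256 entries cover every `char` value. One thing worth watching when indexing with a `char` variable (as opposed to a literal such as '0'): plain `char` may be signed, so the lookup goes through `unsigned char` first.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical wrapper around the table from the question.
struct DigitMap {
    char table[256];

    DigitMap() {
        std::memset(table, 0xff, sizeof table);
        for (int i = 0; i < 10; i++)
            table['0' + i] = 0;   // the index is an int, e.g. 48..57 in ASCII
    }

    // True if c is one of '0'..'9'.
    bool is_digit(char c) const {
        // Plain char may be signed; converting to unsigned char first
        // keeps characters above 127 from producing a negative index.
        return table[static_cast<unsigned char>(c)] == 0;
    }
};
```

With this in place, `DigitMap().is_digit('5')` is true and `DigitMap().is_digit('a')` is false.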

edited Jul 10 '14 at 13:56

answered Jul 10 '14 at 13:49

Yu Hao


1

*are continuous implies* - I believe you were looking for the word contiguous. – Joe Jul 10 '14 at 13:52
1

@Joe Yeah, English is not my native language, sometimes the difference of those words confuses me. – Yu Hao Jul 10 '14 at 13:57

score 1 · Answer 2 · answered Jul 10 '14 at 13:56

1

A char is also an int8 type (an integer with a size of 8 bits = 1 byte).

Each char is equivalent to its ASCII value.

This doesn't answer the question, but I don't understand why they didn't do this:

char cpChMap[256];
memset(cpChMap, 0xff, 256);
memset(&cpChMap['0'], 0, 10);
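
As a quick check that the `memset` version produces the same table as the original loop, both can be written as functions and compared. The function names here are invented for the comparison; the bodies are the two snippets from this thread. The single `memset` over ten bytes relies on '0' through '9' occupying consecutive values.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical function names; both fill a caller-supplied 256-entry table.
void fill_with_loop(char (&map)[256]) {
    std::memset(map, 0xff, sizeof map);
    for (int i = 0; i < 10; i++)
        map['0' + i] = 0;
}

void fill_with_memset(char (&map)[256]) {
    std::memset(map, 0xff, sizeof map);
    std::memset(&map['0'], 0, 10);   // clears the entries for '0'..'9' in one call
}

// Returns true when both approaches produce byte-identical tables.
bool fills_agree() {
    char a[256], b[256];
    fill_with_loop(a);
    fill_with_memset(b);
    return std::memcmp(a, b, sizeof a) == 0;
}
```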

answered Jul 10 '14 at 13:56

SHR


Fair point. I don't know either. This code is full of bad code in my opinion, a full rewrite would be nice but would be way too expensive! Thanks for the answer... – MoonKnight Jul 10 '14 at 14:02
1

A char is equivalent to its ASCII value only on an ASCII system. Certain assumptions may not hold on others: '0' == 48 or 'S'-'R' == 1 and such. – Peter - Reinstate Monica Jul 10 '14 at 14:03
1

Typically so, but there are platforms where a byte (i.e., the minimal *independently-addressable* unit of memory) is 32 bits. `char` always has the size of one byte; therefore on such platforms `char` is 32-bit. – ach Jul 10 '14 at 14:07
@AndreyChernyakhovskiy The code won't work if the system is not ASCII. For 32 bits you'll need more than 256 bytes... As far as I know, a byte is always 8 bits. – SHR Jul 10 '14 at 14:12
@SHR, the code *will* work if the system is not ASCII. The code won't work if the system has non-octet bytes. For most of our work, it is quite safe to assume that a byte has 8 bits. That is true for all general-purpose platforms and it seems impossible that things might change in the future. Still, there *are* some exotic platforms (some embedded micro-controller chips) where bytes are non-octet. Anyway, it is not nice practice to use magic values such as 256 and 0xff in your code; better use constants. – ach Jul 10 '14 at 18:24
@AndreyChernyakhovskiy Is it OK to assume the digits are ordered from '0' to '9' and don't start, for example, from '1' (look at your keyboard...)? Is it OK to assume that all the digits are in sequential order? Maybe there is some exotic platform on which the digits are ordered differently? Anyway, I didn't think the question was about some weird OS. – SHR Jul 10 '14 at 20:51
@SHR, yes, both the C and C++ standards have contained the following clause in all versions: "In both the source and execution basic character sets, the value of each character after 0 in the above list of decimal digits shall be one greater than the value of the previous." From a practical point of view, the only non-ASCII-compatible character set that used to be fairly popular in the early days of C is EBCDIC, and it is compliant with this requirement. – ach Jul 11 '14 at 08:42
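
The assumptions discussed in these comments can be written down in the code itself. The following is only a sketch: `kMapSize`, `kUnmapped` and `init_digit_map` are hypothetical names standing in for the magic values 256 and 0xff, and the table is declared as `unsigned char` here (unlike the original `char`) so that the 0xff marker compares equal to the constant.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

// Hypothetical names; the original code uses the literals 256 and 0xff.
const int kMapSize = UCHAR_MAX + 1;   // one entry per unsigned char value
const int kUnmapped = 0xff;           // marker for "not a digit"

// Fails to compile on a platform with non-octet bytes.
static_assert(CHAR_BIT == 8, "this table layout assumes 8-bit bytes");

// Already guaranteed by the standard; stated here as documentation.
static_assert('9' - '0' == 9, "decimal digits must be consecutive");

void init_digit_map(unsigned char (&map)[kMapSize]) {
    std::memset(map, kUnmapped, sizeof map);
    std::memset(&map['0'], 0, 10);    // valid because the digits are consecutive
}
```

Neither assertion depends on ASCII, so the same checks would also pass on an EBCDIC system with 8-bit bytes.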
