Strange behaviour when casting array char to array short

Question

I have 2 array: the first one is 8 unsigned char, and the second one is 4 unsigned short, for some algorithm compatibility issue i need to use the short array with the values of the char array, to do so i'm doing a loop

j = 0;
for(i=0; i<8; i+=2)
{
    short_array[j] = *(unsigned short*) (char_array + i);
    j++;
}

Everything work fine here, but in some previous attempt to build this up, i've tried the following (this is obviously not the correct answer)

j = 0;
for(i=0; i<8; i+=2)
{
    short_array[j] = (unsigned short*) *(&(char_array + i));
    j++;
}

QUESTION:

Assuming the following char_array = {0x11,0x22,0x33,0x44,0x55,0x66,0x77,0x88} When I do the first one short_array = {0x1122, 0x3344, 0x5566, 0x7788} But when I do the second one, short_array = {0x3344, 0x5566, 0x7788, ???} (where ??? is undefined since it is a value in the memory and may change).

Can you explain why this i happening?

PS: My compiler suite is C251 from Keil

@H2CO3 I'm very agreed with that! this is why i'm again really surprised, does this compile on gcc (or any C99)? — Jaay, Jun 12 '13 at 08:27
No, it doesn't. `clang` on OS X tells me `quirk.c:14:44: error: address expression must be an lvalue or a function designator`. — , Jun 12 '13 at 08:29
So I think this is a compiler issue (thank you for testing it), I will take a look on the assembly code to understand that. If a Keil guy read this, please consider making a efficient compiler... — Jaay, Jun 12 '13 at 08:42
`memcpy(short_array, char_array, sizeof short_array);` might work, depending on the endianess. — wildplasser, Jun 12 '13 at 10:13

score 0 · Accepted Answer · answered Jun 12 '13 at 10:48

0

I've compared the 2 solutions (even if the second is not a working one), and this is definitly a compiler issue, first of all the compiler shouldn't let me compile &(char_array + i) , then I've find out (when inspecting the assembly code) that the char_array is incremented before accessing it in the loop for the second case; and the increment is made after reading the variable in the first case (the correct one). There seem to be no more explanation than a compiler implementation problem...

answered Jun 12 '13 at 10:48

Jaay

2,103
1
16
34

You are assuming that your first code block is correct. It isn't, since is depends on the alignment of the char array and the endianness of the target platform. Blaming the compiler for accepting incorrect code will not help. You should start by writing correct code. – wildplasser Jun 12 '13 at 11:31
Like I said to It'sPete, we are perfectly sure of the alignment and of the endianess of our target; I checked with my local expert about that and said that it is the best solution for this particular case; once again I'm agree that the endianess and alignment is very important with arrays. The question was more about why does the second option compile and generate strange behaviour – Jaay Jun 12 '13 at 11:41

It'sPete · Answer 2 · 2013-06-12T08:53:04.297

-1

I would suggest to manually read out each element in your char array and build each pair of chars into a short. I know that there are probably faster methods, but you are implicitly stepping over the bounds of the type you have specific in the char array otherwise. Also, I would claim that my version is slightly easier to read in terms of your intent, albeit more kludgy.

That is, something like this:

IGNORE THIS FIRST CODE BLOCK... THIS IS WHAT HAPPENS WHEN I DON'T SLEEP!

j=0;
uint16_t temp;  // recommend to use these types for portability
for(i=0; i<8; i+=2) {
  temp = 0x0000;
  temp += (uint16_t)(char_array[i] << 8);
  temp += (uint16_t)(char_array[i+1]);
  short_array[j] = temp;
  j++;
}

Also, here's a faster method I came up with after thinking about the problem a bit more, and giving myself a well-deserved self-imposed code review.

j=0;
  for(i=0; i<8; i+=2) {
  short_array[j] = (char_array[i] << 8) | char_array[i+1];  // use bitwise operations
  j++;
}

edited Jun 12 '13 at 08:53

answered Jun 12 '13 at 08:48

It'sPete

5,083
8
39
72

"implicitly stepping over the bounds of the type", what does that mean? My first code work perfectly this is not the problem – Jaay Jun 12 '13 at 08:51
The way I look at it, when you create a char array, you've made a claim in your code that you are going to access each byte at a time. By reading out two bytes at a time, you've effectively destroyed the barrier that you defined by stating that the array was divided by chars. This isn't a compiler issue, but more of a readability and style issue. – It'sPete Jun 12 '13 at 08:56
This is much more efficient by doing a short cast on the array since char_values are stored next to each others, but your answer is good too – Jaay Jun 12 '13 at 09:01
1

I disagree. By using the cast you are assuming that everything in memory is aligned correctly. Using the bitwise operators gets you around that issue. See this: http://stackoverflow.com/questions/300808/c-how-to-cast-2-bytes-in-an-array-to-an-unsigned-short – It'sPete Jun 12 '13 at 09:18
You are right again, but I'm not working on a PC code, it is embedded and we have a perfect control of what we are writting in memory; in this case char_array is located in NVM and we make sure there is no alignment problem, but you are absolutely correct – Jaay Jun 12 '13 at 09:23

Strange behaviour when casting array char to array short

2 Answers2