-2

I am creating an int array and then tricking C into believing that it's an array of short values. I know it's not good practice, but I am just trying to understand why this isn't working. Shouldn't this change the value of arr[3]?

#include <stdio.h>

int main() {
    printf("Hello, World!\n");
    int arr[5];
    arr[0] = 0; arr[1] = 0; arr[2] = 0; arr[4] = 0;
    arr[3] = 128;
    ((short*)arr)[6] = 128; // Shouldn't this change arr[3]? Indices 6 and 7 in the array of shorts would comprise arr[3] in the array of ints.
    int i = 0;
    for (i = 0; i < 5; i++){
        printf("%d\n", arr[i]);
    }
    return 0;
}

PS: Here's a deeper clarification: When I cast int array to a short array, it seemingly becomes an array of 10 short elements (not 5). So when I change arr[6], I am changing only the first 16 bits of the int arr[3]. So arr[3] should still change and it is NOT that I am changing it to 128 again and not seeing the change.

FOR CLARIFICATION: THIS CODE IS ONLY FOR EXPERIMENTAL REASONS! I AM JUST LEARNING HOW POINTERS WORK AND I GET THAT IT'S NOT GOOD PRACTICE.

Manan Mehta
  • 2
    How would you know it doesn't since both assignments set it to the same value (128)? Wouldn't it make sense for the second assignment to at least use a different value so that you can see any change? – kaylum Mar 30 '17 at 21:58
  • 2
    Where is it stated that a `short` is half the size of an `int`? – PaulMcKenzie Mar 30 '17 at 21:59
  • 2
    C++ Undefined Behaviour – Richard Critten Mar 30 '17 at 22:00
  • @kaylum Nope, that's not true. When I cast it to a short array, it seemingly becomes an array of 10 short elements. When I change the one at index 6, I am changing only half of arr[3]. Check out https://www.youtube.com/watch?v=H4MQXBF6FN4&index=3&list=PL9D558D49CA734A02 at 27.50 to see what I mean. – Manan Mehta Mar 30 '17 at 22:04
  • @MananMehta What's not true? I'm just pointing out your test code can be improved. See the second comment for a hint on what your real problem may be. – kaylum Mar 30 '17 at 22:05
  • @PaulMcKenzie I printed sizeof(short) which was two bytes and sizeof(int) which is four bytes – Manan Mehta Mar 30 '17 at 22:07
  • @kaylum Sorry, what I meant was that I would know if it worked since even though I am setting it to the same value 128, I am assigning 128 to the first 16 bits of the actual int when I do arr[6] = 128. The second set of 16 bits remain 128 (from when I do arr[3] = 128). – Manan Mehta Mar 30 '17 at 22:11
  • It is not only "not good practice", but invokes undefined behaviour for violating effective type (aka strict aliasing) rule. The compiler is free to do anything, including formatting your disc. – too honest for this site Mar 30 '17 at 23:25
  • @MananMehta The sizeof() fundamental types, besides `sizeof(char)` are not set in stone. Run that code against another compiler where [`sizeof(short)` is not half of `sizeof(int)`](http://stackoverflow.com/questions/20109984/c-c-sizeofshort-sizeofint-sizeoflong-sizeoflong-long-etc-on-a), and you get differing results. – PaulMcKenzie Mar 31 '17 at 03:39
  • Compilers are not required to recognize any particular behavior of the code, but quality compilers suitable for systems programming can be configured to do so (if they don't by default). You didn't specify your platform or compiler, so it's impossible to say what yours would require. Also, btw, some platforms have `int` as a 16-bit type, the same size as `short`, and on those implementations your cross-type aliasing would access non-existent element `arr[6]`. – supercat Mar 31 '17 at 16:24
  • Even absent size considerations of `int` vs `short`, another consideration that will affect your results is padding. As many CPUs access 'non-aligned' data slowly, or not at all, many compilers will add padding bytes to structures so elements will fall on the proper address boundaries for faster access. If you didn't want that behavior, you often have to invoke some sort of pragma to turn it off. – infixed Apr 06 '17 at 22:02

3 Answers

2

Your code has undefined behavior, because you are writing a datum with a declared type through a pointer to a different type, and the different type is not char.

int arr[5];
/* ... */
((short*)arr)[6] = /* ANYTHING */;

The compiler is entitled to generate machine code that doesn't include the write to ((short*)arr)[6] at all, and this is quite likely with modern compilers. It's also entitled to delete the entire body of main on the theory that all possible executions of the program provoke undefined behavior, therefore the program will never actually be run.

(Some people say that when you write a program with undefined behavior, the C compiler is entitled to make demons fly out of your nose, but as a retired compiler developer I can assure you that most C compilers can't actually do that.)
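For experimenting, a defined way to get the effect the question is after is to copy bytes with memcpy instead of writing through a short pointer. A minimal sketch, assuming a hypothetical helper named store_short_at (it is not part of the question's code):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper for illustration: overwrite the short-sized chunk at
 * position short_index within the storage of an int array.
 * memcpy copies bytes, which is allowed for objects of any type, so this
 * avoids the aliasing violation of ((short*)arr)[6] = value. */
void store_short_at(int *arr, size_t n_ints, size_t short_index, short value)
{
    /* The chunk must lie entirely inside the array's storage. */
    assert((short_index + 1) * sizeof(short) <= n_ints * sizeof(int));
    memcpy((unsigned char *)arr + short_index * sizeof(short),
           &value, sizeof value);
}
```

Calling store_short_at(arr, 5, 6, 72) on the question's array changes arr[3]; the value it ends up with still depends on the sizes of short and int and on the byte order of the machine.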

zwol
  • Compilers are not required to recognize cross-type aliasing, but quality compilers suitable for system programming will be able to do so (perhaps via a configuration option). In cases where there would be some genuine advantage to exploiting cross-type aliasing, it is generally reasonable for a programmer to specify that it must be used on an implementation that would support it rather than try to work around the possible lack of such support. – supercat Mar 31 '17 at 16:23
  • @supercat _Must_ you bring up your beef with modern C compilers in general in _every_ C answer that touches on undefined behavior? I would take it as a personal favor if you would at least refrain from doing so in the comments to my answers, in the future. – zwol Mar 31 '17 at 20:15
0

It is changing arr[3]; however, you are setting it back to 128, so you aren't noticing a change. Change the line to:

((short*)arr)[6] = 72;

and you should see the following output:

[screenshot of the program output, not preserved]

Also a couple of things to clean up if you are new to C. You can initialize an array to zero by doing the following.

...
int arr[5] = { 0 };
arr[3] = 128;
...

Hope this helps!

marclave
  • Nope, this doesn't solve it. When I cast it to a short array, it seemingly becomes an array of 10 short elements. When I change the one at index 6, I am changing only half of arr[3]. Check out https://www.youtube.com/watch?v=H4MQXBF6FN4&index=3&list=PL9D558D49CA734A02 at 27.50 to see what I mean. – Manan Mehta Mar 30 '17 at 22:03
  • 1) Don't post images of text. 2) Anything can occur. – too honest for this site Mar 30 '17 at 23:26
0

Have you considered endianness?

EDIT: Now to add more clarity ...

As others have mentioned in the comments, this is most definitely undefined behavior! This is not just "not good practice"; simply don't do it!

Pointers on C is an excellent book that goes over everything you wanted to know about pointers and more. It's dated but still very relevant. You can probably find most of the information online, but I haven't seen many books that deal with pointers as completely as this one.

Though it sounds like you are experimenting, possibly as part of a class. So, here are a number of things wrong with this code:

  • endianness
  • memory access model
  • assumption of type size
  • assumption of hardware architecture
  • cross type casting

Remember, even though C is considered a pretty low level language today, it is still a high level programming language that affords many key abstractions.

Now, look at your declaration again.

int arr[5]; 

You've allocated 5 ints grouped together and accessed via a common variable named arr. By the standard, the array is 5 elements of at least 16 bits (2 bytes on typical platforms) per element, with a base address of &arr[0]. So, you aren't guaranteed that an int is 2 bytes, or 4 bytes, or whatever. Likewise, short is defined by the standard as at least 16 bits. However, a short is not an int even if they have the same byte width! Remember, C is strongly typed.
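Rather than assuming the sizes, you can ask the implementation. A small sketch; the helper names are made up for illustration, and static_assert needs a C11 (or later) compiler:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* The standard only promises minimum ranges; these checks restate them
 * in terms of storage size (sizeof counts bytes, CHAR_BIT is bits per byte). */
static_assert(sizeof(short) * CHAR_BIT >= 16, "short holds at least 16 bits");
static_assert(sizeof(int) * CHAR_BIT >= 16, "int holds at least 16 bits");

/* Storage size of each type in bits on this implementation. */
size_t short_bits(void) { return sizeof(short) * CHAR_BIT; }
size_t int_bits(void)   { return sizeof(int) * CHAR_BIT; }

/* Hypothetical helper: how many short-sized chunks fit in the storage
 * of an array of n_ints ints. Only 10 for n_ints == 5 when an int is
 * exactly twice the size of a short. */
size_t shorts_in_int_array(size_t n_ints)
{
    return n_ints * sizeof(int) / sizeof(short);
}
```

If shorts_in_int_array(5) is smaller than 7 on some platform, index 6 in the question would be past the end of the array's storage.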

Now, it looks like you are running on a machine where shorts are 2 bytes and ints are 4 bytes. That is where the endianness issue comes into play: where is your most significant bit? And where is your most significant byte?
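To see which part of an int comes first in memory on your machine, you can copy out its bytes, which is permitted for any object. A sketch with hypothetical helper names:

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the least significant byte of an int is stored at the
 * lowest address (little-endian), 0 otherwise. */
int low_byte_first(void)
{
    int probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* copy only the byte at the lowest address */
    return first == 1;
}

/* Hypothetical helper: copy out the short-sized chunk that sits at the
 * start of an int's storage. Assumes a short is no larger than an int. */
short first_short_of(int value)
{
    short s;
    assert(sizeof(short) <= sizeof(int));
    memcpy(&s, &value, sizeof s);
    return s;
}
```

On a little-endian machine with 2-byte shorts and 4-byte ints, first_short_of(128) yields 128, because the chunk at the lower address is the low-order half. In the question, short index 6 corresponds to that same chunk of arr[3], so storing 128 there would leave an arr[3] that already holds 128 looking unchanged.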

First of all, casting the address of arr to a short pointer breaks both the type and the memory access model. Then, you want to access the element at index 6 from the start of arr. However, you aren't accessing it relative to the int type you declared arr with; you are accessing it through a short pointer that points at the same address as arr!

The following operations ARE NOT the same! And this also falls into the category of undefined behavior - don't do this ever!

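int foo;
int *pfooInt;
short bar;
short *pfooShort;

bar = (short) foo;               /* converts the value of foo to short */
pfooShort = (short*)&foo;        /* reinterprets the address of foo as a short pointer */
pfooInt = &foo;                  /* ordinary pointer to foo */
bar = *pfooShort;                /* reads foo's bytes through a short pointer */
pfooShort = (short*)pfooInt[0];  /* converts the value of foo itself to a pointer */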

Another thing to clarify for you:

int arr[5];

((short *)arr)[6] ...

This does not transform your int array of 5 elements into a short array with 10 elements. arr is still an int array of 5 elements. You just broke the access method and are trying to modify memory in an undefined manner. What you did is tell the compiler "ignore what I told you about arr previously, treat arr as a short pointer for the life of this statement and access/modify the short at index 6 relative to this pointer."
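If what you want is to look at the same bytes as shorts, one defined option is to copy them into a real short array and leave arr alone. A minimal sketch, assuming a hypothetical helper named copy_as_shorts:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of an int array into a separate
 * short buffer. The int array is not modified and keeps its type.
 * out must have room for n_ints * sizeof(int) / sizeof(short) shorts.
 * Returns the number of shorts written. */
size_t copy_as_shorts(const int *arr, size_t n_ints, short *out, size_t out_len)
{
    size_t n_shorts = n_ints * sizeof(int) / sizeof(short);
    assert(n_shorts <= out_len);
    memcpy(out, arr, n_shorts * sizeof(short));
    return n_shorts;
}
```

Printing the copied shorts for the question's array shows which short-sized chunk holds the 128; which one that is depends on the type sizes and byte order of your platform.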

B Keenan
  • You're a genius, mate! That fixed it! – Manan Mehta Mar 30 '17 at 22:22
  • For sure, it isn't good practice but this is just an attempt to understand what c does under the hood. It's taught in Stanford's CS107 course. Take it up with them if you find it stupid. – Manan Mehta Mar 30 '17 at 23:29
  • 1
    @Olaf: A compiler would only be free to do anything if it does not document the behavior of cross-type aliasing. Quality compilers suitable for systems programming will document such behavior, at least when invoked with certain configuration options. – supercat Mar 31 '17 at 16:27
  • @supercat: This is not documented by any compiler completely, as it would restrict optimisations unnecessarily. The standard intentionally does not require this. And without a specific implementation and all information there is no use in exploring what actually happens. You might have noticed the question does not specify any of those, leaving the standard's defined behaviour, which is "undefined". Programmers relying on such code to work are asking for trouble. – too honest for this site Mar 31 '17 at 17:11
  • @Olaf: What is the purpose and effect of the `-fno-strict-aliasing` flag on gcc and clang? For some application fields, the value of cross-type aliasing exceeds the opportunity cost of foregone optimizations, especially in cases where the "optimizations" would be guaranteed to make code useless. There is value in allowing compilers to assume that a `short*` won't affect an `int` in cases where there would be no *particular* reason to believe that it might do so. In cases like this, however, it would likely be easier for a compiler to recognize that e.g.... – supercat Mar 31 '17 at 18:07
  • `*(uint16_t*)&someUint32Array[x] = 0x8000;` may affect `someUint32Array[x]` than for it to recognize that `someUint32Array[x] = (someUint32Array[x] & 0xFFFF0000) | 0x8000;` doesn't need to fetch a value, mask it, set bit 15, and write it back, but could instead just set the low word to 0x8000. In cases where an object's address is cast and the dereferenced lvalue is accessed in the same expression, I see no plausible optimization opportunities which would need to be foregone to recognize the aliasing. – supercat Mar 31 '17 at 18:16
  • @supercat: The effective type rule has more implications than just aliasing. – too honest for this site Mar 31 '17 at 18:35
  • @Olaf: What is the purpose of `-fno-strict-aliasing`? – supercat Mar 31 '17 at 19:07
  • 1
    I just wanted to add that pointing out endianness issues with this code did not "fix" it, as @Olaf and others point out. The code is not standard-conforming, and its behavior is undefined. Take this as an opportunity to learn what not to do, and why it is bad practice to write undefined code. In this case, the code modified bytes that you were not expecting to be modified, which is never a good thing. – B Keenan Apr 12 '17 at 13:26
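The mask-and-or form quoted in the comments above avoids pointer casts altogether. A minimal sketch using fixed-width types; the function names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace the low 16 bits of a 32-bit value using
 * arithmetic on the value itself. No pointer cast is involved, and the
 * result does not depend on byte order. */
uint32_t set_low16(uint32_t value, uint16_t low)
{
    return (value & UINT32_C(0xFFFF0000)) | low;
}

/* Small self-check: the upper half is kept, the lower half is replaced. */
void set_low16_demo(void)
{
    assert(set_low16(UINT32_C(0x12345678), 0x8000) == UINT32_C(0x12348000));
}
```

Because this works on the value rather than on its bytes in memory, "low 16 bits" means the same thing on every platform, which is not true of the short at index 6.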