1
#include<stdio.h>
#include<stdlib.h>

static char mem[4];

int main()
{
    int* A = (int*)(mem);

    mem[0] = 0;
    mem[1] = 1;
    mem[2] = 2;
    mem[3] = 3;

    int i;
    A[0] = 5;

    for (i = 0; i<4; i++)
    {
        printf("%d\t", mem[i]);
    }

    printf("\n");
    return 0;
}

My question is, why does this print 5 0 0 0 rather than 5 1 2 3? Why is the array "wiped out?"

machine_1
Aperson123

4 Answers

6

In your case,

A[0] = 5;

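writes 5 as an int, which is larger than a char (typically 2, 4, or 8 bytes, depending on the platform).

Your system appears to be little-endian with a 4-byte int, so the least significant byte (5) lands in the first char and the remaining bytes are written as zeroes.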

Note that if sizeof(int) is 8 (as it is on some systems), your code is unsafe and triggers undefined behaviour, because it writes past the end of the mem array (not to mention possible misalignment issues that may slow down operations or even crash on some processors).

This is why we must respect the strict aliasing rule and avoid "lying to the compiler": for instance, use a union so that the compiler can take care of alignment and sizes.

This other Q&A is related: What happens when you cast a char * address to int * in C when the address is not word-aligned?

Jean-François Fabre
  • 1
    I am curious; I normally don't write code like this. Is this a violation of the strict aliasing rule? – machine_1 Feb 15 '18 at 19:55
  • @machine_1: It *looks* like a violation, but C and C++ have an *exception* to strict aliasing for `char*` because otherwise reading and writing from disk files or sending over network sockets, and other byte-oriented things like that would not work. – Zan Lynx Feb 15 '18 at 20:12
  • 4
    @ZanLynx - It doesn't look like one, it *is* one. You can use a `char*` to access any object type. You may not use an `int*` to access an object declared as `char[4]`. It's not the same. – StoryTeller - Unslander Monica Feb 15 '18 at 20:14
  • @StoryTeller: Of course you can. How else do you think an `fread` into a char buffer works when you then access it via a struct pointer? – Zan Lynx Feb 15 '18 at 20:21
  • 2
    @ZanLynx - I call it undefined behavior. An implementation is allowed to make it defined, but it's not ISO C that makes it okay – StoryTeller - Unslander Monica Feb 15 '18 at 20:27
  • @StoryTeller: Look at section 6.5.7 in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf Note that access by "a character type". Also above that bit is this quote "If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type[...]" – Zan Lynx Feb 15 '18 at 20:38
  • 2
    @ZanLynx - Access by a character type, and the object *having* a character (array) type, is not the same thing. The OP does access via an `int` type when they write 5. – StoryTeller - Unslander Monica Feb 15 '18 at 20:41
  • @StoryTeller But when the code writes bytes into the character array, that is exactly the same as copying an array of character type. Then using that character array as an integer is perfectly fine. – Zan Lynx Feb 15 '18 at 20:46
  • @ZanLynx `fread()` does not specify use of a `char*` buffer or a character array. – chux - Reinstate Monica Feb 15 '18 at 20:50
  • 1
    @ZanLynx - The C abstract machine doesn't say an access "writes bytes". 6.5.7 is quite clear in what types of access are valid and which are not. Read it carefully, and try not to project an implementation onto the standards wording. I'm not about to spend more time mulling this. Read the standard, I have nothing more to say. – StoryTeller - Unslander Monica Feb 15 '18 at 20:50
  • 2
    @ZanLynx: `A[0]=5` does not “write bytes into the character array.” In the terminology of the C standard, it accesses the objects of the array of `char` through an lvalue of type `int`, and this is expressly prohibited by [C 2011 (N1570) 6.5 7](https://stackoverflow.com/a/48800036/298225). – Eric Postpischil Feb 15 '18 at 21:00
2

Because on your system sizeof(int) is 4 times sizeof(char) and A is a pointer to int, the assignment A[0] = 5 writes 4 bytes and so overwrites everything you stored in mem. This is the typical case, but an int can be larger; if sizeof(int) is greater than 4, the behavior is undefined.

Also keep in mind that the order in which the bytes are written depends on endianness. Your system is little-endian; on a big-endian system the result would differ. Either way, code like this is useful only as an experiment.

To see why it is UB when sizeof(int) is greater than 4: the write through A[0] would then modify memory beyond the end of the mem array, which is undefined behavior.

user2736738
2

Your code exhibits undefined behavior, because you access an object declared as char[4] through an int pointer. This violates the strict aliasing rule, so your code is simply erroneous.

A pointer to a character type may alias an object of any other type, but not the other way around. So make sure not to violate the language's rules.

machine_1
0

int* A = (int*)(mem); --> risks undefined behavior per "If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined." C11dr §6.3.2.3 7

So the rest of the code is moot.

Use a union to cope with alignment and aliasing concerns.

#include <stdio.h>

int main(void) {
  union {
    int i;
    unsigned char mem[sizeof(int)];
  } u;

  for (unsigned i = 0; i < sizeof(int); i++) {
    u.mem[i] = i;
    printf("%hhu\t", u.mem[i]);
  }
  printf("\n");

  u.i = 5;

  for (unsigned i = 0; i < sizeof(int); i++) {
    printf("%hhu\t", u.mem[i]);
  }
  printf("\n");
  return 0;
}

Output (may vary on your machine)

0   1   2   3   
5   0   0   0

Why: Now that the undefined behavior (UB) is removed from the code, Jean-François Fabre's good answer explains the output.

chux - Reinstate Monica