5

Working my way through a C tutorial

#include <stdio.h>

int main() {
  short s = 10;
  int i = *(int *)&s; // wonder about this
  printf("%i", i);
  return 0;
}

When I tell C that the address of s points to an int, should it not read 4 bytes, starting at the first of the 2 bytes that belong to s?

If so, is this not critically dangerous? The short only occupies 2 bytes, so I don't know what the other 2 bytes contain.

Should this not crash for trying to access memory that I haven't been assigned and that doesn't belong to me?

Kir Chou
Mâtt Frëëman

6 Answers

5
  1. Don't do that ever
  2. Throw away the tutorial if it teaches/preaches that.

As you pointed out, it will read more bytes than were actually allocated for s, so it picks up garbage from memory that does not belong to your variable.

In fact it is dangerous: it breaks the strict aliasing rule (details below) and causes undefined behavior.
The compiler may give you a warning like this:

warning: dereferencing type-punned pointer will break strict-aliasing rules

And you should always listen to your compiler when it cries out that warning.


[Detail]

Strict aliasing is an assumption, made by the C (or C++) compiler, that dereferencing pointers to objects of different types will never refer to the same memory location (i.e. alias each other.)

The exception to the rule is a pointer to a character type (char*, signed char*, or unsigned char*), which is allowed to alias an object of any type.
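As a rough sketch of the well-defined alternatives (the helper names `widen`, `short_bytes`, and `sum_bytes` are made up for this illustration): an ordinary conversion widens the value, while `memcpy` or an `unsigned char` pointer lets you look at the bytes of the short without breaking strict aliasing and without reading past the object.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: widen a short to int by ordinary conversion.
   No pointer cast is involved, and the value is preserved. */
int widen(short s) {
    return s;
}

/* Hypothetical helper: copy the object representation of a short into
   a caller-supplied byte buffer. At most sizeof(short) bytes are read,
   so nothing outside s is touched. Returns the number of bytes copied. */
size_t short_bytes(short s, unsigned char *out, size_t out_len) {
    size_t n = sizeof s < out_len ? sizeof s : out_len;
    memcpy(out, &s, n);
    return n;
}

/* Hypothetical helper: add up the bytes of a short through an
   unsigned char pointer, which may alias an object of any type. */
unsigned sum_bytes(const short *p) {
    const unsigned char *b = (const unsigned char *)p;
    unsigned total = 0;
    for (size_t i = 0; i < sizeof *p; i++)
        total += b[i];
    return total;
}
```

Calling `widen(s)` from `main` in place of `*(int *)&s` gives the value the tutorial presumably wanted, with no undefined behavior.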

Community
Alok Save
3

First of all, never do this.

As to why it doesn't crash: since s is a local, it's allocated on the stack. If short and int have different sizes on your architecture (which is not a given), you will probably end up reading a few more bytes from memory on the same page as the stack, so there will be no access violation (even though you will read garbage).

Probably.

Jon
1

This is dangerous and undefined behaviour, just as you said.

The reason why it doesn't crash on 32-bit (or 64-bit) platforms is that most compilers allocate at least 32 bits for each stack variable. This makes access faster, but on an 8-bit processor, for example, you would get garbage data in the upper bits instead.

jpa
1

No, it's not going to crash your program; however, it is going to read a portion of other variables (or possibly garbage) on the stack. I don't know what tutorial you got this from, but that kind of code is scary.

Chris Eberle
1

First of all, all addresses are the same size: on a 64-bit architecture, each char *, short *, or int * takes 8 bytes. A star applied directly to an ampersand cancels it out, so *&x is semantically equivalent to just x.
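One caveat worth sketching here: the cancellation only holds when `*` is applied directly to `&x`. With a cast in between, as in `*(int *)&s`, the object is read through a different type, which is exactly what the question is about. The helper name `star_cancels_ampersand` below is hypothetical and only shows the well-defined case.

```c
/* Hypothetical helper: returns 1 when *&x reads back the same value
   as x. With no cast in between, *&x designates x itself, so this
   always holds. */
int star_cancels_ampersand(short x) {
    short *p = &x;          /* pointer of the matching type */
    return *&x == x && *p == x;
}
```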

Henrique Rocha
  • It is *not* guaranteed that all pointer types have the same size. It may be the case on specific architectures, but you cannot rely on it being universally true. Never make that assumption. – John Bode Sep 18 '11 at 18:34
  • A particular example where function pointers differ in size from other pointers is in [Are there any platforms where pointers to different types have different sizes?](http://stackoverflow.com/q/916051/2509), though the standard allows for worse cases... the only guarantee is that `sizeof (void*)` is big enough to hold any other non-function pointer type. – dmckee --- ex-moderator kitten Sep 18 '11 at 19:12
  • That seems strange, to give a short and a char 8 bytes? Is this universal, and at what level? Presumably the compiler? – Mâtt Frëëman Sep 19 '11 at 01:32
  • @wtfcoder You're not giving 8 bytes to both short and char. A pointer contains a memory address, and an address size is (should be?) independent of the type that it contains. – Henrique Rocha Sep 19 '11 at 09:44
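To check what the comments describe on a given platform, something like the following sketch could be used. The function names are made up for illustration, and the printed numbers depend entirely on the compiler and target, so no particular output is implied.

```c
#include <stdio.h>

/* Hypothetical helper: report whether three common object pointer
   types happen to share one size on this platform. The C standard
   does not promise that they do. Returns 1 if they match, 0 if not. */
int object_pointers_share_size(void) {
    return sizeof(char *) == sizeof(short *)
        && sizeof(short *) == sizeof(int *);
}

/* Hypothetical helper: print the sizes so they can be inspected.
   The size of the pointer is independent of the size of the type
   it points to. */
void print_pointer_sizes(void) {
    printf("char *:           %zu\n", sizeof(char *));
    printf("short *:          %zu\n", sizeof(short *));
    printf("int *:            %zu\n", sizeof(int *));
    printf("void *:           %zu\n", sizeof(void *));
    printf("function pointer: %zu\n", sizeof(void (*)(void)));
}
```

On common desktop targets all of these are likely to print the same number, but that is a property of the platform rather than of the language.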
1

Basically you are right: since you are dereferencing an int * pointer, this will fetch 4 bytes instead of only the 2 reserved for the storage of 's', and the result won't be a faithful reflection of the value of 's'.

However this most likely won't crash since 's' is located on the stack so depending on how your stack is laid out at this point, you will most likely read data pushed during the 'main' function prologue...

For a program to crash on an invalid memory read, it has to access a memory region that is not mapped. That triggers a page fault at the kernel level, which surfaces as a 'segmentation fault' at the user-space level. By 'mapped' I mean that there is a known mapping between a virtual memory region and a physical memory region (the operating system handles such mappings). That is why dereferencing a NULL pointer raises such an exception: there is no valid mapping for it at the user-space level.

A valid mapping is usually given to you by calling something like malloc() (note that malloc() is not a syscall but a smart wrapper around system calls that manages your virtual memory blocks). Your stack is no exception, since it is just memory like anything else, but an area is pre-mapped for you, so when you create a local variable in a block you don't have to worry about its memory location. In this case you are not reading far enough to reach anything unmapped.

Now let's say you do something like that:

short s = 10;
int *i = (int *)&s;
*i = -1;

In this case your program is more likely to crash, since you start overwriting data. Depending on the data you touch, the effect might range from harmless program misbehavior to a crash, for instance if you overwrite the return address pushed on the stack... Data corruption is to me one of the hardest (if not the hardest) categories of bugs to deal with, since it can affect your system seemingly at random and might show up long after the offending instructions were actually executed.
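The overwrite can be observed without undefined behavior by making the "neighbour" explicit. This is only a sketch under assumptions: the helper name `neighbour_after_wide_write` is made up, the two-element array stands in for whatever happens to sit next to 's' on a real stack, and the copy is clamped so it never leaves the array.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes the bytes of an int holding -1 over an
   array of two shorts, starting at the first element, and returns the
   second element afterwards. The copy length is clamped to the size
   of the array, so the write stays inside memory we own. */
short neighbour_after_wide_write(void) {
    short pair[2] = {10, 99};   /* pair[0] plays 's', pair[1] its neighbour */
    int wide = -1;
    size_t n = sizeof wide < sizeof pair ? sizeof wide : sizeof pair;

    memcpy(pair, &wide, n);     /* an int-sized write aimed at pair[0] */
    return pair[1];
}
```

Where int is twice the size of short, the returned neighbour is no longer 99, even though the write was only aimed at the first element. In the pointer-cast version above there is no such array, so the bytes that get clobbered belong to something else entirely.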

If you want to understand more about internal memory management, you probably want to look into Virtual Memory Management in Operating System designs.

Hope it helps,