
I'm aware that in C you may write beyond the end of allocated memory, and that instead of crashing this just leads to undefined behaviour, but somehow after testing many times, even with loops, and other variables, the output is always exactly as expected.

Specifically, I've been writing an int through a pointer to the single byte returned by malloc(1), like so:

int *x = malloc(1);
*x = 123456789;

The value is small enough to fit in 4 bytes (my compiler warns me that it will overflow if it's too large, which makes sense), but it is clearly larger than one byte, and yet it still somehow works. I haven't been able to run a single test that didn't either work in a very "defined"-looking manner or segfault immediately. Such tests include repeatedly recompiling and running the program and printing the value of x, trying to write over it with a giant array, and trying to write over it with an array of length 0 by going beyond its boundaries.

After seeing this, I immediately went and tried to edit a string literal, which should be read-only. But somehow, it worked, and seemed consistent also.

Can someone recommend a test I may use to demonstrate undefined behaviour? Is my compiler (Mingw64 on Windows 10) somehow doing something to make up for my perceived stupidity? Where are the nasal demons?

tree
  • 5
    Undefined Behavior means it might behave right and it might not. – Paul Ogilvie Feb 03 '20 at 10:17
  • That's the *beauty* of Undefined Behaviour. Learn to avoid UB [at all costs]. Do not rely on compiler or program output. – pmg Feb 03 '20 at 10:17
  • 1
    Probably malloc gave you a bit more than one byte, because that could have been easier for malloc. On another platform or compiler, it might not work like that. – Paul Ogilvie Feb 03 '20 at 10:19
  • @PaulOgilvie Yeah, I know, and this situation won't ever occur in real life, but I'm here to learn, and I want to know why it keeps working. – tree Feb 03 '20 at 10:27
  • The nasal demons are not guaranteed. It's not even certain that *anyone* has experienced them. The behaviour is undefined, just like when you leave the parking brake off. The car might roll away, it might not, or something else unexpected might happen. No-one can devise a reliable demonstration of how to experience nasal demons, because that would be a defined situation, and it is not. But if you want to damage something, you could perhaps do it more easily by breaking the bounds of a local array, which is probably on the stack. – Weather Vane Feb 03 '20 at 10:30
  • 1
    Duplicate: [How to explain undefined behavior to know-it-all newbies?](https://stackoverflow.com/questions/2235457/how-to-explain-undefined-behavior-to-know-it-all-newbies). As for how to make the code crash, make sure that you write, read and print. Desktop OSes don't actually allocate heap memory until it is used. – Lundin Feb 03 '20 at 10:47
  • Instead of `*x = 123456789;` try `x[100000] = 123456789;` – dbush Feb 06 '20 at 20:09
  • 1
    "Where are the nasal demons?" -- they are trying to lure you into dangerous waters: don't listen to their siren song.... – ad absurdum Feb 06 '20 at 20:36

2 Answers


The term "Undefined Behavior" embodies two different concepts: actions whose behavior isn't specified by anything, and actions whose behavior isn't specified by the C Standard, but is specified by many implementations. While some people, including the maintainers of some compilers, refuse to acknowledge the existence of the second category, the authors of the Standard described it explicitly:

"Undefined behavior gives the implementor license not to catch certain program errors that are difficult to diagnose. It also identifies areas of possible conforming language extension: the implementor may augment the language by providing a definition of the officially undefined behavior."

On most implementations, your program would be an example of the first kind. Implementations will typically, for their own convenience, pad small allocation requests up to a certain minimum size, and will also pad larger allocation requests if needed to make them a multiple of a certain size. They generally do not document this behavior, however. Your code should only be expected to behave meaningfully on an implementation which documents the behavior of malloc in sufficient detail to guarantee that the requisite amount of space will be available; on such an implementation, your code would invoke UB of the second type.

Many kinds of tasks would be impossible or impractical without exploiting the second kind of UB, but such exploitation generally requires disabling certain compiler optimizations and diagnostic features. I can't think of any reason why code that wanted space for 4 bytes would only malloc one, unless it was designed to test the behavior of an allocator which would use the storage immediately past the end of an allocation for a particular documented purpose.

supercat

One of the trademarks of undefined behavior is that the same code can behave differently on different compilers or with different compiler settings.

Given this code:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *x = malloc(1);       /* only one byte is requested */
    x[100000] = 123456789;    /* undefined: far outside the allocation */
    return 0;
}

If I compile this on my local machine with -O0 and run it, the code segfaults. If I compile with -O3, it doesn't, most likely because the optimizer discards the store, whose result is never read.

[dbush@centos72 ~]$ gcc -O0 -Wall -Wextra -o x1 x1.c 
[dbush@centos72 ~]$ ./x1
Segmentation fault (core dumped)
[dbush@centos72 ~]$ gcc -O3 -Wall -Wextra -o x1 x1.c 
[dbush@centos72 ~]$ ./x1
[dbush@centos72 ~]$ 

Of course, this is just on my machine. Yours may do something entirely different.

dbush
  • 1
    Other trademarks include behaving erratically when a seemingly unrelated line of code is added, moved, or removed ;) – ad absurdum Feb 06 '20 at 20:39