35

In the following code, *(long*)0=0; is used along with the if clause, but what is its purpose?

if(r.wid*r.ht < tot)
    *(long*)0=0;
Peter Mortensen
yatendra
  • 1
    This statement should have been a macro with a descriptive name. (It's one of the few good reasons for a macro. You want the crash to happen in the correct function so the stack frame is meaningful) – MSalters Jan 27 '14 at 11:05
  • C++ or C? I don't see any C++ features, but the standards may differ. – 11684 Jan 27 '14 at 13:51
  • Have you actually tried this? Last time I did, my compiler simply left the whole statement out. Perfectly legal according to the C standard. – ntoskrnl Jan 27 '14 at 14:43
  • 2
    A modern compiler might remove that whole piece of code. The `then` part is undefined behaviour and thus the compiler can assume it's unreachable. Assuming the `if` expression has no side effect, the whole `if` statement can be removed. – CodesInChaos Jan 27 '14 at 14:45
  • Redis uses a similar trick to simulate a seg fault so that you can pick up from there in the debugger; see [What does “*((char*)-1) = 'x';” code mean?](http://stackoverflow.com/questions/20844863/what-does-char-1-x-code-mean). As I said in my answer to the linked question, it is undefined behavior, and I would not be surprised if this was optimized out in some cases. The previous thread [What is the simplest standard conform way to produce a Segfault in C?](http://stackoverflow.com/questions/18986351/what-is-the-simplest-standard-conform-way-to-produce-a-segfault-in-c) is also apropos. – Shafik Yaghmour Jan 28 '14 at 14:29
  • @CodesInChaos indeed it can be optimized away, John Regehr has some good articles on the topic for example [Finding Undefined Behavior Bugs by Finding Dead Code](http://blog.regehr.org/archives/970). – Shafik Yaghmour Jan 28 '14 at 14:45
  • @yatendra depends on the platform. – Etherealone Jan 28 '14 at 19:33

4 Answers

58

It writes 0 to 0 interpreted as the address of a long, i.e. the NULL pointer. It's not a valid thing to be doing, since NULL is never an address at which you can validly have data that your program can access. This code triggers undefined behavior; you cannot rely on it to have any particular effect, in general.

However, often code like this is used to force a segmentation fault-type crash, which is sometimes handy to drop into a debugger.

Again, this is undefined behavior; there is no guarantee that it will cause such a fault, but on systems that have segmentation faults, the above code is pretty likely to generate one. On other systems it might do something completely different.

When a segfault is what you get, it's sometimes more convenient to trigger one this way than to set a breakpoint manually in the debugger. For instance, if you're not using an IDE, it's often easier to type those few tokens into the code at the desired place than to give the (textual) command to the debugger; specifying the exact source file and line number by hand can be a bit annoying.
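As a rough illustration of the points above, the statement can be spelled out in a more explicit form and run in a child process so the parent can see how the child died. This is only a sketch for a POSIX system with memory protection: the `volatile` qualifier is an addition (not in the question's code) that asks the compiler to actually perform the store, and the helper names `write_through_null`, `return_normally`, and `signal_that_killed` are made up for the example. The behavior is still undefined, so nothing here is guaranteed.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The statement from the question in a more explicit form.
   `volatile` is an assumption added for this sketch so that the
   compiler does not silently drop the store. */
void write_through_null(void)
{
    volatile long *p = (volatile long *)0;
    *p = 0; /* undefined behavior; usually faults where an MMU is active */
}

/* Control case: a function that does nothing and returns. */
void return_normally(void)
{
}

/* Runs fn in a child process (hypothetical helper).
   Returns the number of the signal that killed the child,
   0 if the child exited on its own, or -1 if fork/waitpid failed. */
int signal_that_killed(void (*fn)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        fn();
        _exit(0); /* reached only if fn did not crash */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On a typical Linux or BSD desktop, `signal_that_killed(write_through_null)` would be expected to report `SIGSEGV`, while `signal_that_killed(return_normally)` reports 0. On a system without memory protection the write may simply succeed.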

unwind
  • Can you explain more please? I still don't understand why this is useful. If you need a debugger - what about breakpoints? – VP. Jan 27 '14 at 09:03
  • 5
    Why is it guaranteed that this code will cause a segmentation fault? – Maroun Jan 27 '14 at 09:03
  • @Maroun Because the C standard doesn't state that writing to address 0x0 is undefined behavior. It is well defined. It is just saying it has to raise an exception. – dhein Jan 27 '14 at 09:04
  • 15
    Is there a reason to favor this over `assert`? – user694733 Jan 27 '14 at 09:08
  • 3
    @Zaibis The C99 draft says "If an invalid value has been assigned to the pointer, the behavior of the unary `*` operator is undefined", and clarifies this to include `NULL` pointers. – unwind Jan 27 '14 at 09:14
  • 2
    @user694733 It's simpler, i.e. it does less. An `assert()` can be compiled out, and it calls `abort()` which is not the same thing as getting a segfault at the site of the call. – unwind Jan 27 '14 at 09:16
  • Note that it's UNDEFINED, not guaranteed to do anything in particular. So, if you run this on DOS, it will overwrite the vector for the divide by zero trap handler, rather than "stop" in any meaningful way. But on machines that have memory management enabled, it is likely to cause a crash of some sort. I personally prefer to make the address and/or value written a little more recognisable, so `*(long*)42=123454678;` - that way you know it's not some random NULL pointer access. – Mats Petersson Jan 27 '14 at 09:19
  • 12
    Although @unwind's answer is perfectly correct for most general purpose machines (and he correctly points out that it is undefined behavior, and not guaranteed), such code does find a place in embedded systems, where there really is something (memory mapped IO, or dedicated memory addresses) at the address 0 (or whatever address the null pointer maps to). – James Kanze Jan 27 '14 at 09:23
  • @VictorPolevoy You do not need to run the program under a debugger. This statement causes a segmentation fault, and the image of your program is stored in the file `core`. You can then examine the values of variables at the moment of the crash with a debugger. Also, if working on a big project in production, `assert` is often redefined to do nothing. I think that you do something like this if you have no other good choice or are defining your own `assert`. – Marian Jan 27 '14 at 09:30
  • @unwind but isn't `assert(0)` kept as a debug trap? – ratchet freak Jan 27 '14 at 09:32
  • 1
    I'm sure this answer would benefit a lot from the code in question rewritten into a more explicit form such as `long* ptr = (long*)0; *ptr=0;` or something. – sharptooth Jan 27 '14 at 09:56
  • in fact many embedded systems, esp. Harvard architecture ones, allow to use 0 as a valid address – phuclv Jan 27 '14 at 10:18
  • 1
    @LưuVĩnhPhúc Yes, sure. Most systems without an MMU will be unable to detect the error, also. I mentioned that a great number of times, that the behavior is *undefined*. – unwind Jan 27 '14 at 10:23
  • 1
    @user694733 I can't see the value over `assert` either. Yes, an assertion can be compiled out, but it really _should be_ compiled out in release. The condition which triggers the assert can demonstrably not happen if the program is error-free _because the write-to-nullpointer hack will cause the program to crash_ (and, unlike an assert, with an unintelligible message!). So if it has any chance of firing, the program is broken and cannot be released. Setting a breakpoint takes seconds, rebuilding takes minutes, so I don't see the added value there either. But each to their own :) – Damon Jan 27 '14 at 11:11
  • @MatsPetersson - with modern compiler it might be not even likely with MMU as compiler can assume invalid pointer dereference will never happen. So it's perfectly legal for it to optimize it out if it knows that 0 address is illegal. – Maciej Piechotka Jan 27 '14 at 12:46
  • 3
    @Damon In the gigantic codebases I tend to work on, we tend to prefer to leave as many assertions *enabled* as possible in release builds, because it's effectively impossible to be *sure* you can't hit them via some code path that has, perchance, gotten missed despite the best efforts of QA and test development. – zwol Jan 27 '14 at 14:53
  • 1
    We leave assertions in our codebase too, and have customized the handler of them so that it doesn't just make the process keel over, it also provides some reasonably relevant debugging information and asks the user to report it as a high-priority bug. Funnily enough, we hardly ever get assertion failures these days… – Donal Fellows Jan 27 '14 at 15:47
  • note: (void*)0 does NOT have to be a null pointer... it depends on the exact architecture, there's a good discussion about it here on SO –  Jan 27 '14 at 15:49
  • Technically speaking, it's not undefined behavior in C. C11 at 6.3.2.3.3 says: _An integer constant expression with the value 0, or such an expression cast to type `void *`, is called a null pointer constant._ But here, the value 0 is _not_ cast to `void *` but to `long *`. So while `long *x = (void *)0` would yield a NULL pointer, `long *x = (long *)0` doesn't necessarily have to. This is actually quite important in microcontrollers where address 0 is addressable and there is nothing wrong with that. – Shahbaz Jan 27 '14 at 16:27
  • @vaxquis, see my previous comment. By definition of the standard, `(void *)0` is the NULL pointer. – Shahbaz Jan 27 '14 at 16:28
  • @LuchianGrigore: Yes null pointers are always represented by a literal 0 or a 0 value cast to a pointer type. Even if the machine the program is running on uses a different value on the metal, within C the null pointer is always represented by the value 0. – datenwolf Jan 27 '14 at 16:30
  • @datenwolf exactly, representation != actual value. – Luchian Grigore Jan 27 '14 at 16:38
  • @Shahbaz As I said elsewhere: no, `(long *)0` is a null pointer just as much as `(long *)(void *)0` is, because `0` all by itself is *already* a null pointer constant. For the microcontroller situation, where you are stuck with valid memory at address zero, the least-unportable way to go is declare an external symbol and then use a linker script to point that symbol to address zero. Or hack the compiler to apply a one-page offset to all addresses other than NULL itself. – zwol Jan 27 '14 at 16:43
  • http://stackoverflow.com/questions/9894013/is-null-always-zero-in-c - strictly speaking, nullptr isn't zero (since it can't be evaluated numerically), NULL doesn't have to be == 0, so NO, (void*) 0 is not *bound* to be NULL, see "NULL which expands to an implementation-defined null pointer constant The address of the NULL pointer might be different from 0, while it will behave like it was in most cases." sorry, mates, but only thing that IS defined is that "An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant" –  Jan 27 '14 at 17:49
  • @vaxquis `NULL doesn't have to be == 0` – yes it has. See http://c-faq.com/null/ptrtest.html – datenwolf Jan 27 '14 at 18:11
  • *sigh again* you only proven that the compiler has to *treat* zero as a null value and vice versa, not that zero has to be a null pointer value. please, before posting data READ IT beforehand; quote from your link "There is no trickery involved here; compilers do work this way, and generate identical code for both constructs. The internal representation of a null pointer does not matter." i.e. internally null pointer doesn't have to be == 0, it just have to be evaluated as == by the compiler, WHICH IS WHAT I ALREADY STATED. so please READ BEFORE YOU QUOTE. –  Jan 27 '14 at 18:27
  • nb that's exactly why i++ does NOT necessarily indicate that i has been increased by 1. *what if i is a pointer to a 32-bit int?* i has increased by 4... but still i++ has the same effect as i+=1 which *still doesn't increase i by 1* - the same's with using '0' literal as a null value - it doesn't mean that 0 **is** null, that is that null value actually points to 0x0 - not to mention NULL is only a macro you can easily redefine or simply NOT INCLUDE in your code - **now you get it?** that's exactly why I quoted both C++ spec & SO discussion about it. –  Jan 27 '14 at 18:29
  • nb, that's when people assume things about what you said without reading it first: I never said that (void*)0 can't be *used* as a null pointer, I just said that (void*)0 is not a null pointer per se, that is that it *doesn't have to cause a segfault* - you can easily cast it to any other type of pointer and then deref. it on any arch. mapping 0x0 validly - since the casting IS possible, which can be a PITA to debug - it only has to be *interpreted* by the compiler as a null pointer due to legacy code compatibility reasons - that's *exactly* why C++11 introduces nullptr –  Jan 27 '14 at 18:39
18

In textbook C, `abort` is the way to deliberately crash the program. However, when you're programming close to the metal, you might have to worry about the possibility of `abort` not working as intended! The standard POSIXy implementation of `abort` calls `getpid` and `kill` (via `raise`) to deliver `SIGABRT` to the process, which in turn may cause execution of a signal handler, which can do as it likes.

There are situations, e.g. deep in the guts of `malloc`, in response to catastrophic, possibly-adversarial memory corruption, where you need to force a crash without touching the stack at all (specifically, without executing a return instruction, which might jump to malicious code). `*(long *)0 = 0` is not the craziest thing to try in those circumstances. It does still risk executing a signal handler, but that's unavoidable; there is no way to trigger `SIGKILL` without making a function call.

More seriously (IMHO), modern compilers are a little too likely to see that statement, observe that it has undefined behavior, delete it, and delete the test as well: the test can't possibly ever be true, because no one would deliberately invoke undefined behavior, would they? If this kind of logic seems perverse, please read the LLVM group's discourse on undefined behavior and optimization (part 2, part 3).

There are better ways to achieve this goal. Many compilers nowadays have an intrinsic (e.g. gcc, clang: `__builtin_trap()`) that generates a machine instruction that causes a hardware fault (typically delivered as `SIGILL` on x86; the exact mechanism is target-dependent); unlike undefined tricks with pointers, the compiler won't optimize that out. If your compiler doesn't have that, but does have assembly inserts, you can manually insert such an instruction; this is probably low-level enough code that the additional bit of machine dependence isn't a big deal. Or, you could just call `_exit`. This is arguably the safest way to play it, because it doesn't risk running signal handlers, and it involves no function returns even internally. But it does mean you don't get a core dump.
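A minimal sketch of the intrinsic-based approach, wrapped in a macro so the fault is raised inside the calling function. The names `CRASH_HERE`, `struct rect`, `check_rect`, and `signal_for` are hypothetical, the field types are guessed from the question's `r.wid*r.ht < tot`, and the fallback to `abort()` on other compilers is an assumption made for this example rather than a recommendation from the answer. The `fork`-based helper is there only to observe the outcome on a POSIX system.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical macro: trap in place on GCC/Clang, else fall back to abort(). */
#if defined(__GNUC__) || defined(__clang__)
#  define CRASH_HERE() __builtin_trap()
#else
#  define CRASH_HERE() abort()
#endif

/* Hypothetical stand-in for the question's `r`; the real types are unknown. */
struct rect {
    long wid;
    long ht;
};

/* Same check as in the question, with the null write replaced by the macro. */
void check_rect(const struct rect *r, long tot)
{
    if (r->wid * r->ht < tot)
        CRASH_HERE();
}

/* Runs the check in a child process (hypothetical helper).
   Returns the signal that killed the child, 0 if it exited normally,
   or -1 if fork/waitpid failed. */
int signal_for(long wid, long ht, long tot)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        struct rect r = { wid, ht };
        check_rect(&r, tot);
        _exit(0); /* reached only if the check passed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With this sketch, `signal_for(2, 3, 10)` should report a fatal signal (which one depends on the target), and `signal_for(4, 5, 10)` should return 0 because the area is large enough.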

zwol
  • To be completely standard, see [my comment on another answer](https://stackoverflow.com/questions/21376602/what-is-the-function-of-this-statement-long0-0#comment32254516_21376628). The standard way of crashing with `*(long *)0 = 0` would actually be `*(long *)(void *)0 = 0`. – Shahbaz Jan 27 '14 at 16:30
  • 2
    @Shahbaz Sorry, no, the intermediate `void *` makes no difference whatsoever. You are mistaken on two counts: first, `0` all by itself is a valid null pointer constant, therefore `(long *)0` and `(long *)(void *)0` are *both* null pointers. Second, dereferencing a null pointer provokes undefined behavior no matter how you constructed the null pointer. – zwol Jan 27 '14 at 16:39
  • 0 is a valid NULL pointer if it's not cast to any pointer type. Otherwise the standard didn't have to single out the cast to `void *`. Let's take an imaginary compiler where NULL has the value `0xFFFF`. If the compiler sees 0 assigned to a pointer, it will use `0xFFFF` instead because that's the NULL pointer. If it sees `(void *)0` it would also use `0xFFFF` because that's the NULL pointer. But it should use the exact value of `0x0000` in `(long *)0` because that's _not_ the NULL pointer. – Shahbaz Jan 27 '14 at 16:43
  • @Shahbaz I see where you got that impression, but no, that is not how the language works. Converting an integer constant expression whose value is zero (it does not have to be literally `0`) to **any** pointer type, by **any** means, always produces the null pointer of that type (technically each type could use a different bit representation). I don't have the standard on this computer, so I can't quote it at you, but if you have a copy, reread the description of the semantics of cast expressions very carefully. (1/2) – zwol Jan 27 '14 at 16:46
  • @Shahbaz The reason for the "or such an expression cast to `void *`" language in the standard is that `(void *)expr` is not an integer constant expression, so it would normally not be allowed in constructs where an IC-expr is required (e.g. some kinds of static initializers). The committee wanted to make sure you could use the `NULL` macro anywhere an IC-expr was allowed, even if the standard headers happened to define it as `((void *)0)`. (2/2) – zwol Jan 27 '14 at 16:50
  • 2
    @Shahbaz The standard agrees with Zack: "An integer constant expression with the value 0, **or** such an expression cast to type void *, is called a null pointer constant." Then (parens mine): "If a null pointer constant (`0`, as above) is converted to a pointer type (`long*`), the resulting pointer, called a null pointer..." -- N1570 §6.3.2.3 – tab Jan 27 '14 at 16:50
  • @Zack, the standard could have easily said cast to a pointer type instead of `void *`. @tab, that sounds convincing. However, I'm still not quite sure then what's the purpose of the phrase that says cast to `void *`, since what you say covers that already. I've asked [a question](http://stackoverflow.com/q/21386995/912144) you may want to answer more completely to! – Shahbaz Jan 27 '14 at 17:05
6

To cause a program to 'exit abnormally', use the abort() function (http://pubs.opengroup.org/onlinepubs/9699919799/functions/abort.html).

The standard C/C++ idiom for "if condition X is not true, make the program exit abnormally" is the assert() macro. The code above would be better written:

assert( !(r.wid*r.ht < tot) );

or (if you're happy to ignore edge cases), it reads more cleanly as:

assert( r.wid*r.ht >= tot );
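For context, a self-contained sketch of the `assert` form might look like the following. The struct layout, the field types, and the function name `checked_area` are assumptions for illustration; the question does not show how `r` or `tot` are declared. Note that defining `NDEBUG` before including `<assert.h>` compiles the check out entirely.

```c
#include <assert.h>

/* Hypothetical stand-in for the question's `r`; the real types are unknown. */
struct rect {
    long wid;
    long ht;
};

/* Returns the area after asserting that it covers at least `tot`.
   If the assertion fails (and NDEBUG is not defined), assert() prints
   the failed expression with file and line, then calls abort(). */
long checked_area(const struct rect *r, long tot)
{
    assert(r->wid * r->ht >= tot);
    return r->wid * r->ht;
}
```

For example, with `struct rect r = {4, 5};`, calling `checked_area(&r, 20)` returns 20, while `checked_area(&r, 21)` would abort in a build where assertions are enabled.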
Ian Harvey
3

If width times height of r is less than total, crash the program.

cybermage14
  • 168
  • 1
  • 9
    It's an answer; it's just not as useful an answer as you'd like. But that's no reason to say it's a non-answer. – Alice Jan 27 '14 at 13:45
  • That's the intent of the program, but not necessarily what it actually does. – CodesInChaos Jan 27 '14 at 14:46
  • 2
    The OP asked what was the purpose, the only way to know this is by guessing what the intention of the original author was - I think this answer sums that up quite nicely. – paulm Jan 27 '14 at 18:57