C and C++ compilers (clang, gcc, etc.) seem to produce different output for this program depending on the optimization level. You can check this with the online link below.

http://cpp.sh/5vrmv (change output from none to -O3 to see the differences).

Based on the following piece of code, could someone answer a few questions I have:

#include <stdio.h>
#include <stdlib.h>

int main(void) {

    int *p = (int *)malloc(sizeof(int));
    free(p);
    int *q = (int *)malloc(sizeof(int));
    if (p == q) {
        *p = 10;
        *q = 14;
        printf("%d", *p);
    }
    return 0;
}
  1. Is it certain that the execution will always get into the if statement? How do we know the addresses of the two pointers, p and q, will be the same?
  2. Why does no optimization give output 14, while -O3 gives output 10 for the same instructions?
Christian Hackl
Dave5545
  • At least here `*p = 10;` you have undefined behavior. And no, it's not guaranteed that the freed address is reused by `malloc()`. – πάντα ῥεῖ Mar 02 '16 at 10:46
  • Right. Well, ok. It is undefined behavior, but why is it consistent between the two different optimization levels? – Dave5545 Mar 02 '16 at 10:51
  • _"but why is it consistent between the two different optimization levels? "_ Undefined behavior means **anything can happen**, including your observations. – πάντα ῥεῖ Mar 02 '16 at 10:55
  • Nothing "undefined" about `*p = 10` -- `p` is equal to `q` at this point. – Mikhail T. Mar 02 '16 at 10:59
  • Please, *do not* cast the result of `malloc` (nor `calloc`) in C code. The functions return `void *`, which requires no casting in the language. C++ is different, but in C this is redundant and irksome. – Mikhail T. Mar 02 '16 at 11:02
  • @MikhailT. Whether or not `p==q` is irrelevant; `p` is no longer a valid pointer and the compiler is allowed to assume for optimization purposes that you do not use it in an undefined manner. – TartanLlama Mar 02 '16 at 11:05
  • I've re-added the C++ tag originally used by the author because the code is legal C++, differences between C and C++ may in theory be relevant, and the C++ standard has a lot to say about this topic. – Christian Hackl Mar 02 '16 at 11:05
  • @TartanLlama, if `p` is the same as `q`, then the two pointers are valid or invalid at the same time. So `p==q` is quite relevant. The assumptions you are referring to are invalid. – Mikhail T. Mar 02 '16 at 11:08
  • P.S.: And I'm giving the author the benefit of the doubt that he knows that these are two different languages :) – Christian Hackl Mar 02 '16 at 11:08
  • @MikhailT.: The result of `==` is implementation-defined. Nothing stops an implementation from returning `true` and then still treating the two as different. – Christian Hackl Mar 02 '16 at 11:09
  • 2
    @MikhailT. There's even a note in the standard which explicitly addresses this: `[basic.stc.dynamic.safety]:` "[Note: the effect of using an invalid pointer value (including passing it to a deallocation function) is undefined. This is true even if the unsafely-derived pointer value might compare equal to some safely-derived pointer value. —end note ]" – TartanLlama Mar 02 '16 at 11:14
  • @TartanLlama: very interesting. (Yes, I'm aware these are two different languages, but their compilers behave in the same way, every time.) Note that we are not deallocating a dangling pointer, we are dereferencing it with the indirection operator. – Dave5545 Mar 02 '16 at 11:23
  • Doesn't `if (p == q)` itself invoke UB? I mean, you access `p` which currently points to a memory location you aren't supposed to tamper with. NOTE: I'm talking about C here. – Spikatrix Mar 02 '16 at 11:30
  • Please see: https://stackoverflow.com/questions/26704344/why-does-misra-c-state-that-a-copy-of-pointers-can-cause-a-memory-exception/26704433#26704433 In that case the pointer is copied, not compared, but that doesn't matter because in both cases the object's value is accessed. The conclusion is the same, undefined behavior. – 2501 Mar 02 '16 at 11:31
  • @2501 My reading of that passage is that the lifetime of the pointee has ended, so referring to that would be undefined behaviour, but the pointer merely has an indeterminate value, which isn't necessarily undefined behaviour to access. Is that not correct? – TartanLlama Mar 02 '16 at 11:36
  • @TartanLlama No, because an indeterminate value may be a trap representation and accessing it is undefined behavior. (This is true for C) – 2501 Mar 02 '16 at 11:43
  • @2501 Got it, thanks! – TartanLlama Mar 02 '16 at 11:44
  • @CoolGuy It does. @TartanLlama No problem, note that my comment is only true for C, I don't know what the situation is in C++. – 2501 Mar 02 '16 at 11:47

3 Answers

4
free(p);

This turns the contents of p into an invalid pointer value.

int *q = (int *)malloc(sizeof(int));

This line has no relevance for p.

if (p == q) {

This is implementation-defined behaviour because p has an invalid pointer value.

    *p = 10;

And finally, this is undefined behaviour, for the same reason as above.

C++ standard §3.7.4.2/4:

If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. Indirection through an invalid pointer value and passing an invalid pointer value to a deallocation function have undefined behavior. Any other use of an invalid pointer value has implementation-defined behavior.

Therefore, the answers to your questions are:

Is it certain that the execution will always get into the if statement?

It depends on the implementation. The C++ language does not guarantee it.

Why does no optimization give output 14, while -O3 gives output 10 for the same instructions?

Because the behaviour is undefined when you dereference an invalid pointer.


In C, the comparison itself is undefined behaviour. Annex J.2 of the C standard lists the circumstances in which the behaviour is undefined, and that list includes:

The value of a pointer to an object whose lifetime has ended is used.
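
As a hedged sketch (not part of the original post): one way to ask whether malloc handed back the same address, without ever reading p after free(p), is to record the address as an integer while p is still valid. The function name block_was_reused is hypothetical, and uintptr_t is an optional type from <stdint.h> that is assumed to be available here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if the second malloc returned the same
 * address as the first, 0 if it did not, and -1 if an allocation failed. */
int block_was_reused(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL) {
        return -1;
    }

    /* Record the address as an integer while p is still a valid pointer. */
    uintptr_t old_addr = (uintptr_t)(void *)p;
    free(p);
    /* From here on p is never read again; only old_addr is used. */

    int *q = malloc(sizeof *q);
    if (q == NULL) {
        return -1;
    }

    /* Integer comparison: no invalid pointer value is involved. */
    int reused = ((uintptr_t)(void *)q == old_addr);

    free(q);
    return reused;
}
```

Whether this returns 1 or 0 depends on the allocator; the language guarantees neither result.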

You may find the following question including all comments and answers interesting: Undefined, unspecified and implementation-defined behavior

Christian Hackl
  • Comparison is undefined behavior in C. Please see: https://stackoverflow.com/questions/26704344/why-does-misra-c-state-that-a-copy-of-pointers-can-cause-a-memory-exception/26704433#26704433 – 2501 Mar 02 '16 at 11:32
  • @2501 just to clarify: a (rendered) invalid pointer has an "indeterminate value", right? And accessing an indeterminate value always leads to undefined behaviour? – Dave5545 Mar 02 '16 at 11:57
  • @2501: I've added something for C. – Christian Hackl Mar 02 '16 at 11:59
  • @ChristianHackl how would you describe the *value* of an invalid pointer? indeterminate or unspecified? – Dave5545 Mar 02 '16 at 12:11
  • @Dave5545 In this case the undefined behavior takes effect at compile time. The compiler is allowed, by the definition of UB, to default the if statement to false. – 2501 Mar 02 '16 at 12:12
  • @2501 Right, thanks. So, the same question I asked Christian: after a pointer is freed, is its value indeterminate or unspecified? (Apparently, the first one is a superset.) – Dave5545 Mar 02 '16 at 12:33
0

Is it certain that the execution will always get into the if statement? How do we know the addresses of the two pointers, p and q, will be the same?

This is implementation-defined; you cannot rely on this behaviour. p and q can indeed be equal: you have deallocated the memory pointed to by p, so q might get the same address as p.

Why does no optimization give output 14, while -O3 gives output 10 for the same instructions?

This is how the optimizer works. Here is your version:

https://goo.gl/yRfjIv

where the compiler optimizes out the assignment of 14. And here is a version where the output looks correct:

https://goo.gl/vSVV0E

Here the value 14 is assigned, and I have added only one line: p = q;

I am not sure exactly why it works like that. I would say that the compiler assumes your code is free of undefined behaviour and optimizes under that assumption.
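
The linked code is not reproduced in this post, so the following is only a sketch of the idea behind the added p = q; line, under the assumption that re-deriving p from q is what gives the program defined behaviour. The function name is hypothetical, and the comparison is dropped so that the stale value of p is never read.

```c
#include <stdlib.h>

/* Hypothetical helper: returns the value read back through p after p has
 * been re-derived from q, or -1 if an allocation failed. */
int write_through_rederived_pointer(void) {
    int *p = malloc(sizeof *p);
    if (p == NULL) {
        return -1;
    }
    free(p);              /* p now holds an invalid pointer value */

    int *q = malloc(sizeof *q);
    if (q == NULL) {
        return -1;
    }

    p = q;                /* overwrite p; its old value is never read */

    *p = 10;              /* p and q are valid and refer to the same int */
    *q = 14;              /* so this store must be visible through p */

    int result = *p;
    free(q);
    return result;
}
```

Because p and q are now both valid pointers to the same object, a conforming compiler has to make the read through p see the last store, so the function returns 14 at every optimization level (barring allocation failure).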

[edit]

The undefined behaviour is caused by using a pointer value that the compiler assumes is no longer valid; it does not matter whether it later compares equal to the address of some newly allocated memory block. The relevant standard quote was given by TartanLlama:

[basic.stc.dynamic.safety]

[ Note: the effect of using an invalid pointer value (including passing it to a deallocation function) is undefined, see 3.7.4.2. This is true even if the unsafely-derived pointer value might compare equal to some safely-derived pointer value. —end note ]

marcinj
  • @Dave5545 Because the standard allows it to; it probably makes some optimizations easier to implement. – marcinj Mar 02 '16 at 11:25
  • So, I guess, dangling pointers can validate if statements, but still cause undefined behaviour when dereferenced at any point. – Dave5545 Mar 02 '16 at 11:31
  • Comparison is undefined behavior in C. Please see: https://stackoverflow.com/questions/26704344/why-does-misra-c-state-that-a-copy-of-pointers-can-cause-a-memory-exception/26704433#26704433 – 2501 Mar 02 '16 at 11:32
  • @2501 The standard says this about it: [basic.stc.dynamic.safety]: `38) Some implementations might define that copying an invalid pointer value causes a system-generated runtime fault.` – marcinj Mar 02 '16 at 11:38
  • @marcinj That quote is not from C (I'm pretty sure). See the tags on the link and my comment. This question is (sadly) tagged with both. – 2501 Mar 02 '16 at 11:42
  • @2501 I am pretty sure it was originally tagged with C++. This quote is from the C++ N4140 draft. – marcinj Mar 02 '16 at 11:57
-3

The if-condition may be false -- depending on the particular implementation of malloc() it may return the just-freed block for reuse or a different one.

But, if the program prints anything (because it so happened that q equals p), it must print 14. A compiler producing anything else is buggy...

Using clang 3.4.1 and 3.6.2 here I consistently get the right answer, whereas both gcc 4.2.1 and 5.3.0 manifest the bug. Unfortunately, so does clang 3.8.0.

Mikhail T.