I wrote the following piece of code to test whether I could modify the constant c through a pointer a. I tested it with Clang, VC, and GCC. With both VC and GCC the code works as I would expect and prints 6 to standard output, but when I compile it with Clang the value is not modified and 5 is printed.

#include <stdio.h>

int main(void) {
  const int c = 5;
  int* a  = (int*) &c;
  *a = 6;
  printf("%d", c);
  return 0;
}

I am wondering whether there is a well-known explanation for this, or whether it comes down to compiler internals and other things that would be hard to analyze. Thanks, everyone, in advance!

2 Answers

Yes. It is called undefined behaviour. C11 6.7.3p6:

  If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.

With undefined behaviour explained as:

  1. undefined behavior

    behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements

  2. NOTE: Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).

(emphasis mine)


Compiling without optimization enabled is like buying a Ferrari and only ever driving it in first gear. What happens if you use GCC and compile with -O3?

With -O3 my GCC produces code that is equivalent to

#include <stdio.h>

int main(void) {
    printf("%d", 5);
    return 0;
}

How about this program then:

#include <stdio.h>

int main(void) {
    const char *foo = "Hello world!";
    char *bar = (char*)foo;
    bar[0] = 'C';
    printf("%s\n", foo);
}

With my GCC and -O3 it crashes. But if you use Clang, it will print Hello world!, and if you look at the assembly, Clang figured out that the program can be optimized to

 int main(void) {
     puts("Hello world!");
 }
  • I was hoping that there could be an underlying low-level process that could explain it. I should probably have stated that in the question. Thanks a lot for your answer! – Đumić Branislav Mar 03 '21 at 19:39
  • But to be fair, how many C programmers read (or have ever read) the C Standard? – Andrew Mar 03 '21 at 19:42
  • @ĐumićBranislav that is the wrong attitude. My answer is right. There is no "low level" process. – Antti Haapala -- Слава Україні Mar 03 '21 at 19:43
  • @Andrew It took me about 15 years of considering myself a C programmer to acknowledge its existence. Since then I consider that I was very wrong about my C knowledge assumptions. – Eugene Sh. Mar 03 '21 at 19:48
  • Re “There is no "low level" process”: Ha ha ha ha ha ha. That’s right, it is magic; the compiled program appears out of nowhere. There is no process that generates it. And if there is a process that generates it, it is formless, without design. Then the program output magically appears too. It is not executed on a processor that has low-level instructions that manipulate physical entities, nor is there an operating system with any sort of design or consequential behavior. – Eric Postpischil Mar 03 '21 at 19:54
  • Well sure there *is* a process. But I argue that one can be a good C programmer without knowing the details of the process. The less you think you know the less you're going to assume any particular behaviour. – Antti Haapala -- Слава Україні Mar 03 '21 at 19:59
  • Ah, yes, ignorance as a strategy for success. Brilliant. Without knowing how compilers and operating systems work, one can be a mediocre programmer. The more one knows, the better and faster one can diagnose faults, comprehend behaviors, think of alternative approaches, and more. In any case, when a person asks for information, they should be given it, not told it does not exist. – Eric Postpischil Mar 03 '21 at 20:08
  • What is the best way to make someone do something? Tell them they are not allowed to do that! – Antti Haapala -- Слава Україні Mar 03 '21 at 20:17

Common possibilities with this code include:

  • The compiler sees that const int c = 5; means the only defined value of c is 5 and therefore generates code for printf("%d", c); that prints “5” without loading the value of c from memory or checking anything else about c.
  • The compiler puts c on the stack. Then int* a = (int*) &c; takes the address of c. For *a = 6;, the compiler generates code that writes 6 to the location that a points to, which is where c is stored. Due to how the stack is used, it is not practical to make it or parts of it read-only, so the processor does not generate any exception for this code that writes 6 to a location that is defined const in C. Then the printf("%d", c); fetches the value of c from memory, gets 6, and prints “6”.

None of this is defined by the C standard of course, but these behaviors arise as consequences of how compilers, hardware, and operating systems are designed.

You may be more likely to see the former behavior with high levels of optimization and the latter with optimization turned off, but it can vary.

Eric Postpischil
  • But how come the compiler puts c (= 5) on the stack before the print statement is reached? Machine code should follow the same structure as the source code, and given that the only branching is the jump to the print function, shouldn't the stack be untouched before the function call? – Đumić Branislav Mar 03 '21 at 19:44
  • @ĐumićBranislav most certainly **not**. Please read about the [as-if rule](https://stackoverflow.com/a/46455917/918959) – Antti Haapala -- Слава Україні Mar 03 '21 at 19:45
  • @ĐumićBranislav: Re “Machine code should follow the same structure as the source code”: There is no such rule. The only rule the C standard imposes on a C implementation is that it must produce the defined observable behavior of a program. It can generate any code it wants to that accomplishes that. – Eric Postpischil Mar 03 '21 at 19:45
  • @AnttiHaapala So, to be sure I understood this: what happens is, in essence, a consequence of the compiler trying to optimize the code. Because of this, *c* is pushed right onto the stack, because the "const" modifier implies that it won't ever be modified? – Đumić Branislav Mar 03 '21 at 19:48
  • @ĐumićBranislav: Re “But how come the compiler puts c ( = 5) to the stack before the print statement is reached”: After the compiler has seen `const int c = 5;`, it is free to use 5 for `c` even if you attempt to change it. That is because your attempt to change it is not defined, so the compiler may ignore it. After `const int c = 5;`, when the compiler sees `printf("%d", c);`, it does not have to call `printf`; it may call `putchar('5');`, because that has the same observable behavior. One can argue a compiler should do that when optimizing, because `putchar` is cheaper than `printf`. – Eric Postpischil Mar 03 '21 at 19:48
  • @EricPostpischil That did not cross my mind. It actually makes a lot of sense. If I think about it, C compiler could optimize repeating code with branching even though there are no jumps in the initial code. Thanks a lot! – Đumić Branislav Mar 03 '21 at 19:49