17

In C it's possible to change a const object using pointers, like so:

//mainc.c
#include <stdio.h>

int main(int argc, char** argv) {
    const int i = 5;
    const int *cpi = &i;

    printf("  5:\n");
    printf("%d\n", &i);
    printf("%d\n", i);
    printf("%d\n", cpi);    
    printf("%d\n", *cpi);   

    *((int*)cpi) = 8;
    printf("  8?:\n");
    printf("%d\n", &i);
    printf("%d\n", i);
    printf("%d\n", cpi);
    printf("%d\n", *cpi);
}

The constant is changed, as can be seen in the output: [image: mainc output]

If we try the same in C++:

//main.cpp
#include <iostream>

using std::cout;
using std::endl;

int main(int argc, char** argv) {
    const int i = 5;
    const int *cpi = &i;

    cout << "  5:" << '\n';
    cout << &i << '\n';
    cout << i << '\n';
    cout << cpi << '\n';    
    cout << *cpi << '\n';   

    *((int*)cpi) = 8;
    cout << "  8?:" << '\n';
    cout << &i << '\n';
    cout << i << '\n';
    cout << cpi << '\n';
    cout << *cpi << '\n';

    int* addr = (int*)0x28ff24;
    cout << *addr << '\n';
}

The result is not so clear: [image: main output]

From the output it looks like i is still 5 and is still located at 0x28ff24, so the const is unchanged. But at the same time cpi is also 0x28ff24 (the same as &i), yet the value it points to is 8 (not 5).

Can someone please explain what kind of magic is happening here?

Explained here: https://stackoverflow.com/a/41098196/2277240

grabantot
  • Undefined behaviour, I suppose. – Edgar Rokjān Dec 12 '16 at 08:25
  • While it is undefined behavior from the point of view of the language we can guess what happened in g++. I think that when you use `i` directly the compiler is using value 5 instead. At places where you access the variable through pointer, you get the modified value. However, on some other architecture, constants can be placed in a non-writable memory and an attempt to write there can be ignored by hardware or cause other failure. – Marian Dec 12 '16 at 08:45
  • I swear there exists a canonical duplicate for this FAQ, but I can't find it. – Lundin Dec 12 '16 at 08:47
  • Incidentally, `printf("%d\n", &i);` and `printf("%d\n", cpi);` both invoke undefined behavior, because you use the wrong format specifier. –  Dec 12 '16 at 09:51
  • @Lundin This one? [Why am I able to change the contents of const char \*ptr?](http://stackoverflow.com/questions/3228664/why-am-i-able-to-change-the-contents-of-const-char-ptr) – Dmitry Grigoryev Dec 12 '16 at 10:32
  • If you write `const int i = 5; const int j = 5`, I can imagine a clever compiler trying to reuse the same memory address for both. Just for fun: if you write `const char c = 'A'`, I can also imagine a *very* clever compiler looking for the byte `65` in memory it has already reserved for other purposes, and reusing its address. Potentially even in a page containing assembly code, on an architecture that doesn't require code/data pages separation. In this case changing a const could be catastrophic. – Federico Poloni Dec 12 '16 at 10:38
  • Read the "Even more on Undefined Behavior" part of this answer: http://stackoverflow.com/a/34842262/2805305 – bolov Dec 12 '16 at 11:38
  • What is your optimization level while compiling this code? The compiler may put the immediate value 5 instead of the i variable whenever it is read, because it is a const. You should generate the ASM output and observe the listing of this function! – Malkocoglu Dec 12 '16 at 11:44
  • To me it seems quite clear that what happened is that the `i` used in the print statement after the assignment of 8 was substituted with the initial value 5. That makes sense since you're promising that `i` is a constant. – kalj Dec 13 '16 at 16:28

7 Answers

34

The behaviour on casting away const from a variable (even via a pointer or a reference in C++) that was originally declared as const, and then subsequently attempting to change the variable through that pointer or reference, is undefined.

So changing i if it's declared as const int i = 5; is undefined behaviour: the output you are observing is a manifestation of that.

Bathsheba
  • Then either `&i` or `*cpi` must be lying and it doesn't look like it's the latter one. Should I never trust the address of a const? Is it some dummy address? – grabantot Dec 12 '16 at 08:30
  • To be extra-pedantic, casting away `const` is fine, but actually modifying the object is UB. This case can arise in some generic code. – Quentin Dec 12 '16 at 08:31
  • @Quentin: That's a good thing to point out. Thank you. – Bathsheba Dec 12 '16 at 08:34
  • @grabantot: If your program invokes undefined behavior, then you can't trust *anything at all*. You can't even trust things you think happened 'before' undefined behavior is invoked! –  Dec 12 '16 at 09:47
  • @grabantot Undefined behaviour means that literally anything can happen. The compiler doesn't even have to be consistent on its definition of "anything"! Not just in that line of the code either, any program which invokes undefined behaviour once will poison the entire run. The rationale behind this is to allow implementations to do whatever is "cheapest" in cases that shouldn't ever need to happen, and so to allow for optimisations. Never modify a variable originally declared as const! – Muzer Dec 12 '16 at 10:14
  • @grabantot see [what is undefined behaviour?](http://stackoverflow.com/a/4105123/1505939) – M.M Dec 12 '16 at 14:42
15

It is undefined behavior as per C11 6.7.3/6:

If an attempt is made to modify an object defined with a const-qualified type through use of an lvalue with non-const-qualified type, the behavior is undefined.

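(C++ has similar normative text: [dcl.type.cv] states that any attempt to modify a const object during its lifetime results in undefined behavior.)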

And since it is undefined behavior, anything can happen, including weird output, program crashes, or "seems to work fine" (in this build).

Lundin
4

The rule of const_cast<Type *>() or a C-style cast (Type *):
The conversion removes const from the pointer's declared type, NOT from the value (object) itself.

const Type i = 1;
// p is a variable, i is an object
const Type * p = &i; // i is const --- const is a property of i, you can't remove it
(Type *)p; // removes the const from p's type, not the const of i ---- the result points to non-const, but i is ALWAYS const!

Now if you try to change the value of i through p, it's Undefined Behavior because i is ALWAYS const.

When to use this kind of conversion?
1) If you can make sure that the pointed-to value is NOT const.
e.g.

int j = 1;
const int *p = &j;
*(int *)p = 2; // You can change the value of j because j is NOT const

2) The pointed-to value is const, but you ONLY read it and NEVER change it.

If you really need to change a const value, please redesign your code to avoid this kind of case.
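
Both cases can be sketched in C++ as follows. This is only a minimal illustration with hypothetical function names: the first function writes through a cast pointer whose target was never declared const, and the second only reads through the cast pointer, so neither one modifies a const object.

```cpp
#include <cassert>

// Case 1 (hypothetical name): the pointed-to object is NOT const,
// so writing through the cast pointer is well-defined.
int write_to_non_const_target() {
    int j = 1;                      // j itself is modifiable
    const int* p = &j;              // p only promises not to write via p
    *const_cast<int*>(p) = 2;       // fine: the underlying object is not const
    assert(*p == 2);
    return j;
}

// Case 2 (hypothetical name): the pointed-to object IS const,
// but the cast pointer is only used for reading.
int read_only_from_const_target() {
    const int i = 1;
    int* q = const_cast<int*>(&i);  // the cast alone is allowed
    return *q;                      // read only; writing *q would be undefined
}
```

Calling write_to_non_const_target() returns 2 and read_only_from_const_target() returns 1.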

Yves
  • @M.M Thanks. In fact I just copy the sentences from The C++ Programming Language but my example is not suitable. I'll reedit my answer. – Yves Dec 12 '16 at 15:00
4

So after some thinking I guess I know what happens here, though it is architecture/implementation dependent since it is undefined behaviour, as Marian pointed out. My setup is MinGW 5.x 32-bit on Windows 7 64-bit, in case someone is interested.

C++ consts act like #defines: g++ replaces all references to i with its value in the compiled code (since i is a const), but it also writes 5 (the value of i) to some address in memory to provide access to i via a pointer (a dummy pointer), and replaces all occurrences of &i with that address (it's not exactly the compiler that does this, but you know what I mean).

In C, consts are treated mostly like usual variables, the only difference being that the compiler doesn't allow you to change them directly.

That's why Bjarne Stroustrup says in his book that you don't need #defines in C++.
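
One well-defined way to see why a C++ compiler is allowed to substitute the value: a const int initialized with a constant expression is itself usable in constant expressions, so its value is known during translation. The sketch below is an illustration with a hypothetical helper name; it shows what the language permits, not what g++ actually emitted for the program in the question. (In C, a const int is not an integer constant expression.)

```cpp
#include <array>
#include <cassert>
#include <cstddef>

const int i = 5;  // const integer initialized with a constant expression

// The value of i is available at compile time.
static_assert(i == 5, "i is usable in a constant expression");

// Hypothetical helper: uses i as a template argument (an array bound).
std::size_t array_size_from_const() {
    std::array<int, i> a{};
    assert(a.size() == 5);
    return a.size();
}
```

Here array_size_from_const() returns 5 without the program ever needing to read i from memory.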

Here comes the proof: [image]

grabantot
  • What exactly happens here isn't really important, the important part is it being undefined behaviour, in C++ and in C. Different compiler version, different operating system, those can all break the code and it's not portable. It could even break randomly on exactly the same system. – AliciaBytes Dec 12 '16 at 10:14
  • @RaphaelMiedl Maybe you're right but I wanted to get to the bottom of it and it was a bit disappointing to hear everyone say "it's undefined behaviour, just forget it". – grabantot Dec 12 '16 at 11:17
  • @RaphaelMiedl I find it very interesting and insightful. – Peter - Reinstate Monica Dec 12 '16 at 11:49
  • It is more important to understand "This is how it behaves" is an incorrect statement. "This is how it did behave once" is all you get. – Caleth Dec 12 '16 at 13:58
2

It's a violation of the strict aliasing rule (the compiler assumes that two pointers of different types never reference the same memory location) combined with compiler optimization (the compiler does not perform the second memory access to read i but reuses the value it already knows).

EDIT (as suggested inside the comments):

From the working draft of the ISO C++ standard (N3376):

"If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined [...] — a cv-qualified version of the dynamic type of the object, [...] — a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object, [...] — a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,"

As far as I understand, it specifies that a possibly cv-qualified type can be used as an alias, but not that a non-cv-qualified type can alias a cv-qualified one.

tsp
  • The strict aliasing rule refers to variables of different basic types. This problem does not have anything to do with strict aliasing. – Ralph Tandetzky Dec 12 '16 at 08:36
  • As far as I understand (3.10), it's exactly strict aliasing. As far as I understand the text there, it specifies that a possibly cv-qualified type can be used as an alias, but not that a non-cv-qualified type can alias a cv-qualified one. – tsp Dec 12 '16 at 08:47
  • Sorry the quote is missing: "If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined [...] — a cv-qualified version of the dynamic type of the object, [...] — a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object, [...] — a type that is a (possibly cv-qualified) base class type of the dynamic type of the object," – tsp Dec 12 '16 at 08:48
  • I'm inclined to agree, although the difficulty in answering this question comprehensively is not helped by the fact that both C and C++ need to be considered. Perhaps @tsp, you could start absorbing some of your answer comments into the answer itself. – Bathsheba Dec 12 '16 at 08:54
  • I had a more careful study of the standard and I'm realizing that I was wrong. I'm very sorry for that. :( – Ralph Tandetzky Dec 12 '16 at 09:09
  • @tsp part of the confusion is your reference of (3.10) which may be fine for C++, but for C you are dealing with paragraphs 6.5/6 and 6.5/7, and from the C standpoint, there isn't a violation as there is no other access of the pointer other than through its *declared type*. (ignoring `const` issue and the improper use of `%d` to print the pointer address) – David C. Rankin Dec 12 '16 at 09:17
  • I've made a trivial edit to allow downvoters opportunity to retract. – Bathsheba Dec 12 '16 at 09:25
  • It's permitted to use a non-const lvalue to read (but not write!) a const object, this has always been true in all versions of C and C++. The wording in the standard is not as clear as it could be. Also the wording has changed in more recent versions. – M.M Dec 12 '16 at 14:37
2

It would be more fruitful to ask what one specific compiler with certain flags set does with that code than what “C” or “C++” does, because neither C nor C++ will do anything consistently with code like that. It’s undefined behavior. Anything could happen.

It would, for example, be entirely legal to stick const variables in a read-only page of memory that will cause a hardware fault if the program attempts to write to it. Or to fail silently if you try writing to it. Or to turn a dereferenced int* cast from a const int* into a temporary copy that can be modified without affecting the original. Or to modify every reference to that variable after the reassignment. Or to refactor the code on the assumption that a const variable can’t change so that the operations happen in a different order, and you end up modifying the variable before you think you did or not modifying it after. Or to make i an alias for other references to the constant 1 and modify those, too, elsewhere in the program. Or to break a program invariant that makes the program bug out in totally unpredictable ways. Or to print an error message and stop compiling if it catches a bug like that. Or for the behavior to depend on the phase of the moon. Or anything else.

There are combinations of compilers and flags and targets that will do those things, with the possible exception of the phase-of-the-moon bug. The funniest variant I’ve heard of, though, is that in some versions of Fortran, you could set the constant 1 equal to -1, and all loops would run backwards.

Writing production code like this is a terrible idea, because your compiler almost certainly makes no guarantees what this code will do in your next build.
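
If a value really does need to change while most code sees it as read-only, a well-defined alternative is to keep the object itself non-const and hand out const access to it. A minimal sketch, assuming a hypothetical Setting type that does not appear in the question:

```cpp
#include <cassert>

// Hypothetical type: the storage is non-const; readers get a const view.
struct Setting {
    int value = 5;
    const int& view() const { return value; }  // read-only access path
    void set(int v) { value = v; }              // the one place that writes
};

// Reads through a const reference after the object has been updated.
int read_after_update() {
    Setting s;
    const int& i = s.view();  // cannot write through i, but the object is not const
    s.set(8);                 // well-defined: s was never declared const
    assert(i == 8);           // the const view sees the new value
    return i;
}
```

Because the underlying object was never declared const, the compiler may not assume it keeps its initial value, and read_after_update() returns 8.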

Davislor
0

The short answer is that C++ 'const' declaration rules allow it to use the constant value directly in places where C would have to dereference the variable. I.e., C++ compiles the statement

cout << i << '\n';

as if what was actually written was

cout << 5 << '\n';

All of the other non-pointer values are the results of dereferencing pointers.

PMar