6

According to the C++ Standard, it's okay to cast away const from a pointer and write to the object if the object itself was not originally const. So this:

 const Type* object = new Type();
 const_cast<Type*>( object )->Modify();

is okay, but this:

 const Type object;
 const_cast<Type*>( &object )->Modify();

is UB.

The reasoning is that when the object itself is const the compiler is allowed to optimize accesses to it, for example, not perform repeated reads because repeated reads make no sense on an object that doesn't change.

The question is how would the compiler know which objects are actually const? For example, I have a function:

void function( const Type* object )
{
    const_cast<Type*>( object )->Modify();
}

and it is compiled into a static lib and the compiler has no idea for which objects it will be called.

Now the calling code can do this:

Type* object = new Type();
function( object );

and it will be fine, or it can do this:

const Type object;
function( &object );

and it will be undefined behavior.

How is the compiler supposed to adhere to such requirements? How is it supposed to make the former work without making the latter work?

sharptooth
  • Why do you make a promise if you intend to break it right away? `const` is a promise from the programmer to the compiler (and a contract that other programmers reusing the component agree on), no more and no less. The compiler _may or may not_ do something differently according to that promise, but that is circumstantial. Now, the thing is, if something is not constant, you should not give that promise in the first place. – Damon Dec 16 '11 at 10:28
  • @Damon: In real life one party writes the function, the other writes the calling code and they can't affect each other. – sharptooth Dec 16 '11 at 10:37
  • @Damon There are cases where you do keep the promise -- that is, the object is unchanged when the function ends -- but you make temporary changes to it during execution, for various reasons. – Paul Manta Dec 16 '11 at 10:46

5 Answers

6

You ask, "How is it supposed to make the former work without making the latter work?" An implementation is only required to make the former work. Unless it wants to help the programmer, it need not make any extra effort to make the latter fail in some particular way. Undefined behavior gives the implementation freedom, not an obligation.

Take a more concrete example. In this example, in f() the compiler may set up the return value to be 10 before it calls EvilMutate because cobj.member is const once cobj's constructor is complete and may not subsequently be written to. It cannot make the same assumption in g() even if only a const function is called. If EvilMutate attempts to mutate member when called on cobj in f() undefined behavior occurs and the implementation need not make any subsequent actions have any particular effect.

The compiler's ability to assume that a genuinely const object won't change is protected by the fact that doing so would cause undefined behavior; the fact that it does, doesn't impose additional requirements on the compiler, only on the programmer.

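struct Type {
    int member;
    void Mutate();
    void EvilMutate() const;  // declared const, defined elsewhere; it might cast away const and write to member
    Type() : member(10) {}
};


int f()
{
    const Type cobj;
    cobj.EvilMutate();   // undefined behavior if this writes to cobj.member
    return cobj.member;  // the compiler may return 10 without re-reading
}

int g()
{
    Type obj;
    obj.EvilMutate();    // well-defined even if it writes: obj is not const
    return obj.member;   // must be re-read after the call
}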
CB Bailey
3

The compiler can perform this optimization only on const objects, not on references/pointers to const objects (see this question). In your example, there is no way the compiler can optimize function, but it can optimize the code that uses a const Type. Since the compiler assumes this object is constant, modifying it (by calling function) can do anything, including crashing your program (for example, if the object is stored in read-only memory) or working like the non-const version (if the modification does not interfere with the optimizations).

The non-const version has no problem and is perfectly defined, you just modify a non-const object so everything is fine.
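To make the defined case concrete, here is a minimal sketch. The `value` counter inside `Type` and the helper `call_on_non_const_object` are made up for illustration; `function` mirrors the one in the question. Because `function` only sees a pointer to const, the compiler has to assume the pointee may change and re-read `object.value` after the call.

```cpp
#include <cassert>

// Hypothetical stand-in for the question's Type.
struct Type {
    int value = 0;
    void Modify() { ++value; }
};

// Same shape as the library function in the question.
void function(const Type* object)
{
    const_cast<Type*>(object)->Modify();
}

int call_on_non_const_object()
{
    Type object;                // not defined as const
    function(&object);          // well-defined: the object itself is modifiable
    assert(object.value == 1);  // the write is visible to the caller
    return object.value;
}
```

The same call on an object defined as `const Type` would compile just as happily, which is exactly why the burden is on the caller and not on the compiler.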

Luc Touraille
  • The compiler can optimize `function` if it inlines the call, or creates a separate definition that must only be called for objects defined as const. Both possibilities are becoming more and more likely, nowadays even if `function` is defined in a separate translation unit. –  Dec 16 '11 at 12:28
  • @hvd: you are right, I overlooked inlining since it is not really an optimization of `function` per se, but the possibility of having two versions of a function depending on the constness of the object given did not come to my mind and is very interesting. – Luc Touraille Dec 16 '11 at 13:40
2

If an object is declared const, an implementation is allowed to store it in such a way that attempts to modify it could cause hardware traps, without having any obligation to ensure any particular behavior for those traps. If one constructs a const pointer to such an object, recipients of that pointer will not generally be allowed to write it, and would thus be in no danger of triggering those hardware traps. If code casts away the const-ness and writes to the pointer, a compiler would be under no obligation to protect the programmer against any hardware oddities that might occur.

Further, in the event that a compiler can tell that a const object is always going to contain a particular sequence of bytes, it could inform the linker of that, and allow the linker to see if that sequence of bytes occurs anywhere in the code and, if so, regard the address of the const object as being the location of that sequence of bytes (complying with various restrictions about different objects having unique addresses might be a little tricky, but it would be permissible). If the compiler told the linker that a const char[4] was always supposed to contain a sequence of bytes that happened to appear within the compiled code for some function, a linker could assign to that variable the address within the code where that byte sequence appears. If the const was never written, such behavior would save four bytes, but writing to the const would arbitrarily change the meaning of the other code.

If writing to an object after casting away const was always UB, the ability to cast away const-ness wouldn't be very useful. As it is, the ability often plays a role in situations where a piece of code holds onto pointers--some of which are const and some of which will need to be written--for the benefit of other code. If casting away the const-ness of const pointers to non-const objects weren't defined behavior, the code which is holding the pointers would need to know which pointers are const and which ones will need to be written. Because const-casting is allowed, however, it is sufficient for the code holding the pointers to declare them all as const, and for code which knows that a pointer identifies a non-const object and wants to write it, to cast it to a non-const pointer.
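A small sketch of that pattern, under the assumption that the caller owns a non-const buffer. The helper names `find_exclamation` and `replace_exclamation` are invented for this example. The helper handles everything as pointer-to-const, and only the caller, which knows the buffer is writable, casts the result back.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: accepts and returns pointer-to-const, so it works
// for both const and non-const strings.
const char* find_exclamation(const char* s)
{
    return std::strchr(s, '!');
}

int replace_exclamation()
{
    char buffer[] = "Foo!";  // a non-const array owned by the caller
    const char* hit = find_exclamation(buffer);
    if (hit)
        *const_cast<char*>(hit) = '?';  // defined: buffer is not a const object
    assert(std::strcmp(buffer, "Foo?") == 0);
    return std::strcmp(buffer, "Foo?");
}
```

Had `buffer` been declared `const char buffer[]`, the write through the cast pointer would be undefined behavior even though the code would look identical.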

It might be helpful if C++ had forms of const (and volatile) qualifiers which could be used on pointers to instruct the compiler that it may (or, in the case of volatile, should) regard the pointer as identifying a const and/or volatile object even if the compiler knows what the object is, and knows that it isn't const and/or isn't declared volatile. The former would allow a compiler to assume that the object identified by a pointer wouldn't change during a pointer's lifetime, and cache data based upon that; the latter would allow for cases where a variable may need to support volatile accesses in some rare situations (typically at program startup) but where the compiler should be able to cache its value after that. I know of no proposals to add such features, though.

supercat
  • *"If writing to an object after casting away const was always UB, the ability to cast away const-ness wouldn't be very useful."* IIRC `const_cast` was introduced to deal with "legacy" APIs that are not const-correct; i.e. to deal with cases where a function does not modify the object pointed to, but doesn't take a `T const*` but a `T*`. (D&E uses `strchr` as an example) – dyp Jul 23 '15 at 16:25
  • @dyp: The `strchr` function is a nice example of something which handles pointers which might or might not be const for the benefit of other code which might or might not need to write to them. In the days before templates, it may have been worthwhile to have separate const- and non-const implementations for some very-frequently-used methods, but having to code all such functions twice would have been sufficiently painful that almost any kludge to accomplish a const-cast would have been justifiable. Once templates were added things might have been less painful at the source code level, but... – supercat Jul 23 '15 at 16:37
  • *"which handles pointers which might or might not be const for the benefit of other code which might or might not need to write to them"* `strchr` was designed well before `const` made it into C or C++. D&E suggests in the said example to introduce an overload `char const* strchr(const char* p, char c) { return strchr(const_cast<char*>(p), c); }` Later, Stroustrup even writes "Note that the result of casting away `const` from an object originally defined `const` is undefined (§13.3)" which deviates from today's rules, but illuminates the original purpose of `const_cast`. – dyp Jul 23 '15 at 16:40
  • ...compilation times and code size would still have been burdened by the need to compile separate const-pointer and non-const-pointer versions of a lot of methods (even if `char *foo(char*)` and `char const *foo(char const*)` perform the same action, I think the C++ standard would require that their addresses compare as distinct; thus, if `char *bar(char*)` and `char const *bar(char const*)` call the above methods, their code couldn't match unless the linker kept track of a "real" address and a "reported" address for each function (with the latter identifying a JMP to the real one). – supercat Jul 23 '15 at 16:40
  • D&E actually suggests that `strchr` overload is `inline`; so the compiler should only export it if it is indeed not inlined. However, due to the function being essentially a no-op, I think this is quite unlikely (<=> it will most likely be inlined). Yes, it will impact compilation times, but simplify const correctness. Just a trade-off. (Interestingly, TC++PL says `const_cast` is used "for getting write access to something declared as `const`") – dyp Jul 23 '15 at 16:49
  • @dyp: The `strchr` overload you show makes use of a const-cast. One could perhaps minimize code duplication for the particular case of `strchr` by starting with a version with a `const` argument/return and then `char * strchr(char *src, char c) { const char *r = strchr((char const*)src, c); return r ? src+(r-src) : 0;}`, but in most cases code duplication would end up being highly contagious. – supercat Jul 23 '15 at 17:19
  • The overload is supposed to be a C++ addition to the (unchanged and unchangeable?) C standard library. That is, `char* strchr(char*, char)` is the legacy API, and we wrap it via `char const* strchr(char const*, char)` for the cases where the argument is a `char const*` and we want const-correctness. A better example would be some `strcpy(char*, char*)`, which we would wrap via `strcpy(char* d, char const* s) { return strcpy(d, const_cast<char*>(s)); }` – dyp Jul 23 '15 at 18:38
  • @dyp: My point is that the need for const-casting would exist *even without legacy APIs*. If C's `strchr` didn't exist, but *if const-casting weren't available* and one wanted to offer a C++ version with overloads for both const-ness forms, trying to avoid duplicated code would be difficult. In some cases it may be possible to have a "non-const" wrapper around a "const" function *even without the ability to const-cast*, but const-casting would generally be necessary to minimize code duplication even if legacy APIs weren't an issue. – supercat Jul 23 '15 at 18:55
  • I agree you can use const-casting for DRY (even though I typically avoid it in favour of other solutions like forwarding to a function template). However, I disagree with your assessment in your answer that *"If writing to an object after casting away `const` was always UB, the ability to cast away const-ness wouldn't be very useful"*. As I described earlier, I think one of the main reasons why casting away constness has been allowed has nothing to do with regaining write-access through a const-qualified pointer (`strchr` with const; legacy APIs). – dyp Jul 26 '15 at 07:24
  • @dyp: If the fact that a pointer was passed to `strchr` as a `const char*` meant that writing the pointer returned from it would be forbidden despite the return type, then it would be necessary to have a separate method for the same purpose which accepted and returned a `char*` and never converted it to a `const char*`. Of course, using a separate method wouldn't satisfy the need for compatibility, but even if that weren't an issue the violation of DRY would still be a legitimate objection to such a rule. As for templates, I'm not sure to what extent they would be able to avoid duplicate... – supercat Jul 26 '15 at 19:18
  • ...machine code. My understanding is that the addresses of `const char* foo(const char *)` and `char *foo(char *)` are required to be distinct; while it might be possible for a compiler to generate two entry points with JMP instructions to one implementation, and have any direct calls simply invoke the shared implementation directly, if a compiler can't have separate addresses for "get pointer to function" and "call function", the two implementations of a function which accepts either a `char*` or `const char*` and passes the pointer to a suitable overload of `foo` would end up... – supercat Jul 26 '15 at 19:22
  • I'm sorry, I don't quite understand your latest example: *"despite the return type*" A const-correct `strchr` forwards the cv-qualification of the first argument. So if you pass a `const char*`, you get a `const char*` back. If `const_cast` couldn't remove constness for writing, then it would be forbidden to write through the return value *because* of the return type, not *despite* it. What exactly is the signature of the `strchr` you're talking about? – dyp Jul 26 '15 at 19:23
  • ...being slightly different, thus requiring that the compiler generate two functionally-identical functions. Being able to cast away `const` avoids the need for such duplication. – supercat Jul 26 '15 at 19:23
  • @dyp: Given `char const foo1[] = "Foo!";` and `char foo2[] = "Foo!";`, in order to allow both `if (strchr(foo1,ch)) ...` and `*strchr(foo2,'!')='?';` it is necessary not only that `strchr` accept a `const char*` and cast away the `const` before returning the result, but also that casting away `const` makes the resulting pointer writable. – supercat Jul 26 '15 at 19:27
  • @dyp: Further, if casting away `const` wouldn't enable writing a `const` pointer to a non-`const` object, then a method like `char *findExclamationMark(char const *st) { return strchr(st, '!'); }` would also need to have two versions--one for a `const char*` and one for a non-const `char*`. – supercat Jul 26 '15 at 19:30
  • Sorry, I still don't quite understand why you're starting with (I assume) some `char const* strchr(char const*, char)` instead of historical case `char* strchr(char*, char)`. With the latter, you don't need the *cast const away and be writeable* property. Also, the wrapper *is not exported if it is inlined*. I have tried to demonstrate both in this demo: http://coliru.stacked-crooked.com/a/407cfd846cb9bac8 – dyp Jul 26 '15 at 19:36
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/84322/discussion-between-dyp-and-supercat). – dyp Jul 26 '15 at 19:54
1

Undefined behavior means undefined behavior. The specification makes no guarantees what will happen.

That doesn't mean it won't do what you intend. Just that you're outside of the boundary of behavior that the specification states should work. The specification is there to say what will happen when you do certain things. Outside of the protection of the spec, all bets are off.

But just because you're off the edge of the map does not mean that you will encounter a dragon. Maybe it'll be a fluffy bunny.

Think of it like this:

class BaseClass {};
class Derived : public BaseClass {};

BaseClass *pDerived = new Derived();
BaseClass *pBase = new BaseClass();

Derived *pLegal = static_cast<Derived*>(pDerived);
Derived *pIllegal = static_cast<Derived*>(pBase);

C++ defines one of these casts to be perfectly valid. The other yields undefined behavior. Does that mean that a C++ compiler actually checks the type and flips the "undefined behavior" switch? No.

It means that the C++ compiler will more than likely assume that pBase actually points to a Derived and therefore perform the pointer arithmetic needed to convert pBase into a Derived*. If it doesn't actually point to a Derived, then you get undefined results.

That pointer arithmetic may in fact be a no-op; it may do nothing. Or it may actually do something. It doesn't matter; you are now outside of the realm of behavior defined by the specification. If the pointer arithmetic is a no-op, then everything may appear to work perfectly.
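For a case where the adjustment is typically not a no-op, consider multiple inheritance. This is only a sketch, and the types `First`, `Second`, and `Both` are invented for it. On common ABIs the `Second` subobject sits at a nonzero offset inside `Both`, so the downcast has to subtract that offset. The standard only guarantees the round trip shown here, not any particular layout.

```cpp
#include <cassert>

// Hypothetical types for illustration only.
struct First  { int a = 1; };
struct Second { int b = 2; };
struct Both : First, Second {};

bool round_trip_ok()
{
    Both object;
    Second* base = &object;                    // implicit upcast; may adjust the address
    Both* derived = static_cast<Both*>(base);  // valid: base really points into a Both
    assert(derived == &object);                // the adjustment is undone exactly
    return derived == &object && derived->b == 2;
}
```

If `base` pointed to a standalone `Second`, the compiler would apply the same adjustment without any check, and the resulting pointer would refer to memory that is not a `Both` at all.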

It's not that the compiler "knows" that in one instance it's undefined and in another it's defined. It's that the specification does not say what will happen. It may appear to work. It may not. The only times that it will work are when it is done properly in accord with the specification.

The same goes for const casts. If the const cast is from an object that was not originally const, then the spec says that it will work. If it's not, then the spec says that anything can happen.

Nicol Bolas
  • I can't agree about "all cases" - it's okay to cast away const if the object is not originally const. – sharptooth Dec 16 '11 at 06:44
  • Where does the specification say that? Where does it say that you can cast away `const` if the object wasn't "originally" `const`? – Nicol Bolas Dec 16 '11 at 07:04
  • This answer has a Standard reference http://stackoverflow.com/a/1542272/57428 - 7.1.5.1/4 – sharptooth Dec 16 '11 at 07:08
  • If casting away `const` was always undefined behavior, do you think the language would provide `const_cast`? – Luc Touraille Dec 16 '11 at 09:03
  • @LucTouraille: Being able to cast away const-ness is useful in two scenarios: (1) One wants to pass a const to a function which takes a non-const pointer parameter, but won't actually write to it; (2) a function takes a pointer to something that may or may not be const, has some means outside the pointer of knowing whether it is in fact const, and may want to write to it if it isn't. Casting away const in either scenario could be useful even if the other scenario was UB. In fact, both scenarios are okay. – supercat Nov 29 '12 at 22:47
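Scenario (1) from the last comment can be sketched as follows. `legacy_length` is a made-up stand-in for an API that is not const-correct but never writes through its argument. Since nothing is written, the cast is fine even though `message` really is a const object.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical legacy API: takes char* but only reads from it.
std::size_t legacy_length(char* text)
{
    std::size_t n = 0;
    while (text[n] != '\0')
        ++n;
    return n;
}

std::size_t length_of_const_string()
{
    const char message[] = "Foo!";  // genuinely const object
    // The cast only changes the pointer type; no write ever happens,
    // so there is no undefined behavior here.
    std::size_t n = legacy_length(const_cast<char*>(message));
    assert(n == 4);
    return n;
}
```

This only holds as long as the callee keeps its side of the bargain; if `legacy_length` ever wrote to `text`, the call above would become undefined.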
0

In theory, const objects are allowed to be stored in read-only memory in some cases, which would cause obvious problems if you try to modify the object, but a more likely case is that if at any point the definition of the object is visible, so that the compiler can actually see that the object is defined as const, the compiler can optimise based on the assumption that members of that object do not change. If you call a non-const function on a const object to set a member, and then read that member, the compiler could bypass the read of that member if it already knows the value. After all, you defined the object as const: you promised that that value wouldn't change.

Undefined behaviour is tricky in that it often seems to work as you expect, until you make one slight modification.