why not allow common subexpression elimination on const nonvolatile member functions?

Question

One of the goals of C++ is to allow user-defined types to behave as nicely as built-in types. One place where this seems to fail is in compiler optimization. If we assume that a const nonvolatile member function is the moral equivalent of a read (for a user-defined type), then why not allow a compiler to eliminate repeated calls to such a function? For example

class C {
...
public:
    int get() const;
}

int main() {
    C c;
    int x{c.get()};
    x = c.get(); // why not allow the compiler to eliminate this call
}

The argument for allowing this is the same as the argument for copy elision: while it changes the operational semantics, it should work for code that follows good semantic practice, and provides substantial improvement in efficiency/modularity. (In this example it is obviously silly, but it becomes quite valuable in, say, eliminating redundant iterative safety checks when functions are inlined.)

Of course it wouldn't make sense to allow this for functions that return non-const references, only for functions that return values or const references.

My question is whether there is a fundamental technical argument against this that doesn't equally apply to copy elision.

Note: just to be clear, I am not suggesting the compiler look inside of the definition of get(). I'm saying that the declaration of get() by itself should allow the compiler to elide the extra call. I'm not claiming that it preserves the as-if rule; I'm claiming that, just as in copy elision, this is a case where we want to allow the compiler to violate the as-if rule. If you are writing code where you want a side effect to be semantically visible, and don't want redundant calls to be eliminated, you shouldn't declare your method as const.

"operational semantics" <=> "observable behavior"? The latter is standardese, and the copy-ellision rule is an exception from the as-if rule, which says ((too?) much abbreviated) that the implementation must respect observable behavior. — Deduplicator, May 18 '14 at 13:49
Copy elision may elide/change observable side-effects. The compiler is allowed any optimization that does not change the observable side-effects under the "as-if"-rule [intro.execution]. — dyp, May 18 '14 at 13:50
Can a compiler practically determine that a function is non-volatile? It seems like it might not be so easy. — nobody, May 18 '14 at 13:50
What about `mutable` state? `get` might for example do a thing as simple as counting its calls in a mutable counter. Or modify some other global state, which is also not forbidden in `const` member functions IIRC. — Bartek Banachewicz, May 18 '14 at 13:51
@BartekBanachewicz Not just mutable state, but also namespace-scope and class static variables, pointer/reference members etc. can modify the state of the program even in `const` member functions. There's also the notion of a *pure* function which doesn't have side effects, but AFAIK this notion is not standardized yet in C++. — dyp, May 18 '14 at 13:53
Might be interesting with `constexpr` functions... (would that be pure?) — Deduplicator, May 18 '14 at 13:54
@AndrewMedico you just have to solve the halting problem, right? ;-) It might be practical in non-self-modifying code, but I suspect that C++'s type system has too many escape hatches to make this easy to prove at compile time. — Rook, May 18 '14 at 13:59
@Deduplicator Maybe I should add that a *pure* function must also not rely on global state. IIRC `constexpr` functions can use the global state, and also modify it on a "non-constant-expression-path" - I mean, you can also have a `throw` in a constexpr function if there's an execution path in that function that evaluates to a constant expression, like `constexpr int foo(int p) { return p > 0 ? 42 : throw "what?"; }` — dyp, May 18 '14 at 14:05
Mutable should not be a problem; the whole point of mutable is to allow const methods that change the state of an object without changing its semantic state (e.g., caching the result). The point is that if a const operation is supposed to be semantically transparent, you should not be using it to produce a side effect. — user2949652, May 18 '14 at 14:26
I should probably restrict the suggestion to apply only to methods that return constants, or return by value. Obviously you don't want a function returning a reference to an iterator to return the same one if you call it twice. — user2949652, May 18 '14 at 14:29
A "pure" attribute might help. GCC has one of those, I think. — Kerrek SB, May 18 '14 at 14:39
Suppose `C::get` were actually `C::get_and_log`, that is, it makes a note in a log file in addition to returning the value. It would be a perfectly reasonable and useful function, and you'd still want to mark it const because it doesn't change the object, but it would have an observable side effect. You'd need a way to distinguish between these besides the `const` annotation. — Adrian McCarthy, May 18 '14 at 14:50
The "logging" argument would equally apply if you wanted to log how many times you called a copy constructor, right? So why doesn't this argument apply to copy elision? — user2949652, May 18 '14 at 15:00
`One of the goals of C++ is to allow user-defined types to behave as nicely as built-in types` Where did you hear that? — Lightness Races in Orbit, May 18 '14 at 16:45
@AdrianMcCarthy it would not make a log entry, because it would not have been called. I think that would be consistent, in this case. That is, it's not a side effect needed in isolation from the call. — Johannes Schaub - litb, May 18 '14 at 20:27

Adrian McCarthy · Answer 1 · 2014-05-18T15:47:31.833

New answer based on clarification on the question

C::get would need a stronger annotation than const. As it stands today, the const is a promise that the method doesn't (conceptually) modify the object. It makes not guarantees about interaction with global state or side effects.

Thus if the new version of the C++ standard carved out another exception to the as-if rule, as it did for copy elision, based solely on the fact that a method is marked const, it would break a lot of existing code. The standards committee seems to try pretty hard not to break existing code.

(Copy elision probably broke some code, too, but I think it's actually a pretty narrow exception compared to what you're proposing.)

You might argue that we should re-specify what const means on a method declaration, giving it this stronger meaning. That would mean you could no longer have a C::print method that's const, so it seems this approach would also break a lot of existing code.

So we would have to invent a new annotation, say pure_function. To get that into the standard, you'd have to propose it and probably convince at least one compiler maker to implement it as an extension to illustrate that it's feasible and useful.

I suspect that the incremental utility is pretty low. If your C::get were trivial (no interaction with global state and no observable side effects), then you may as well define it in the class definition, thus making it available for inlining. I believe inlining would allow the compiler to generate code as optimal as a pure_function tag on the declaration (and maybe even more so), so I wouldn't expect the incremental benefit of a pure_function tag to be significant enough to convince the standards committee, compiler makers, and language users to adopt it.

Original answer

C::get could depend on global state and it might have observable side effects, either of which would make it a mistake to elide the second call. It would violate the as-if rule.

The question is whether the compiler knows this at the time it's optimizing at the call site. As your example is written, only the declaration of C::get is in scope. The definition is elsewhere, presumably in another compilation unit. Thus the compiler must assume the worst when it compiles and optimizes the calling code.

Now if the definition of C::get were both trivial and in view, then I suppose it's theoretically possible for the compiler to realize there are no side effects or non-deterministic behavior, but I doubt most optimizers get that aggressive. Unless C::get were inlined, I imagine there would be an exponential growth in the paths to analyze.

And if you want to skip the entire assignment statement (as opposed to just the second call of C::get), then the compiler would also have to examine the assignment operator for side effects and reliance on global state in order to ensure the optimization wouldn't violate the as-if rule.

This is why the question explicitly stated "that doesn't apply equally to copy elision". Copy elision also violates the as-if rule; the point is to take advantage of meaning beyond the naive language semantics, that copy constructors are supposed to make copies, and that the programmer shouldn't take advantage of side effects of the copying process. — user2949652, May 18 '14 at 14:34
So the question isn't why the compiler doesn't do the optimization, it's why the standard doesn't carve out another exception to the as-if rule as it does for copy elision? — Adrian McCarthy, May 18 '14 at 14:46
*"To get that into the standard, you'd have to propose it [...]"* Oh, you mean like [n3477: Proposing pure](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3744.pdf)? — dyp, May 18 '14 at 16:57
That's not entirely true. Remember that the Standard library interprets `const` as having thread-safety guarantees. — Puppy, May 18 '14 at 17:18
*"it would break a lot of existing code. The standards committee seems to try pretty hard not to break existing code."*, can you give examplexs of reasonable `const` functions that modify global state or that have side effects that will break the program if omitted? — Johannes Schaub - litb, May 18 '14 at 20:22
The Committee cannot retain compatibility with only reasonable code. — Puppy, May 18 '14 at 20:49
@JohannesSchaub-litb: Your slot machine software has to make three calls to a random number generator to figure out where the reels should stop. The random number generator class is a wrapper around a hardware RNG that's constantly generating new values at high speed (that is, it doesn't generate them on demand). The method to get the current value would be const, since calling it doesn't affect the state of the RNG (the state is changing on its own, independently). If the second and third calls are elided, the second and third reels will always spin to the same position as the first. — Adrian McCarthy, May 19 '14 at 22:27
Your random number generator should be declared volatile, which would exempt it from the above suggestion. — user2949652, May 20 '14 at 18:51

score 0 · Answer 2 · edited May 23 '17 at 12:05

0

First of all const-ness of methods (or of references) is totally irrelevant for the optimizer, because constness can be casted away legally (using const-cast) and because, in case of references, there could be aliasing. Const correctness has been designed to help programmers, not the optimizer (another issue is if it really helps or not, but that's a separate unrelated discussion).

Moreover to elide a call to a function the optimizer would also need to be sure that the result doesn't depend and doesn't influence global state.

Compilers sometimes have a way to declare that a function is "pure", i.e. that the result depends only on the arguments and doesn't influence global state (like sin(x)), but how you declare them is implementation dependent because the C++ standard doesn't cover this semantic concept.

Note also that the word const in const reference describes a property of the reference, not of the referenced object. Nothing is known about the const-ness of an object that you're given a const reference of and the object can indeed change or even go out of existence while you have the reference still in your hands. A const reference means simply that you cannot change the object using that reference, not that the object is constant or that it will be constant for a while.

For a description of why a const reference and a value are two very different semantic concepts and of the subtle bugs you can meet if you confuse them see this more detailed answer.

edited May 23 '17 at 12:05

Community

1
1

answered May 18 '14 at 14:38

6502

112,025
15
165
265

See the note at the end of the question; just because const-ness *is* irrelevant for the optimizer doesn't mean that it should be. Being a copy constructor also used to be irrelevant for the optimizer, but isn't anymore. The optimizer would not have to worry about the global state any more than it would have to with copy constructors. The question was edited to eliminate the aliasing issue. – user2949652 May 18 '14 at 14:54
@user2949652: Copy construction can be elided in special and detailed circumnstances, not in general. In general the optimizer is not allowed to do anything that changes the semantic and it's good it's this way. C++ has already too many traps for UB (and C++11 added more) to also drop in the mix a liberal optimizer that can base its behavior on the lousy concept of const correctness. – 6502 May 18 '14 at 15:00
Regarding const references, not that this issue is just as much an issue for a reference to an int. Optimizers are nevertheless (sometimes) able to eliminate unnecessary reads of intseven though doing so requires substantial effort (e.g., antialias analysis). I'm just trying to make it possible for the compiler to do the same thing for user-defined types. – user2949652 May 20 '14 at 18:54
@user2949652: even with just `void foo(const int &x)` the main problem is that inside `foo` nothing says that the integer will remain constant. The only semantic meaning that the word `const` has in this context is that mutation of the referenced integer cannot be done using `x`, not that the integer cannot be mutated in other ways. A const reference is only an help for the programmer to avoid mistakes, not for the compiler to generate better code (to say it better it was **designed** to be an help for the programmer, but I'm personally not sure at all it's such a big help indeed). – 6502 May 20 '14 at 21:18
You are missing the point. The issue here is not whether a const reference to an object keeps it from changing (it doesn't). The issue is whether the compiler should be allowed to eliminate invocation of a const *method* if it can ascertain that it has no relevance other than its side effect. – user2949652 May 23 '14 at 08:45
@user2949652: You are missing the point. The compiler cannot avoid calling a const method because the fact that is `const` doesn't really tell anything about if it changes the global state or if it depends on the global state. If you have a method that is const but that does `glob++` where `glob` is a global variable clearly the call cannot be skipped. Same if the the const method instead does `return glob` and calling some other code will increment the variable. The fact that the method is `const` is totally irrelevant, what counts if it's "pure" but this concept is not present in C++. – 6502 May 23 '14 at 09:05
You are missing the point (or at least, my point). This has nothing to do with purity. Purity has no effect on the observable behavior. I'm suggesting a change that *does* have an effect on the allowed observable behaviors. That change is: if you declare a method as const, you should be okay with a call to be eliminated under certain circumstances, just like you should be willing to allow copy construction to be eliminated under certain circumstances. – user2949652 May 30 '14 at 03:54
@user2949652: what you're looking for is something like the `pure` attribute of g++ (see https://gcc.gnu.org/onlinedocs/gcc-4.9.0/gcc/Function-Attributes.html) but for instance methods. This is of course an interesting concept (like e.g. the `malloc` attribute) but it's not part of standard C++ and IMO will not be anytime soon. Mixing up this concept with `const` is however in my opinion a very bad idea and would break most non-trivial programs. – 6502 May 30 '14 at 06:47

score 0 · Answer 3 · answered May 18 '14 at 16:39

The first answer to your question from Adrian McCarthy was just about as clear as possible:

The const-ness of a member function is a promise that no modification of externally visible state will be made (baring mutable variables in an object instance, for example).

You would expect a const member function which just reported the internal state of an object to always return the same answer. However, it could also interact with the ever changing real world and return a different answer every time.

What if it is a function to return the current time?

Let us put this into an example.

This is a class which converts a timestamp (double) into a human readable string.

class time_str {
    // time and its format
    double time;
    string time_format;
public:
    void set_format(const string& time_format);
    void set_time(double time);
    string get_time() const;
    string get_current_time() const;
};

And it is used (clumsily) like so:

time_str a;
a.set_format("hh:mm:ss");
a.set_time(89.432);
cout << a.get_time() << endl;

So far so good. Each invocation to a.get_time(); will return the same result.

However, at some point, we decide to introduce a convenience function which returns the current time in the same format:

cout << a.get_time() << " is different from " << a.get_current_time() << endl;

It is const because it doesn't change the state of the object in any way (though it accesses the time format). However, obviously each call to get_current_time() must return a different answer.

*"What if it is a function to return the current time?"*. Did you invent a time machine here? If I don't ask for the time anymore from now on, can I stop the world clock advancing? Fun aside, I don't see something wrong if it returned the current time and the compiler omitted calling it :) — Johannes Schaub - litb, May 18 '14 at 20:25
It's perfectly fine to declare a function that returns the current time const. However, it should also be marked volatile, and so the proposal above would not apply. — user2949652, May 18 '14 at 23:13
I don't think it should be volatile at all. Volatility implies that someone unbeknownest to the user has write access to the internal state of the object. Quoting from the c++ standard: "volatile is a hint to the implementation to avoid aggressive optimization involving the object because the value of the object might be changed by means undetectable by an implementation". In this case, No one is changing any internal volatile state of the object. The object state never changes! What changes is the external context in which the function is called. — Leo Goodstadt, Jul 25 '14 at 01:36
@Leo: The standard defines volatility by saying that volatile accesses have to happen in program order. That includes situations where the object has no internal state, e.g. writing to MMIO to send an IPI. So it makes perfect sense to declare a time function as volatile. — user2949652, Feb 23 '17 at 18:35

why not allow common subexpression elimination on const nonvolatile member functions?

3 Answers3

Linked