105

I read this line in a book:

It is provably impossible to build a compiler that can actually determine whether or not a C++ function will change the value of a particular variable.

The paragraph was talking about why the compiler is conservative when checking for const-ness.

Why is it impossible to build such a compiler?

The compiler can always check if a variable is reassigned, a non-const function is being invoked on it, or if it is being passed in as a non-const parameter...

Adam Stelmaszczyk
  • 19,665
  • 4
  • 70
  • 110
Cricketer
  • 2,391
  • 4
  • 24
  • 29
  • 25
    First thing that comes to my mind is dynamic link libraries. If I compile code on my machine, and you compile code on your machine, and we link them _at run time_, how could your compiler know if I modified variables or not? – Mooing Duck Jul 01 '13 at 17:20
  • If I ask for input in my program and based on that, I either modify or not not modify a given variable. Then it can never be predicted at compile-time whether I will change the value of that variable not. It might, but you can't say for sure it "will". – Aleph Jul 01 '13 at 17:23
  • 4
    @MooingDuck Exactly this. More broadly, the compiler does not compile the function individually, but compiles it as part of a broader picture which may not all be within the compiler's scope. – called2voyage Jul 01 '13 at 17:24
  • but i am not getting why anyone will want to know that the function is changing some values during its execution,as there are some other ways of keeping the track of the state of variable or its value. – nobalG Jul 01 '13 at 17:29
  • @ MooingDuck Does object code mark if a particular function is const? Also, in your header file, if you marked one or more of your functions as const or one of the parameters const, then the compiler can determine if your code modifies variables or not. Right? If yes, then even if you compiled your code on your machine and gave me your header, my compiler can determine whether the variable changed or not. Right? – Cricketer Jul 01 '13 at 17:35
  • 3
    "impossible" may be an overstatement - "computationally infeasible" (as in NP-hard) may be a better characterization, but is a little harder for the student to grasp. Imagine a linked list or other abstract data structure. If I call a function that changes one node in that list/tree/whatever, how could a compiler ever hope to prove exactly which node got modified (and maybe more importantly, which ones didn't) without basically fully simulating the program with the expected input, all while not taking 3 days to compile one source file... – twalberg Jul 01 '13 at 17:36
  • 37
    @twalberg Impossible is not an overstatement, the Halting problem applies here as several answers explain. It is simply not possible to algorithmically fully analyze a general program. – Fiktik Jul 01 '13 at 18:42
  • 2
    @Fiktik Agreed for the completely general/arbitrary case. However, many well-constructed programs doing proper input validation don't have to deal with "arbitrary input", and so may not be instances of the full halting problem, but some reduced subset that may very well be provable, although not necessarily efficiently... – twalberg Jul 01 '13 at 18:55
  • 5
    @twalberg Compilers that only compile a subset of valid programs aren't very useful. – Caleb Jul 01 '13 at 19:03
  • 1
    @twalberg I agree there definitely is a sublanguage where the halting problem does not apply. However from Turing's proof we can deduce that such sublanguage is no longer turing-complete, so really not so interesting. – Fiktik Jul 01 '13 at 19:08
  • 1
    If compilers could do that kind of analysis about how a program will behave, we wouldn't need to execute our programs. (I'm sure that's a gross overgeneralization.) – Keith Thompson Jul 01 '13 at 21:01
  • I think this is a variation of the so-called "[halting problem](http://en.wikipedia.org/wiki/Halting_problem)". Given an arbitrary program in a sufficiently rich programming language it becomes impossible to determine if control ever flows through a particular point. Note that this is not simply "difficult", it's truly impossible. – Hot Licks Jul 01 '13 at 21:45
  • You seem to forget that this is not pure math problem - "The problem is to determine, given a program *and an input to the program*, whether the program will eventually halt when run with that input." When one says "While deciding whether these programs (such as "Hello world") halt is simple, more complex programs prove problematic." - this means that you go into the territory of mathematical logic. – SChepurin Jul 02 '13 at 09:31
  • @Caleb While a "correct" compiler should be able to compile any valid program, the subset of valid programs that are actually useful is very small (but still infinite, and with no possible way to characterize the set), so even a "broken" compiler that can only handle "useful" programs (whatever that means) is still of some use... Most compilers strive to be "correct", but due to bugs and other shortcomings never quite completely reach that goal... – twalberg Jul 02 '13 at 13:52
  • 1
    `void UserChange() { int* p; scanf("%p", &p); (*p)++; }` impossible to say what will be changed here at runtime. – Felix Dombek Jul 02 '13 at 16:04
  • That book's text is unclear which obfuscates the issue. It is trying to say, "let's get an infinite number of monkeys to write every conceivable C++ function which could ever be written. There will be cases where if we pick a variable that (some particular function the monkeys wrote) uses, we can't work out whether the function will change that variable." Of course for some (even many) functions in any given application, this can be determined by the compiler, and very easily. But not for all (or necessarily most). – El Zorko Jul 02 '13 at 22:35
  • For same reason you can't detect infinite loop, (can you notice there is a break key on keyboard, you have to press yourself if you feels you your code in infinite loop) – Grijesh Chauhan Jul 03 '13 at 09:04
  • Can someone explain why the last sentence of my question is wrong? Why is this check not enough? Here's the last question: The compiler can always check if its being reassigned, a non-const function is being invoked on it or if it is being passed in as a non-const parameter. Why is this check not sufficient to determine const-ness? – Cricketer Jul 21 '13 at 09:12

13 Answers13

140

Why is it impossible to build such a compiler?

For the same reason that you can't write a program that will determine whether any given program will terminate. This is known as the halting problem, and it's one of those things that's not computable.

To be clear, you can write a compiler that can determine that a function does change the variable in some cases, but you can't write one that reliably tells you that the function will or won't change the variable (or halt) for every possible function.

Here's an easy example:

void foo() {
    if (bar() == 0) this->a = 1;
}

How can a compiler determine, just from looking at that code, whether foo will ever change a? Whether it does or doesn't depends on conditions external to the function, namely the implementation of bar. There's more than that to the proof that the halting problem isn't computable, but it's already nicely explained at the linked Wikipedia article (and in every computation theory textbook), so I'll not attempt to explain it correctly here.

Caleb
  • 124,013
  • 19
  • 183
  • 272
  • 1
    Would this change with proper quantum computing algorithms? It sounds like a perfect application for quantum computers. – mrsoltys Jul 01 '13 at 18:49
  • 49
    @mrsoltys, quantum computers are "only" exponentially faster for some problems, they can not solve undecidable problems. – zch Jul 01 '13 at 18:52
  • 8
    @mrsoltys Those exponentially complicated algorithms (like factoring) is perfect for quantum computers, but halting problem is a logical dilemma, it's not computable no matter what kind of "computer" you have. – WSBT Jul 01 '13 at 18:56
  • 7
    @mrsoltys, just to be a smartass, yes, it would change. Unfortunately, it would mean the algorithm is both terminated and still running, unfortunately, you can't tell which without directly observing, by which you affect the actual state. – Nathan Ernst Jul 01 '13 at 19:11
  • 1
    Can you explain why is this problem the same as the halting problem? – David Brown Jul 01 '13 at 20:33
  • 2
    @DavidBrown, In order to decide if a function will return a given value, you have to determine if that function can end. In the example given, it's necessary to know if there exists a scenario in which bar() will return 0. This requires solving the Halting Problem. – Martin Epsz Jul 01 '13 at 20:37
  • 3
    I don't think you have to invoke the halting problem to explain this limitation. Isn't it sufficient just to say that a function could modify a variable or not depending run-time conditions, which are not known at compile time? E.g. `foo(int x) { if (x < 1) a = 3;` – LarsH Jul 01 '13 at 20:41
  • 2
    "... you can't write a program that will determine whether any given program will terminate" - without actually _executing_ the program that is. – Thorbjørn Ravn Andersen Jul 01 '13 at 21:06
  • 9
    @ThorbjørnRavnAndersen: O.K., so, suppose I'm executing a program. How exactly do I determine whether it will terminate? – ruakh Jul 01 '13 at 21:09
  • 8
    @ThorbjørnRavnAndersen But if you *actually* execute the program, and it doesn't terminate (e.g. an infinite loop), you will never find out that it doesn't terminate... you just keep on executing one more step, because it could be the last one... – MaxAxeHax Jul 01 '13 at 21:33
  • 2
    Good answer, @Caleb. It may be worth noting, however, that while it is impossible to write a compiler that can determine whether or not a function changes a variable for _all_ possible functions, it is possible to write a compiler that can determine that _certain_ functions will modify a varaible, or that they will never modify a variable. – Dave Lillethun Jul 01 '13 at 23:32
  • 4
    Basically the statement in the original question does _not_ mean that a compiler, when asked "Does this function modify X?" can only ever answer "I don't know." However, what it _does_ mean is that compilers can group functions into "yes", "no", and "I don't know" categories, but for the set of all possible funtion the "I don't know" category can never be empty; there will always be _some_ functions for which it cannot tell. – Dave Lillethun Jul 01 '13 at 23:34
  • Thanks, @DaveL. I tried to write as much -- I emphasized *in some cases*. See also my response to twalberg's comment on the question. – Caleb Jul 01 '13 at 23:46
  • @ruakh not the point. You cannot determine _IF_ it will terminate without running it until it terminates. – Thorbjørn Ravn Andersen Jul 01 '13 at 23:56
  • 5
    @ThorbjørnRavnAndersen: You also can't determine *IF* it will terminate *by* running it until it terminates, because -- what if it doesn't? In other words -- you can't write a program that will examine some other program and is guaranteed to print "Halts" if it halts and "Doesn't halt" if it doesn't. – ruakh Jul 02 '13 at 00:02
  • @BenJackson I agree, because even if you can prove a function halts, you still may not be able to prove it doesn't modify a variable. – David Brown Jul 02 '13 at 00:19
  • @DavidBrown The connection to the halting problem isn't that you can't decide that the function halts or not (though there's that too), it's that you can't decide *for all valid functions* whether a given function does *any* specific thing. The same argument (really not the one I make above) that proves that you can't decide about halting can also be used to show that you can't decide whether the function calls a function, changes a variable, etc. nightcracker's answer illustrates this nicely. – Caleb Jul 02 '13 at 01:37
  • @BenJackson Aliasing adds complexity, but the problem remains even if you remove aliasing. I chose to focus on the fact that the compiler has insufficient information because it's easy to understand, and on the halting problem because it seals the deal. But another answer covering aliasing would certainly be helpful. – Caleb Jul 02 '13 at 01:47
  • 2
    There appears to be some confusion between the Halting Problem - which means that there are some things that are *uncomputable* and questions of difficulty (e.g. NP-Hard problems) which are *impractical* to compute. The Halting Problem arises from a fundamental property of computing (and indeed of mathematics of which computing is a subset - see http://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_theorem) - this is a function of the program *and* the input - even knowing both you cannot determine if the program will halt. – Dale M Jul 02 '13 at 01:50
  • @Dale M - You may say that "Hello world" program will halt very quickly. This is a fundamental property of computing. – SChepurin Jul 02 '13 at 09:34
  • 1
    @SChepurin: If you make "Hello world" wait for external event to terminate, the it will be impossible to predict whether it will ever stop, because external event might not happen. – SigTerm Jul 02 '13 at 23:41
126

Imagine such compiler exists. Let's also assume that for convenience it provides a library function that returns 1 if the passed function modifies a given variable and 0 when the function doesn't. Then what should this program print?

int variable = 0;

void f() {
    if (modifies_variable(f, variable)) {
        /* do nothing */
    } else {
        /* modify variable */
        variable = 1;
    }
}

int main(int argc, char **argv) {
    if (modifies_variable(f, variable)) {
        printf("Modifies variable\n");
    } else {
        printf("Does not modify variable\n");
    }

    return 0;
}
orlp
  • 112,504
  • 36
  • 218
  • 315
  • 13
    Nice! The [I am a liar paradox](http://en.wikipedia.org/wiki/Liar_paradox) as written by a programmer. – Krumelur Jul 01 '13 at 20:07
  • 30
    It's actually just a nice adaption of the famous proof for undecidability of the [halting problem](http://en.wikipedia.org/wiki/Halting_problem#Sketch_of_proof). – Konstantin Weitz Jul 01 '13 at 21:04
  • 10
    In this concrete case "modifies_variable" should return true: There's at least one execution path in which the variable is indeed modified. And that execution path is reached after a call to an external, non-deterministic function - so whole function is non-deterministic. For these 2 reasons, the compiler should take the pesimistic view and decide it does modify the variable. If the path to modifying the variable is reached after a deterministic comparison (verifiable by the compiler) yields false (i.e. "1==1") then compiler could safely say such function never modifies variable – Joe Pineda Jul 01 '13 at 23:10
  • @JoePineda How is that the pessimistic view? If it was, say, current microsecond instead of an arbitrary integer, wouldn't the "pessimistic" view be "it's still the same"? You seem to be giving `modifies_variable()` a lot more knowledge than it actually has... – Izkata Jul 02 '13 at 01:34
  • 8
    @JoePineda: The question is whether `f` modifies the variable — not whether it could modify the variable. This answer is correct. – Neil G Jul 02 '13 at 05:27
  • 1
    what if modifies_variable works for all functions other than itself. – Samhan Salahuddin Jul 02 '13 at 07:51
  • @eddardstark: Wouldn't change the answer. It's called on `f`, here, after all. And no, you can't easily fix that by outlawing indirect calls as well. – MSalters Jul 02 '13 at 10:55
  • 2
    @KonstantinWeitz exactly. This problem is at least as hard as the halting problem since the compiler must be able to decide whether or not a program halts before a variable is modified to see if the variable is actually modified. – orlp Jul 02 '13 at 11:31
  • 2
    Firstly this isn't quite the problem, should be modifies_variable(f, variable). Secondly let's assume that modifies_variable() is entirely plausible when looking at a function from the outside. Even in that (impossible) case this would cause the compiler to infinite loop. I think this obfuscates rather than highlights the problem by introducing recursion which includes the test. A simpler example would simply be a function which determines its execution path through rand() or some other external non-statically-analysable factor, which is what the original book was trying to convey and failing. – El Zorko Jul 02 '13 at 22:31
  • @ElZorko You first point about `modifies_variable(f, variable)` is true, but I kept the "API" as simple as possible for clarity. Your second point I do not agree with. Note that the book said it's __provably__ impossible. For me it's clear that the book tries to convey the mathematical aspect of it and the relation to the halting problem/similar problems. And while using a non-statically-analysable factor - such as a geiger counter hooked up to the PC - would be a simpler and more easily understandable aspect of this problem it does not reach into the fundamental impossibility of this problem. – orlp Jul 02 '13 at 23:24
  • In terms of the mathematical aspect, it's theoretically possible to "solve" the halting problem given an (entirely deterministic) Turing machine and a deterministic program with access to a limited (as in finite) amount of memory. Arguments against this come down to time and scale rather than mathematical proof (ironically). The most accessible (and conveniently, central) elements of this problem for programmers remain the aspects relating to non-deterministic programs. – El Zorko Jul 03 '13 at 08:56
  • @nightcracker, great reduction of the "modify variable" to the halting problem. Very concrete approach! – rbrito Jul 03 '13 at 14:22
  • @NeilG to answer the question "does f modify the variable" we need to know what "modifies_variable" itself does. The compiler knows not its code, or if it's deterministic, and can't execute it at compile time to know what it'd yield. Imagine "modifies_variable" was really a function that queries the user to answer "y/n" and returns a 1 or 0 based on that. Function f would modify the variable, or not, based on an external, code-opaque, non-deterministic function. So any self-respecting compiler should fall to the safe side and assume it does, occasionally, modify it. – Joe Pineda Jul 03 '13 at 23:00
  • @Izkata no, rather the opposite: I'm making the compiler really ignorant on what modifies_variable is about on purpose, to highlight my point: deciding whether a variable may modified or not really depends on the function (and the functions it may call) being deterministic or not. If all possible execution paths are deterministic then for a constant input compiler could execute at compile time and _know_ for sure if variable will be changed or not. When you put a non-det. function call in the middle (or your input depends on an external source) all you know is you know nothing. – Joe Pineda Jul 03 '13 at 23:09
  • 6
    @JoePineda: nothing prevents me from copy/pasting the code of `modifies_variable` from the compiler source, totally nullifying your argument. (assuming open-source, but the point should be clear) – orlp Jul 04 '13 at 00:32
  • @nightcracker once you do that, then we do have the haltungproblem in disguise and the original answer does apply, but not before :) – Joe Pineda Jul 04 '13 at 21:44
  • How come [Gödel's Incompleteness Theorem](http://en.wikipedia.org/wiki/Gödel's_incompleteness_theorems) hasn't even been mentioned yet? :( – sehe Mar 26 '14 at 21:50
61

Don't confuse "will or will not modify a variable given these inputs" for "has an execution path which modifies a variable."

The former is called opaque predicate determination, and is trivially impossible to decide - aside from reduction from the halting problem, you could just point out the inputs might come from an unknown source (eg. the user). This is true of all languages, not just C++.

The latter statement, however, can be determined by looking at the parse tree, which is something that all optimizing compilers do. The reason they do is that pure functions (and referentially transparent functions, for some definition of referentially transparent) have all sorts of nice optimizations that can be applied, like being easily inlinable or having their values determined at compile-time; but to know if a function is pure, we need to know if it can ever modify a variable.

So, what appears to be a surprising statement about C++ is actually a trivial statement about all languages.

Community
  • 1
  • 1
BlueRaja - Danny Pflughoeft
  • 84,206
  • 33
  • 197
  • 283
28

I think the key word in "whether or not a C++ function will change the value of a particular variable" is "will". It is certainly possible to build a compiler that checks whether or not a C++ function is allowed to change the value of a particular variable, you cannot say with certainty that the change is going to happen:

void maybe(int& val) {
    cout << "Should I change value? [Y/N] >";
    string reply;
    cin >> reply;
    if (reply == "Y") {
        val = 42;
    }
}
Sergey Kalinichenko
  • 714,442
  • 84
  • 1,110
  • 1,523
  • "It is certainly possible to build a compiler that checks whether or not a C++ function can change the value of a particular variable" No, it is not. See Caleb's answer. For a compiler to know if foo() can change a, it would have to know if it is possible for bar() to return 0. And there is no computable function that can tell all possible return values of any computable function. So there exist code paths such that the compiler won't be able to tell if they will ever be reached. If a variable is changed only in a code path that can't be reached it won't change, but a compiler won't detect it – Martin Epsz Jul 01 '13 at 18:30
  • 12
    @MartinEpsz By "can" I mean "is allowed to change", not "can possibly change". I believer that this is what OP had in mind when talking about `const`-ness checks. – Sergey Kalinichenko Jul 01 '13 at 18:33
  • @dasblinkenlight I would have to agree that I believe the OP may have meant the first one, "is allowed o change", or "may or may not change" vs. "will definitely not change". Of course I can't think of a scenario where this would be an issue. You could even modify the compiler to simply answer "may change" on any function containing either the identifier or a call to a function which has a "may change" answer attribute. That said, C and C++ are horrible languages to try this with, since they have such a loose definition of things. I think this is why const-ness would be an issue in C++ at all. – DDS Jul 02 '13 at 21:35
  • @MartinEpsz: "And there is no computable function that can tell all possible return values of any computable function". I think that checking "all possible return values" is an incorrect approach. There are mathematical systems (maxima, mathlab) that can solve equations, which means it would make sense to apply similar approach to functions. I.e. treat it as an equation with several unknowns. The problems are flow control + side effects => unsolvable situations. IMO, without those (functional language, no assignment/side effects), it would've possible to predict which path program will take – SigTerm Jul 02 '13 at 23:36
16

I don't think it's necessary to invoke the halting problem to explain that you can't algorithmically know at compile time whether a given function will modify a certain variable or not.

Instead, it's sufficient to point out that a function's behavior often depends on run-time conditions, which the compiler can't know about in advance. E.g.

int y;

int main(int argc, char *argv[]) {
   if (argc > 2) y++;
}

How could the compiler predict with certainty whether y will be modified?

LarsH
  • 27,481
  • 8
  • 94
  • 152
9

It can be done and compilers are doing it all the time for some functions, this is for instance a trivial optimisation for simple inline accessors or many pure functions.

What is impossible is to know it in the general case.

Whenever there is a system call or a function call coming from another module, or a call to a potentially overriden method, anything could happen, included hostile takeover from some hacker's use of a stack overflow to change an unrelated variable.

However you should use const, avoid globals, prefer references to pointers, avoid reusing variables for unrelated tasks, etc. that will makes the compiler's life easier when performing aggressive optimisations.

kriss
  • 23,497
  • 17
  • 97
  • 116
  • 1
    If I recall it correctly, that's the whole point of functional programming, right? By using only purely deterministic, no side-effects functions, compilers are free to do aggressive optimizations, pre-execution, post-execution, memoization and even execution at compile time. The point that I think a lot of the answerers are ignoring (or confused about) is that it _is_ indeed possible *for a well-behaved subset of all programs*. And no, this subset isn't trivial or uninteresting, actually it's very useful. But it's indeed impossible for the absolute general case. – Joe Pineda Jul 03 '13 at 23:19
  • Overloading is a compile-time concept. You probably meant "overridden method". – fredoverflow Mar 26 '14 at 22:04
  • @FredOverflow: yes, I mean overriden. Overloading is indeed a compile time concept. Thanks for spotting it (of course if the implementation comes from another compilation unit, the compiler can still have troubles analysing it, but that was not what I meant). I will fix the answer. – kriss Mar 27 '14 at 00:14
6

There are multiple avenues to explaining this, one of which is the Halting Problem:

In computability theory, the halting problem can be stated as follows: "Given a description of an arbitrary computer program, decide whether the program finishes running or continues to run forever". This is equivalent to the problem of deciding, given a program and an input, whether the program will eventually halt when run with that input, or will run forever.

Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist.

If I write a program that looks like this:

do tons of complex stuff
if (condition on result of complex stuff)
{
    change value of x
}
else
{
    do not change value of x
}

Does the value of x change? To determine this, you would first have to determine whether the do tons of complex stuff part causes the condition to fire - or even more basic, whether it halts. That's something the compiler can't do.

Community
  • 1
  • 1
Timothy Shields
  • 75,459
  • 18
  • 120
  • 173
6

Really surprised that there isn't an answer that using the halting problem directly! There's a very straightforward reduction from this problem to the halting problem.

Imagine that the compiler could tell whether or not a function changed the value of a variable. Then it would certainly be able to tell whether the following function changes the value of y or not, assuming that the value of x can be tracked in all the calls throughout the rest of the program:

foo(int x){
   if(x)
       y=1;
}

Now, for any program we like, let's rewrite it as:

int y;
main(){
    int x;
    ...
    run the program normally
    ...
    foo(x);
}

Notice that, if, and only if, our program changes the value of y, does it then terminate - foo() is the last thing it does before exiting. This means we've solved the halting problem!

What the above reduction shows us is that the problem of determining whether a variable's value changes is at least as hard as the halting problem. The halting problem is known to be incomputable, so this one must be also.

Mikaël Mayer
  • 10,425
  • 6
  • 64
  • 101
John Doucette
  • 4,370
  • 5
  • 37
  • 61
  • I'm not sure I follow your reasoning, about why our program terminates iff it changes the value of `y`. Looks to me like `foo()` returns quickly, and then `main()` exits. (Also, you're calling `foo()` without an argument... that's part of my confusion.) – LarsH Jul 02 '13 at 01:16
  • 1
    @LarsH: Iff the modified program terminates, the last function it called was f. If y was modified, f was called (the other statements can't change y, since it was only introduced by the modification). Hence, if y was modified, the program terminates. – MSalters Jul 02 '13 at 11:00
4

As soon as a function calls another function that the compiler doesn't "see" the source of, it either has to assume that the variable is changed, or things may well go wrong further below. For example, say we have this in "foo.cpp":

 void foo(int& x)
 {
    ifstream f("f.dat", ifstream::binary);
    f.read((char *)&x, sizeof(x));
 }

and we have this in "bar.cpp":

void bar(int& x)
{
  foo(x);
}

How can the compiler "know" that x is not changing (or IS changing, more appropriately) in bar?

I'm sure we can come up with something more complex, if this isn't complex enough.

Mats Petersson
  • 126,704
  • 14
  • 140
  • 227
  • The compiler can know that x is not changing in bar if bar x is passed as pass-by-reference-to-const, right? – Cricketer Jul 01 '13 at 17:30
  • Yes, but if I add a `const_cast` in foo, it would still make `x` change - I'd be in breach of the contract that says that you are not to change `const` variables, but since you can convert anything to "more const", and `const_cast` exists, the designers of the language surely had the idea in mind that sometimes there are good reasons to believe that `const` values may need changing. – Mats Petersson Jul 01 '13 at 17:34
  • @MatsPetersson: I believe that if you const_cast you get to keep all the pieces that break because the compiler may, but does not have to compensate for that. – Zan Lynx Jul 01 '13 at 17:45
  • @ZanLynx: Yes, I'm sure that's correct. But at the same time, the cast does exist, which means that someone who designed the language did have some sort of idea that "we may need this at some point" - which means it's not meant to not do anything useful at all. – Mats Petersson Jul 01 '13 at 17:47
1

It is impossible in general to for the compiler to determine if the variable will be changed, as have been pointed out.

When checking const-ness, the question of interest seems to be if the variable can be changed by a function. Even this is hard in languages that support pointers. You can't control what other code does with a pointer, it could even be read from an external source (though unlikely). In languages that restrict access to memory, these types of guarantees can be possible and allows for more aggressive optimization than C++ does.

Krumelur
  • 31,081
  • 7
  • 77
  • 119
  • 2
    One thing I wish was supported in languages would be a distinction between ephemeral, returnable, and persistable references (or pointers). Ephemeral references may only be copied to other ephemeral references, returnable ones may be copied to ephemeral or returnable ones, and persistable ones can be copied any which way. The return value of a function will be constrained by the most restrictive of the arguments that are passed as "returnable" parameters. I consider it unfortunate that in many languages, when one passes a reference there's nothing to indicate how long it may be used. – supercat Jul 01 '13 at 17:41
  • That would certainly be useful. There are of course patterns for this, but in C++ (and many other languages) it is always possible to "cheat". – Krumelur Jul 01 '13 at 17:45
  • A major way in which .NET is superior to Java is that it has a concept of an ephemeral reference, but unfortunately there is no way for objects to expose properties as ephemeral references (what I'd really like to see would be a means by which code using a property would pass an ephemeral reference to a code (along with temporary variables) that should be used to manipulate the object. – supercat Jul 01 '13 at 17:57
1

To make the question more specific I suggest the following set of constraints may have been what the author of the book may have had in mind:

  1. Assume the compiler is examining the behavior of a specific function with respect to const-ness of a variable. For correctness a compiler would have to assume (because of aliasing as explained below) if the function called another function the variable is changed, so assumption #1 only applies to code fragments that don't make function calls.
  2. Assume the variable isn't modified by an asynchronous or concurrent activity.
  3. Assume the compiler is only determining if the variable can be modified, not whether it will be modified. In other words the compiler is only performing static analysis.
  4. Assume the compiler is only considering correctly functioning code (not considering array overruns/underruns, bad pointers, etc.)

In the context of compiler design, I think assumptions 1,3,4 make perfect sense in the view of a compiler writer in the context of code gen correctness and/or code optimization. Assumption 2 makes sense in the absence of the volatile keyword. And these assumptions also focus the question enough to make judging a proposed answer much more definitive :-)

Given those assumptions, a key reason why const-ness can't be assumed is due to variable aliasing. The compiler can't know whether another variable points to the const variable. Aliasing could be due to another function in the same compilation unit, in which case the compiler could look across functions and use a call tree to statically determine that aliasing could occur. But if the aliasing is due to a library or other foreign code, then the compiler has no way to know upon function entry whether variables are aliased.

You could argue that if a variable/argument is marked const then it shouldn't be subject to change via aliasing, but for a compiler writer that's pretty risky. It can even be risky for a human programmer to declare a variable const as part of, say a large project where he doesn't know the behavior of the whole system, or the OS, or a library, to really know a variable won't change.

Χpẘ
  • 3,403
  • 1
  • 13
  • 22
0

Even if a variable is declared const, doesn't mean some badly written code can overwrite it.

//   g++ -o foo foo.cc

#include <iostream>
void const_func(const int&a, int* b)
{
   b[0] = 2;
   b[1] = 2;
}

int main() {
   int a = 1;
   int b = 3;

   std::cout << a << std::endl;
   const_func(a,&b);
   std::cout << a << std::endl;
}

output:

1
2
Mark Lakata
  • 19,989
  • 5
  • 106
  • 123
  • This happens because `a` and `b` are stack variables, and `b[1]` just happens to be the same memory location as `a`. – Mark Lakata Jul 01 '13 at 23:59
  • 1
    -1. Undefined Behavior removes all restrictions on the compiler's behavior. – MSalters Jul 02 '13 at 11:03
  • Unsure about the down vote. This is just an example that goes to the OP's original question about why can't a compiler figure out if something is truly `const` if everything is labelled `const`. It is because undefined behavior is a part of C/C++. I was trying to find a different way to answer his question rather than mention the halting problem or external human input. – Mark Lakata Jul 02 '13 at 18:44
0

To expand on my comments, that book's text is unclear which obfuscates the issue.

As I commented, that book is trying to say, "let's get an infinite number of monkeys to write every conceivable C++ function which could ever be written. There will be cases where if we pick a variable that (some particular function the monkeys wrote) uses, we can't work out whether the function will change that variable."

Of course for some (even many) functions in any given application, this can be determined by the compiler, and very easily. But not for all (or necessarily most).

This function can be easily so analysed:

static int global;

void foo()
{
}

"foo" clearly does not modify "global". It doesn't modify anything at all, and a compiler can work this out very easily.

This function cannot be so analysed:

static int global;

int foo()
{
    if ((rand() % 100) > 50)
    {
        global = 1;
    }
    return 1;

Since "foo"'s actions depends on a value which can change at runtime, it patently cannot be determined at compile time whether it will modify "global".

This whole concept is far simpler to understand than computer scientists make it out to be. If the function can do something different based on things can change at runtime, then you can't work out what it'll do until it runs, and each time it runs it may do something different. Whether it's provably impossible or not, it's obviously impossible.

El Zorko
  • 3,349
  • 2
  • 26
  • 34
  • what you say is true, but even for very simple programs for wich everything is known at compile time you won't be able to proove anything, not even that the program will stop. This is the halting problem. For instance you could write a program based on Hailstone Sequences http://en.wikipedia.org/wiki/Collatz_conjecture and make it return true if it converge to one. Compilers won't be able to do it (as it would overflow in many cases) and even mathematicians don't know if it's true or not. – kriss Jul 13 '13 at 22:04
  • If you mean "there are *some* very simple looking programs for which you cannot prove anything" I entirely agree. But Turing's classic Halting Problem proof relies essentially on a program itself being able to tell whether it halts in order to set up a contradiction. As this is mathematics not implementation. There are certainly programs it is entirely possible to statically determine at compile time whether a particular variable will be modified, and whether the program will halt. It may not be mathematically provable, but it's practically achievable in certain cases. – El Zorko Jul 14 '13 at 11:37