38

I was doing some small programs in java. I know that if I write while(true); the program will freeze in this loop. If the code is like that:

Test 1:

public class While {
    public static void main(String[] args) {
        System.out.println("start");
        while (true);
        System.out.println("end");
    }
}

The compiler throws me the error:

Exception in thread "main" java.lang.Error: Unresolved compilation problem: 
    Unreachable code
    at While.main(While.java:6)

I didn't know that this error exists. But I got why it is thrown. Of course, line 6 was unreachable, causing a compilation problem. Then I tested this:

Test 2:

public class While {
    public static void main(String[] args) {
        System.out.println("start");
        a();
        b();
    }
    static void a() {
        while(true);
    }
    static void b() {
        System.out.println("end");
    }
}

For some reason the program ran normally (The console printed "start" and then froze). The compiler couldn't check inside of void a() and see that it isn't reachable. To be sure I tried:

Test 3:

public class While {
    public static void main(String[] args) {
        System.out.println("start");
        a();
        System.out.println("end");
    }
    static void a() {
        while(true);
    }
}

Same result as Test 2.

After some research I found this question. So, if the code inside the parentheses is a variable the compiler wouldn't throw the exception. That makes sense, but I don't think that the same applies to voids.

Q: So, why does the compiler just throw me the error at Test 1, if void b() (Test 2) and System.out.println("end"); (Test 3) isn't reachable?

EDIT: I tried Test 1 in C++:

#include <iostream>

using namespace std;

int main()
{
    cout << "start" << endl;
    while(true);
    cout << "end" << endl;
    return 0;
}

The compiler didn't throw any errors, then I got the same result as Test 2 and as Test 3. So I suppose this is a java thing?

Community
  • 1
  • 1
A Cat
  • 629
  • 7
  • 20
  • 11
    Java compiler does not "look into" other methods for it's *primitive* static analysis of reachable/non-reachable code. (Doing so would introduce many complications with virtual methods / polymorphism and separate compilation units.) – user2864740 Jun 09 '14 at 23:06
  • 2
    Excellently asked question. – Qix - MONICA WAS MISTREATED Jun 09 '14 at 23:06
  • 1
    @user2864740 is correct. Looking for reachable/non-reachable code across the entire program is equivalent to solving the halting problem (if memory serves), so it can't be done as well as most people would want. – awksp Jun 09 '14 at 23:11
  • 4
    @user3580294 *Providing full and perfectly accurate* reachability analysis across the entire program would be equivalent to solving the halting problem. However, there are plenty of ways to *look* and get incomplete but useful results back, using a more advanced heuristic. (In general, whenever *anyone* says a given compiler task is equivalent to solving the halting problem, they're wrong for this reason. "Halting problem" should be one of your insta-"no" trigger phrases.) – Alex Celeste Jun 10 '14 at 03:46
  • 1
    @Leushenko Well, I was more than a little vague, but I don't think I said that it is an impossible task, period. I just said that "it can't be done as well as most people would want," which leaves plenty of room open to allow for the compiler to get *some* information still. Thanks for the clarification though, especially for the italicized part. – awksp Jun 10 '14 at 03:50
  • @Leushenko: Beyond the trivial case, what's the point? You end up throwing compile time at a problem which you inherently cannot solve in anything but the most trivial case. – Phoshi Jun 10 '14 at 09:04
  • @Phoshi that point is that you *can* solve it in a huge number of nontrivial cases. Whole-program compilers do this sort of thing all the time. – Alex Celeste Jun 10 '14 at 09:19
  • @Leushenko: It's perfectly legal to mess with all kinds of things via reflection or bytecode patching that make that essentially impossible. Does `f()` halt? Who knows, it might not be the same `f()` when you actually get around to running it. Now, if you *do* that you have bigger problems, but the fact that it can be done does mean that compile-time errors on perfectly legal code is a problem. – Phoshi Jun 10 '14 at 09:57

4 Answers4

20

The language spec has an exact definition what the compiler should treat as unreachable code, see also https://stackoverflow.com/a/20922409/14955.

In particular, it does not care about if a method completes, and it does not look inside other methods.

It won't do more than that.

However, you could get static code analysis tools like FindBugs to get a "deeper" analysis (not sure if they detect the pattern you described, though, either, and, as has been pointed out by others, the halting problem in all generality cannot be algorithmically solved anyway, so one has to draw the line at some reasonably definition of "best effort").

Community
  • 1
  • 1
Thilo
  • 257,207
  • 101
  • 511
  • 656
12

In general, it is not possible to determine with absolute certainly whether or not something is reachable.

Why? It is the equivalent to the Halting Problem.

The halting problem asks:

Given a description of an arbitrary computer program, decide whether the program finishes running or continues to run forever.

This problem has been proven to be unsolvable.


Whether or not X piece of code is reachable is the same as saying whether the code before it will halt.

Because it is an unsolvable problem, the compiler (in Java or any other language) doesn't try very hard to solve it. If it happens to determine that it really is unreachable, then you get the warning. If not, it may or may not be reachable.

In Java, unreachable code is a compiler error. So in order to maintain compatibility, the language spec defines exactly "how hard" the compiler should try. (Which according to the other answers, is "don't go inside another function".)

In other languages (such as C++), the compiler may go further subject to optimizations. (Unreachable code may be detected after inlining the function and discovering that it never returns.)

Mysticial
  • 464,885
  • 45
  • 335
  • 332
  • 4
    "Because it is an unsolvable problem, the compiler doesn't try very hard to solve it". Well, it could try a little bit harder if computation cost was the only concern. But its hands are bound by the language spec. At most, it could issue some warnings (but not errors). – Thilo Jun 09 '14 at 23:14
  • 4
    The JLS rules are far more significant for this than the undecidability of the halting problem. The compiler **could** do some level of inter-procedure analysis, but the JLS excludes that. – Patricia Shanahan Jun 09 '14 at 23:14
  • 2
    @Thilo The question becomes at what point should the compiler stop trying to solve it. I guess in the case of Java, the language spec has decided that. – Mysticial Jun 09 '14 at 23:20
  • 4
    Right. And the spec has to be definite on this because we don't want incompatible compilers. – Thilo Jun 09 '14 at 23:22
  • 3
    @Thilo Oh, I didn't realize that unreachable code was a compiler *error* in Java. It's merely a *warning* in the other languages that I work with (C++/C#). Yeah, in that case I agree that it would need to be specified. – Mysticial Jun 09 '14 at 23:29
12

Unreachable code is a compile time error that simply says 'the flow of this program doesn't make sense; something will never ever be reached'.

Obviously your tests perform how they do due to an endless loop, but why does the first fail with a compile time error?

A while statement can complete normally if at least one of the following is true:

  • The while statement is reachable and the condition expression is not a constant expression (§15.28) with value true.

  • There is a reachable break statement that exits the while statement.

Alright, but what about methods invocations (such as a()) - why do tests 2 and 3 successfully compile?

  • An expression statement can complete normally if it is reachable.

Since a method invocation is considered an expression, they will always be reachable so long as nothing before it blocks its path of logical execution.


To better illustrate some reasoning behind this compilation mechanism, let's take an if statement, for example.

if(false)
   System.out.println("Hello!"); // Never executes

The above will be correct at compile time (though many IDE's will definitely whine!).

The Java 1.7 Specification talks about this:

The rationale for this differing treatment is to allow programmers to define "flag variables" such as:

static final boolean DEBUG = false;

and then write code such as:

if (DEBUG) { x=3; }

The idea is that it should be possible to change the value of DEBUG from false to true or from true to false and then compile the code correctly with no other changes to the program text.

Further, there is actually a backwards compatibility reason as well:

This ability to "conditionally compile" has a significant impact on, and relationship to, binary compatibility (§13). If a set of classes that use such a "flag" variable are compiled and conditional code is omitted, it does not suffice later to distribute just a new version of the class or interface that contains the definition of the flag. A change to the value of a flag is, therefore, not binary compatible with pre-existing binaries (§13.4.9). (There are other reasons for such incompatibility as well, such as the use of constants in case labels in switch statements; see §13.4.9.)


Most (per the spec), if not all, implementations of the Java compiler do not traverse into methods. When parsing your Java code itself, it sees a() as just a MethodInvocationElement, meaning 'This code calls other code. I really don't care, I'm just looking at syntax.'. Syntactically, it makes sense for subsequent code to belong after a call to a().

Keep in mind performance costs. Compilation already takes a considerable amount of time. In order to keep things quick, the Java compiler doesn't actually recurse into methods; that would take ages (the compiler would have to evaluate many, many paths of code -- in theory).


To further reiterate that it's syntactically driven is to add a return; statement directly after your loop in a(). Doesn't compile, does it? Syntactically, though, it makes sense without it.

Qix - MONICA WAS MISTREATED
  • 14,451
  • 16
  • 82
  • 145
6

The answer lies in the rules set out for reachability by the Java Language Specification. It first states

It is a compile-time error if a statement cannot be executed because it is unreachable.

And then

A while statement can complete normally iff at least one of the following is true:

  • The while statement is reachable and the condition expression is not a constant expression (§15.28) with value true.
  • There is a reachable break statement that exits the while statement.

and

An expression statement can complete normally iff it is reachable.

In your first example, you have a while loop that cannot complete normally because it has a condition which is a constant expression with value true and there is no reachable break within it.

In your second and third examples, the expression statement (method invocation) is reachable and can therefore complete normally.


So I suppose this is a java thing?

The rules above are Java's rules. C++ probably has its own rules, as do other languages.

Sotirios Delimanolis
  • 274,122
  • 60
  • 696
  • 724
  • 1
    The only part of your answer that actually addresses his question is the last statement. – Qix - MONICA WAS MISTREATED Jun 09 '14 at 23:26
  • 1
    @Qix Does it address it completely? Or is there something missing? – Sotirios Delimanolis Jun 09 '14 at 23:27
  • 1
    If you took out the irrelevant parts this could be considered a link only answer. Just throwing that out there. – Qix - MONICA WAS MISTREATED Jun 09 '14 at 23:28
  • 2
    @Qix The question `So, why do the compiler just throws me the error at Test 1, if void b() (Test 2) and System.out.println("end"); (Test 3) isn't reachable?` The links and quotes explain the _why_. The before-last paragraph explains how it applies to the `while` in Test 1, and the last paragraph explains how it applies to the Tests 2 and 3. Is there something I missed? What is irrelevant? – Sotirios Delimanolis Jun 09 '14 at 23:30