Ok, consider the following code:

private const int THRESHHOLD = 2;

static void Main(string[] args)
{
     string hello;

     if (THRESHHOLD > 1) return;
     Console.WriteLine(hello);        
}

Surprisingly, this code does not produce a "Use of unassigned local variable 'hello'" compile-time error. It only gives the warning "Unreachable code detected".

Even if the code is unreachable, it still uses an unassigned variable, so I'd think the right thing to do is report a compile-time error. Suppose I do the following instead:

private const int THRESHHOLD = 2;

static void Main(string[] args)
{
     string hello;

     if (THRESHHOLD > 1) return;
     hello.LMFAO();       
}

Sure enough, I get a "'string' does not contain a definition for 'LMFAO' and no extension method 'LMFAO' accepting a first argument of type 'string' could be found (are you missing a using directive or an assembly reference?)" compile time error.

Why isn't it the same with the use of an unassigned variable?

EDIT: Changed the const variable so it is less distracting. I think many are missing the point of the question, which is why, depending on the case, compile-time errors sometimes take precedence over unreachable code and sometimes do not.

InBetween
  • Just wondering: Do you ever want `TRUE` to evaluate to anything but `true`? – Brian Rasmussen Feb 23 '12 at 16:07
  • An unused variable wouldn't prevent your code from working, or compiling, which is why it's only a warning message. Trying to use a non-defined method would prevent your code from running, so is an error. I'm not sure I understand the issue. – AaronS Feb 23 '12 at 16:09
  • @BrianRasmussen Maybe he's coming from a language where the 'true' keyword is in all caps? – Servy Feb 23 '12 at 16:10
  • interesting other question which relates: http://stackoverflow.com/questions/636932/in-c-why-is-string-a-reference-type-that-behaves-like-a-value-type – Schwarzie2478 Feb 23 '12 at 16:11
  • @Brian Rasmussen: It's just an example and not the issue at all. If it makes you more comfortable, imagine that the `const` is some threshold integer value you check against, which could change in the future even if it never should. Besides, it's not about the possibility of changing. It's about why a warning has precedence over a compile-time error. string.LMFAO is never executed either, but it does cause a compile-time error. What is the difference? – InBetween Feb 23 '12 at 16:12
  • Actually, use of an unassigned local *is* typically a compiler error, unless it's in an unreachable code block. – James Michael Hare Feb 23 '12 at 16:13
  • If you want to keep the discussion more focused, remove that TRUE part, it distracts... – sinelaw Feb 23 '12 at 16:18
  • @AaronS: Then why is the second case a compile time error? I'm just curious why sometimes unreachable code takes precedence over compile time errors and in some cases it doesn't – InBetween Feb 23 '12 at 16:19
  • Because an unknown method (or variable) would be syntactically invalid, the only thing that unreachable code changes in this case is that it considers all unassigned variables assigned, because they must be assigned before they are *used*, and it can't be used in unreachable code. – James Michael Hare Feb 23 '12 at 16:24

3 Answers

If you look in the C# language specification in section 5, it states that:

For an initially unassigned variable to be considered definitely assigned at a certain location, an assignment to the variable must occur in every possible execution path leading to that location.

And further in 5.3.3.1:

v is definitely assigned at the beginning of any unreachable statement.

Since unreachable code is not on any possible execution path, the variable does not need to be assigned there to avoid an error.

As for why an unknown method is a compiler error in unreachable code while an unassigned variable isn't, consider the rules quoted above. Being unreachable doesn't exempt code from having to be valid. The code still has to compile; the only difference unreachability makes is that all initially unassigned variables are considered definitely assigned at that point. That doesn't mean you can write something invalid, like a reference to an undefined variable or method.

The error message for unassigned variables gives us a hint too in that it tells us an initially unassigned variable must be assigned before use, but because the code is unreachable, it isn't technically being used...
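As a minimal sketch of the rule in action (the names `ReachabilityDemo`, `Describe` and `Threshold` are made up for illustration and are not from the question), the following is expected to compile with only an "Unreachable code detected" warning (CS0162), not a "Use of unassigned local variable" error (CS0165), assuming warnings are not being treated as errors:

```csharp
using System;

public static class ReachabilityDemo
{
    // Hypothetical constant, mirroring the one in the question.
    private const int Threshold = 2;

    public static string Describe()
    {
        string hello; // never assigned

        // "Threshold > 1" is a constant expression equal to true, so the
        // statement following this "if" is unreachable.
        if (Threshold > 1) return "returned early";

        // Unreachable: "hello" counts as definitely assigned here, so the
        // compiler warns about unreachable code instead of rejecting the read.
        return hello;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

If `Threshold` were an ordinary (non-const) field, the condition would no longer be a constant expression, the last statement would be considered reachable, and the read of `hello` should be rejected.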

James Michael Hare
  • Yeah, the behavior is according to spec. I was more interested as to why reachability has precedence over definite assignment. Eric's example shows a case where you would want it to be so. Thanks for the answer. – InBetween Feb 23 '12 at 19:50
  • @InBetween: No worries, he is the guru :-) – James Michael Hare Feb 23 '12 at 19:52

James Michael Hare's answer gives the de jure explanation: the local variable is definitely assigned, because the code is unreachable and all local variables are definitely assigned in unreachable code. Put another way: the program is only an error if there is a way to observe the state of the uninitialized local variable. In your program there is no way to observe the local, and therefore it is not an error.

Now, I note that the compiler is not required to be infinitely clever. For example:

void M()
{
    int x = 0;
    int y;
    if (x + 0 == x) return;
    Console.WriteLine(y);
}

You know and I know that the last line of the method is unreachable, but the compiler does not know that because the reachability analyzer does not know that zero is the additive identity of integers. The compiler thinks the last line might be reachable, and so gives an error.

For more information on aspects of designing reachability and definite assignment analyzers in programming languages, see my articles on the subject:

http://blogs.msdn.com/b/ericlippert/archive/tags/reachability/

http://blogs.msdn.com/b/ericlippert/archive/tags/definite+assignment/

I note though that no one has answered the deeper question, which is why should the error be suppressed in unreachable code? As you note, we give other semantic analysis errors in unreachable code.

To consider the pros and cons of that decision, you have to think about why someone would have unreachable code in the first place. Either it is intentionally unreachable, or unintentionally unreachable.

If it is unintentionally unreachable then the program contains a bug. The warning already calls attention to the primary problem: the code is unreachable. There is something majorly wrong with the control flow of the method if there is unreachable code. Odds are good that the developer is going to have to make a serious change to the control flow of the method; any local variable analysis we do on the unreachable code is likely to be misleading noise. Let the developer fix the code so that everything is reachable, and then we'll do an analysis of the now-reachable code for control-flow-related errors.

If the unreachable code is unreachable because the developer intended it to be unreachable, then odds are good they are doing something like this:

// If we can Blah, then Frob. However, if we cannot Blah and we can Baz, then Foo.
void M()
{
    int y;
    // TODO: The Blah method has a bug and always throws right now; fix it later.
    if (false /* Blah(out y) */ )
    {
        Frob(y);
    }
    else if (Baz(out y))
    {
        Foo(y);
    }
}

Should Frob(y) be an error in this program?

Eric Lippert
  • Great stuff as always. Thanks for the answer Eric, and your example as to why the implemented behavior is desirable makes it all much clearer. – InBetween Feb 23 '12 at 19:46

When the compiler sees that code is unreachable, it will not generate code for it -- and hence, there is no issue.

If you take that return line out, then the last line becomes reachable again, the compiler will generate code for it, and will tell you that there is a problem with it.

Roy Dictus