45

I'm familiar with the C# specification, section 5.3 which says that a variable has to be assigned before use.

In C and unmanaged C++ this makes sense as the stack isn't cleared and the memory location used for a pointer could be anywhere (leading to a hard-to-track-down bug).

But I am under the impression that there are not truly "unassigned" values allowed by the runtime. In particular that a reference type that is not initialized will always have a null value, never the value left over from a previous invocation of the method or random value.

Is this correct, or have I been mistakenly assuming that a check for null is sufficient all these years? Can you have truly uninitialized variables in C#, or does the CLR take care of this and there's always some value set?

Peter Mortensen
jmoreno

6 Answers

70

I am under the impression that there are not truly "unassigned" values allowed by the runtime. In particular that a reference type that is not initialized will always have a null value, never the value left over from a previous invocation of the method or random value. Is this correct?

I note that no one has actually answered your question yet.

The answer to the question you actually asked is "sorta".

As others have noted, some variables (array elements, fields, and so on) are classified as being automatically "initially assigned" to their default value (which is null for reference types, zero for numeric types, false for bools, and the natural recursion for user-defined structs).
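A minimal sketch of those defaults, assuming a hypothetical `Holder` class and `Point` struct that are not part of the question. The compiler may warn that the fields are never assigned, which is expected here:

```csharp
using System;

struct Point          // hypothetical user-defined struct
{
    public int X;
    public string Label;
}

class Holder          // hypothetical class whose fields are never assigned
{
    public int Count;
    public bool Flag;
    public string Name;
    public Point Origin;
}

static class DefaultsDemo
{
    static void Main()
    {
        var h = new Holder();
        var items = new object[2];

        Console.WriteLine(h.Count);                // 0
        Console.WriteLine(h.Flag);                 // False
        Console.WriteLine(h.Name == null);         // True
        Console.WriteLine(h.Origin.X);             // 0 (struct fields default recursively)
        Console.WriteLine(h.Origin.Label == null); // True
        Console.WriteLine(items[0] == null);       // True (array elements default too)
    }
}
```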

Some variables are not classified as initially assigned; local variables in particular are not initially assigned. They must be classified by the compiler as "definitely assigned" at all points where their values are used.
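As an illustration of definite assignment, here is a small sketch with a hypothetical `Describe` method. The local `label` has no initializer, but it is assigned on every path before it is read, so the compiler accepts it. The commented-out variant shows the shape that would be rejected:

```csharp
using System;

static class DefiniteAssignmentDemo
{
    // Compiles: 'label' is assigned on every path before it is read.
    public static string Describe(int n)
    {
        string label;
        if (n < 0)
        {
            label = "negative";
        }
        else
        {
            label = "non-negative";
        }
        return label;
    }

    // Would not compile if uncommented: 'label' is unassigned when n >= 0.
    // public static string Broken(int n)
    // {
    //     string label;
    //     if (n < 0) { label = "negative"; }
    //     return label; // error CS0165: use of unassigned local variable
    // }

    static void Main()
    {
        Console.WriteLine(Describe(-1)); // negative
        Console.WriteLine(Describe(5));  // non-negative
    }
}
```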

Your question then is actually "is a local variable that is classified as not definitely assigned actually initially assigned the same way that a field would be?" And the answer to that question is yes, in practice, the runtime initially assigns all locals.

This has several nice properties. First, you can observe them in the debugger to be in their default state before their first assignment. Second, there is no chance that the garbage collector will be tricked into dereferencing a bad pointer just because there was garbage left on the stack that is now being treated as a managed reference. And so on.

The runtime is permitted to leave the initial state of locals as whatever garbage happened to be there if it can do so safely. But as an implementation detail, it does not ever choose to do so. It zeros out the memory for a local variable aggressively.

The reason then for the rule that locals must be definitely assigned before they are used is not to prevent you from observing the garbage uninitialized state of the local. That is already unobservable because the CLR aggressively clears locals to their default values, the same as it does for fields and array elements. The reason this is illegal in C# is because using an unassigned local has high likelihood of being a bug. We simply make it illegal, and then the compiler prevents you from ever having such a bug.

Peter Mortensen
Eric Lippert
  • I would imagine that as well as this reason (and it's a good reason, ever see a CS0165 that wasn't a bug or at least brittle and unclear?) the fact that C# is low on undefined but allowed behaviours gives you more freedom to think about such implementation details without worrying about code written to use "it's undefined but we all know it does..." behaviour, breaking. – Jon Hanna Jan 20 '12 at 10:05
  • @JonHanna: Sure: the pattern "declaration, conditional assignment, test for value". It's used for fields all the time. No reason why it should be any more fragile or error-prone as a local than for a field. Of course if the runtime doesn't enforce clearing locals, then either the compiler should do so or otherwise try to prevent it being buggy. In this case, apparently the runtime clears them, but isn't obligated to do so... – jmoreno Jan 20 '12 at 21:26
  • @jmoreno Conceptually, the runtime doesn't clear them. Eric's answer above tells us that it does really, but while that's interesting it isn't practical information to callers (I can't even think of a way to make use of it in an optimisation, can you?). The reason conditional assignment isn't brittle is that we generally don't have fully conditional assignments. E.g. any `out` call must set the value to something unless it throws an exception. This rule is strictly a waste - if we received false then we shouldn't use the value - but removes the brittleness that would exist otherwise. – Jon Hanna Jan 21 '12 at 02:37
  • @JonHanna: given that the compiler can't rely upon it, I would be hesitant to use it for an optimization, but if it was reliable (while the two products are conceptually distinct, they are from the same company so it might be deemed reliable) there is at least one obvious minor optimization: ignoring the first assignment to variables. As for `out`, that's a different situation, as you aren't required to use it in functions that return true or false. It's sugar but not a waste; it's useful for exactly what it does, ensuring that all code paths set a value. If you didn't want that, you wouldn't use it. – jmoreno Jan 21 '12 at 05:14
  • @jmoreno my point is that I can't think of a way of turning this into any possible use, optimisation, perverse pessimisation, or anything. The return-bool-and-out pattern I mentioned only because it's the closest I could think of to somewhere where we get a lack of meaningful assignment, but of course, there's still an assignment so there's no brittleness. – Jon Hanna Jan 22 '12 at 21:56
  • @ericlippert "using an unassigned local has high likelihood of being a bug". Isn't this true for type fields as well? Why does C# allow the use of an unassigned field then? – Raikol Amaro May 10 '19 at 00:42
  • Start by reading the definite assignment algorithm specification. After that, describe for me an algorithm that works the same but on fields. If you can't describe such an algorithm then we have no reason to believe that the compiler team knows one! – Eric Lippert May 10 '19 at 00:53
  • If there is no algorithm to detect unassigned fields then obviously there cannot be an implementation of it in the compiler. – Eric Lippert May 10 '19 at 00:54
9

As far as I'm aware, every type has a designated default value.

As per this document, fields of classes are assigned the default value.

http://msdn.microsoft.com/en-us/library/aa645756(v=vs.71).aspx

This document says that the following always have default values assigned automatically.

  • Static variables.
  • Instance variables of class instances.
  • Instance variables of initially assigned struct variables.
  • Array elements.
  • Value parameters.
  • Reference parameters.
  • Variables declared in a catch clause or a foreach statement.

http://msdn.microsoft.com/en-us/library/aa691173(v=vs.71).aspx

More information on the actual default values here: Default values of C# types (C# reference)
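For a quick look at a few of those default values, here is a small sketch using the `default(T)` expression. The `DefaultOf` helper is a hypothetical name introduced only for this illustration:

```csharp
using System;

static class DefaultValuesDemo
{
    // Hypothetical helper: returns the default value of any type T.
    public static T DefaultOf<T>() => default(T);

    static void Main()
    {
        Console.WriteLine(DefaultOf<int>());            // 0
        Console.WriteLine(DefaultOf<bool>());           // False
        Console.WriteLine(DefaultOf<string>() == null); // True
        Console.WriteLine(DefaultOf<int?>().HasValue);  // False
        Console.WriteLine(DefaultOf<DateTime>() == DateTime.MinValue); // True
    }
}
```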

Peter Mortensen
Joe
5

It depends on where the variable is declared. Variables declared within a class are automatically initialized using the default value.

class Example
{
    object o; // Field: automatically initialized to null

    void Method()
    {
        if (o == null)
        {
            // This will execute
        }
    }
}

Variables declared within a method are not treated as initially assigned. The compiler checks that each one is definitely assigned at every point where its value is read, so the following code will not compile.

void Method()
{
    object o;
    if (o == null) // Compile error CS0165 on this line: use of unassigned local variable
    {
    }
}
Peter Mortensen
Tim B
3

In particular that a reference type that is not initialized will always have a null value

I think you are mixing up local variables and member variables. Section 5.3 talks specifically about local variables. Unlike member variables that do get defaulted, local variables never default to the null value or anything else: they simply must be assigned before they are first read. Section 5.3 explains the rules that the compiler uses to determine if a local variable has been assigned or not.

Peter Mortensen
Sergey Kalinichenko
1

There are 3 ways that a variable can be assigned an initial value:

  1. By default -- this happens (for example) if you declare a class variable without assigning an initial value, so the initial value gets default(type) where type is whatever type you declare the variable to be.

  2. With an initializer -- this happens when you declare a variable with an initial value, as in int i = 12;

  3. Any point before its value is retrieved -- this happens (for example) if you have a local variable with no initial value. The compiler ensures that you have no reachable code paths that will read the value of the variable before it is assigned.

At no point will the compiler allow you to read the value of a variable that hasn't been initialized, so you never have to worry about what would happen if you tried.
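The three cases can be sketched side by side. The `Counter` class and its member names are hypothetical, chosen only for this illustration:

```csharp
using System;

class Counter   // hypothetical class
{
    private int byDefault;            // 1. By default: starts at default(int), which is 0
    private int withInitializer = 12; // 2. With an initializer

    public int Sum(bool addExtra)
    {
        int extra;                    // 3. No initial value here...
        if (addExtra)
        {
            extra = 1;
        }
        else
        {
            extra = 0;
        }
        // ...but 'extra' is assigned on every path before it is read.
        return byDefault + withInitializer + extra;
    }
}

static class ThreeWaysDemo
{
    static void Main()
    {
        Console.WriteLine(new Counter().Sum(true));  // 13
        Console.WriteLine(new Counter().Sum(false)); // 12
    }
}
```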

Gabe
-1

All primitive data types have default values, so there isn't any need to worry about them.

All reference types are initialized to null values, so if you leave your reference types uninitialized and then call some method or property on that null ref type, you would get a runtime exception which would need to be handled gracefully.

Again, all Nullable types need to be checked for null or default value if they are not initialized as follows:

    int? num = null;
    if (num.HasValue)
    {
        System.Console.WriteLine("num = " + num.Value);
    }
    else
    {
        System.Console.WriteLine("num = Null");
    }

    // y is set to zero
    int y = num.GetValueOrDefault();

    // num.Value throws an InvalidOperationException if num.HasValue is false
    try
    {
        y = num.Value;
    }
    catch (System.InvalidOperationException e)
    {
        System.Console.WriteLine(e.Message);
    }

But you will not get a compile error if you leave fields uninitialized; for those, it's only the run time you need to worry about. Local variables are different: reading a local before it has been definitely assigned is a compile-time error.

Peter Mortensen
S2S2