Why does Animal[] animals = new Cat[5] compile, but List<Animal> animals = new List<Cat>() does not?

Question

In his book C# in Depth, Jon Skeet tries to answer the following question:

Why can't I convert List<string> to List<object>?

To explain it, he started with a code-snippet, which includes these two lines:

Animal[] animals = new Cat[5]; //Ok. Compiles fine!            
List<Animal> animals = new List<Cat>(); //Compilation error!

As the comments say, the first line compiles fine, but the second gives a compilation error. I didn't really understand the reason. Jon Skeet explains it by saying only that the first one compiles because arrays in .NET are covariant, and the second doesn't compile because generics are not covariant (they're invariant instead). Furthermore, arrays are covariant in .NET because arrays are covariant in Java, and .NET was made similar to Java.

I'm not completely satisfied with this short answer. I want to understand it in more detail, with some insight into how the compiler handles the difference, how it generates IL, and so on.

Also, if I write (taken from the book itself):

Animal[] animals = new Cat[5]; //Ok. Compiles fine!
animals[0] = new Turtle(); //this too compiles fine!

It compiles fine, but fails at runtime. If it has to fail at runtime (which means what I wrote doesn't make sense), then why does it compile in the first place? Is there any way to use the instance animals in my code that runs with no runtime error?

It's not obvious which bit is unclear to you... it's just a property of the CLI, that reference type array covariance *is* supported, even though it's unsafe (in that array store operations have to be checked for safety, individually). I'm happy to go into more details of specific parts, but at the moment it's hard to know where to start. — Jon Skeet, Sep 28 '11 at 19:04
What part of Jon's answer didn't you understand? Is it that you don't understand what invariant or covariant means? Basically, please be a little more specific in the question you are trying to ask. — NotMe, Sep 28 '11 at 19:05
@Nawaz: And the answer is simply "because the CLR supports variance for reference type arrays, but not for arbitrary generic types". (Appropriate variance is supported for delegates and interfaces, of course.) — Jon Skeet, Sep 28 '11 at 19:10
I think I'd have lived a happier life if I didn't know C# allowed that construct... — Blindy, Sep 28 '11 at 19:15

score 10 · Answer 1 · edited Jun 20 '20 at 09:12

Arrays have a weird history with variance in .NET. Proper support for variance was added in version 2.0 of the CLR, and in the language in C# 4.0. Arrays, however, have always been covariant.

Eric Lippert goes into great detail on that in a blog post.

The interesting bit:

Ever since C# 1.0, arrays where the element type is a reference type are covariant. This is perfectly legal:

Animal[] animals = new Giraffe[10];

Since Giraffe is smaller than Animal, and “make an array of” is a covariant operation on types, Giraffe[] is smaller than Animal[], so an instance fits into that variable.

Unfortunately, this particular kind of covariance is broken. It was added to the CLR because Java requires it and the CLR designers wanted to be able to support Java-like languages. We then up and added it to C# because it was in the CLR. This decision was quite controversial at the time and I am not very happy about it, but there’s nothing we can do about it now.

Emphasis added by myself.

score 7 · Answer 2 · answered Sep 28 '11 at 19:54

If it has to fail at runtime, then why does it compile in the first place?

That's precisely why array covariance is broken. The fact that we allow it means that we allow what should be an error that gets caught at compilation time to be ignored, and instead get caught at runtime.
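To make the compile-time versus run-time split concrete, here is a minimal sketch. It assumes the Animal, Cat and Turtle classes from the question; the class and method names (ArrayCovarianceDemo, StoreTurtleFails, StoreCatSucceeds) are made up for illustration. The assignment compiles, and the CLR rejects the bad store with an ArrayTypeMismatchException:

```csharp
using System;

class Animal { }
class Cat : Animal { }
class Turtle : Animal { }

// Hypothetical demo class; the names are illustrative only.
static class ArrayCovarianceDemo
{
    // Returns true if storing a Turtle into a Cat[] (seen as Animal[]) throws.
    public static bool StoreTurtleFails()
    {
        Animal[] animals = new Cat[5];   // compiles: array covariance
        try
        {
            animals[0] = new Turtle();   // compiles, but the store is checked at runtime
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;                 // the array really is a Cat[]
        }
    }

    // Storing a Cat is fine, because it matches the real element type.
    public static bool StoreCatSucceeds()
    {
        Animal[] animals = new Cat[5];
        animals[0] = new Cat();
        return animals[0] is Cat;
    }

    static void Main()
    {
        Console.WriteLine(StoreTurtleFails());   // True
        Console.WriteLine(StoreCatSucceeds());   // True
    }
}
```

So the variable is usable without runtime errors as long as you only read from it, or only store elements that are compatible with the array's actual element type.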

I'm not completely satisfied with this short answer. I want to understand it in more detail, with some insight into how the compiler handles the difference ...

The compiler handles the difference easily enough. The compiler has a whole bunch of code that determines when one type is compatible with another. Part of that code deals with array-to-array conversions. Part of that code deals with generic-to-generic conversions. The code is a straightforward translation of the relevant lines of the specification.

... and how it generates IL and all.

There is no need to generate any IL whatsoever for a covariant array conversion. Why would we need to generate IL for a conversion between two reference-compatible types? It's like asking what IL we generate for converting a string to an object. A string already is an object, so there's no code generated.
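One way to observe this from C# itself, without reading IL, is to check that the "converted" array is the very same object. This is only a sketch of the observable behaviour (the names ReferenceConversionDemo and SameObjectAfterConversion are made up for illustration), not a dump of what the compiler emits:

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
static class ReferenceConversionDemo
{
    // Returns true if the object[] reference points at the original string[]
    // and the runtime type of the array is unchanged.
    public static bool SameObjectAfterConversion()
    {
        string[] strings = { "a", "b" };
        object[] objects = strings;      // implicit reference conversion, no copy

        return object.ReferenceEquals(strings, objects)
            && objects.GetType() == typeof(string[]);
    }

    static void Main()
    {
        Console.WriteLine(SameObjectAfterConversion());   // True
    }
}
```

Nothing is copied or wrapped: the array keeps its runtime type string[], which is also why a later store of a non-string through the object[] reference has to be checked.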

Eric Lippert

Ok I'm confused about that last bit and maybe this should be another question, but I thought the output of the compiler was IL. Unless you're saying that the compiler simply doesn't do anything *special* when setting a variable that is assignment compatible with the value. – Conrad Frix Sep 28 '11 at 20:05
Out of interest Eric, do you happen to know if the JIT compiler is smart enough to avoid the execution-time type check on storing an element in a reference type array when the "known" type of the array is sealed? For example, writing "foo" via a `object[]` reference needs checking because it could be a `Button[]`, but writing "foo" via a `string[]` reference *must* be valid (in terms of the type), because it couldn't refer to a value of any type other than an actual string array. – Jon Skeet Sep 28 '11 at 20:05
@ConradFrix: Right, we just generate IL that assigns the variable, assuming that you are in fact assigning a variable. If the value is of a type that is *assignment compatible* with the type of the variable then by definition, you don't need to generate any conversion code. That's what "assignment compatible" means. The question was not "what code is generated on an assignment" but rather "what code is generated on a conversion". No code is generated on a conversion from `string[]` to `object[]`, just as no code is generated on a conversion from `string` to `object`. – Eric Lippert Sep 28 '11 at 21:18
@JonSkeet: I do not know for sure but I would definitely be very surprised if the jitter generated a type check when the type was statically known to be sealed. – Eric Lippert Sep 28 '11 at 21:18
Have you considered deprecating array covariance or at least adding a warning in future versions? – configurator Sep 29 '11 at 15:33
@configurator: I personally would love to, but I suspect that it would probably break too much existing code to eliminate it entirely. Adding a warning is not a bad idea. – Eric Lippert Sep 29 '11 at 15:39
I don't know how much code it would break. Perhaps it could be a compiler flag "use unsafe array covariance" for old projects that have been upgraded. I know, flags are evil, but if it's mentioned in the compiler error e.g. "Error: You can't do that! You have to set that flag first" it isn't so bad. I've never personally seen .Net 2+ code use array covariance which is why I suspect it wouldn't break too many projects - and those that it does break could always keep the unsafe functionality. – configurator Sep 29 '11 at 17:09
@configurator: Right. The real problem though is that the code which gets the warning -- the code which actually converts the `string[]` to `object[]` -- is not the code that is going to crash and die horribly later. That could be code written by someone else entirely, in a different assembly, when it attempts to write into an array of the wrong type. Also, though adding a warning might prevent someone from doing this dangerous-to-others thing, it does not in any way mitigate the runtime cost of type checking imposed on every array write to an array of unsealed ref type. – Eric Lippert Sep 29 '11 at 17:25
Perhaps DLLs compiled with no unsafe array casts could be marked in some way that the jitter would recognize, and it would avoid those type casts iff all currently loaded DLLs have that flag? – configurator Sep 29 '11 at 17:33
The array-covariance issue could best be solved, IMHO, by creating an abstract ReadableArray base type, from which both Array and ImmutableArray would derive. ImmutableArray could have a few constructors or factory methods to supply the data that would remain in the array forevermore. ReadableArray and ImmutableArray could both support covariance without difficulty. A ReadonlyArrayWrapper type would be nice as well, but would probably impair runtime performance and thus might not be worthwhile. – supercat Sep 29 '11 at 23:48
If one wants to allow array sorting in a type-safe way, one could add a covariant RearrangeableArray between ReadableArray and Array. A RearrangeableArray would not allow values to be stored into elements, but would include methods to rearrange elements (e.g. SwapAt(loc1, loc2)). Thus, even though the array would be mutable, there would be no problem passing an Array to a routine expecting a RearrangeableArray. – supercat Sep 29 '11 at 23:57

score 0 · Answer 3 · answered Sep 28 '11 at 19:16

I think Jon Skeet explained it rather well; however, if you need that "aha" moment, consider how generics work.

A generic class like List<> is, for most purposes, treated externally as a normal class. For example, when you write List<string>, the compiler effectively sees a distinct type, call it ListString (a list that contains strings), and it can't be smart enough to convert a ListString to a ListObject by casting the items of its internal collection.

From reading an MSDN blog post, it appears that covariance and contravariance are supported in .NET 4.0 when working with delegates and interfaces. It also mentions Eric's articles in the second sentence.
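A small sketch of the difference, assuming the Animal and Cat classes from the question (the GenericVarianceDemo class and its method names are made up for illustration). The generic class List<T> is invariant, but the generic interface IEnumerable<T> is covariant from C# 4.0 / .NET 4.0 onwards, so a List<Cat> can be read through an IEnumerable<Animal>:

```csharp
using System;
using System.Collections.Generic;

class Animal { }
class Cat : Animal { }

// Hypothetical demo class; the names are illustrative only.
static class GenericVarianceDemo
{
    // Cat[] can be used where Animal[] is expected.
    public static bool ArraysAreCovariant()
    {
        return typeof(Animal[]).IsAssignableFrom(typeof(Cat[]));
    }

    // List<Cat> can NOT be used where List<Animal> is expected.
    public static bool ListIsInvariant()
    {
        return !typeof(List<Animal>).IsAssignableFrom(typeof(List<Cat>));
    }

    // IEnumerable<out T> is covariant, so this assignment compiles.
    public static int CountViaCovariantInterface()
    {
        IEnumerable<Animal> animals = new List<Cat> { new Cat(), new Cat() };
        int count = 0;
        foreach (Animal animal in animals)
        {
            count++;
        }
        return count;
    }

    static void Main()
    {
        Console.WriteLine(ArraysAreCovariant());          // True
        Console.WriteLine(ListIsInvariant());             // True
        Console.WriteLine(CountViaCovariantInterface());  // 2
    }
}
```

The interface version is safe because IEnumerable<T> only hands elements out; there is no way to put a Turtle in through it, so no runtime store check is needed.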
