5

I ran into this in some EF/DB code today, and I'm ashamed to say I'd never encountered it before.

In .NET you can explicitly cast between types, e.g.

int x = 5;
long y = (long)x;

And you can box to object and unbox back to the original type:

int x = 5; 
object y = x;
int z = (int)y;

But you can't unbox directly to a different type, even one that the original type can be explicitly cast to:

int x = 5;
object y = x;
long z = (long)y; // compiles, but throws InvalidCastException at run time

This is documented behaviour, although I had never run into it until today. https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/types/boxing-and-unboxing#example-1

For the unboxing of value types to succeed at run time, the item being unboxed must be a reference to an object that was previously created by boxing an instance of that value type. Attempting to unbox null causes a NullReferenceException. Attempting to unbox a reference to an incompatible value type causes an InvalidCastException.
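To make the quoted rules concrete, here is a minimal sketch of the three outcomes (success, InvalidCastException, NullReferenceException). The class and method names (`UnboxingDemo`, `UnboxThenWiden`, `TryDirectUnbox`) are made up for illustration and are not part of any library.

```csharp
using System;

public static class UnboxingDemo
{
    // Unbox to the exact boxed type (int) first, then convert the int to long.
    public static long UnboxThenWiden(object boxed) => (long)(int)boxed;

    // Attempts a direct unbox to long and reports which exception, if any, was thrown.
    public static string TryDirectUnbox(object boxed)
    {
        try
        {
            long value = (long)boxed;
            return "none";
        }
        catch (Exception ex) when (ex is InvalidCastException || ex is NullReferenceException)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main()
    {
        object boxedInt = 5;
        Console.WriteLine(UnboxThenWiden(boxedInt)); // 5
        Console.WriteLine(TryDirectUnbox(boxedInt)); // InvalidCastException
        Console.WriteLine(TryDirectUnbox(null));     // NullReferenceException
        Console.WriteLine(TryDirectUnbox(5L));       // none (a boxed long unboxes to long)
    }
}
```

The two-step form `(long)(int)boxed` is the workaround suggested in the comments below: the inner cast is the unboxing, the outer cast is an ordinary value conversion.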

I'm curious: is there some technical reason why this isn't possible/supported by the runtime?

Eoin Campbell
  • Unboxing is *explicit*. You're trying to unbox directly to a `long`. To make this work you would need to unbox to `int` and then cast to `long`, like this: `(long)(int)y` – DavidG Sep 04 '19 at 14:39
  • Curiously, this is supported: `MyEnum e = MyEnum.Value1; object o = e; int i = (int)o;` – Mr Anderson Sep 04 '19 at 14:39
  • 2
    @MrAnderson because the underlying type of the enum is `int`. – René Vogt Sep 04 '19 at 14:39
  • Basically the C# designers deemed the ability to unbox and convert in one go to not be useful enough to implement. – juharr Sep 04 '19 at 14:40
  • 6
    https://ericlippert.com/2009/03/03/representation-and-identity/ – Mr Moose Sep 04 '19 at 14:40
  • https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/types/boxing-and-unboxing#unboxing – DavidG Sep 04 '19 at 14:40
  • 1
    In your example, these are not implicit casts, but explicit casts... You are explicitly telling the compiler the destination type. You can unbox the int and implicitly cast it to long, that works perfectly fine `int x = 5; object y = x; long z = (int)y;` – adjan Sep 04 '19 at 14:41
  • 2
    That's extra work for the people who wrote the compiler, to make it figure out that to convert from a boxed int to a long you need to first unbox and then widen. [There is no cheap feature](https://blogs.msdn.microsoft.com/ericlippert/2003/10/28/how-many-microsoft-employees-does-it-take-to-change-a-lightbulb/). – Sweeper Sep 04 '19 at 14:41
  • 4
    For this to work, the compiler would need to emit code that checks the type of the boxed value at run time and then decides what to do: `int` to `int` -> no problem, `int` to `long` -> widen the value, `int` to `double` -> change all the bits... and that would then need to happen for every single cast you do in your code. I guess that's a lot of extra code and a performance hit compared to the little extra work for the developer in the rare cases where you need an extra cast. – René Vogt Sep 04 '19 at 14:49
  • @DavidG/@Adrian yep, mixed up my implicit/explicits :-) fixed in the post, but my question still stands. As others pointed out, no features are free (I've read the ericlippert/ms lightbulb post before), but I figured this would be something that would be usable, and was curious if it was simply that, or if there was some other genuine technical reason. – Eoin Campbell Sep 04 '19 at 14:49
  • @MrMoose :-) thank you. something that directly addresses the issue. If you want to post that as the answer, I'll give you the checkmark :-) – Eoin Campbell Sep 04 '19 at 14:50
  • 4
    Did you properly re-read Eric's article? "This would be a huge amount of code to generate, and it would be very slow. The code is of course so large that you would want to put it in its own method and just generate a call to it... it’s available – you can always call `Convert.ToInt32`, which does all that analysis at runtime for you. We give you the choice between “fast and precise” or “slow and lax”, and the sensible default is the former. If you want the latter then call the method." – Damien_The_Unbeliever Sep 04 '19 at 14:53
  • 1
    There are 2 separate EricLippert posts above. One posted by sweeper re: the cost of implementing features and why certain features get left out (this is the one I read) and the other one by MrMoose which addresses this specific topic. – Eoin Campbell Sep 04 '19 at 14:58
  • https://stackoverflow.com/a/3953684/17034 – Hans Passant Sep 04 '19 at 15:08

2 Answers

1

The cast syntax (Foo)bar in C# does one of these things:

  • cast a reference type to another reference type
  • convert a value type to another value type
  • box a value type
  • unbox a boxed value type
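As a rough illustration, the four operations listed above can each be written with the same cast syntax. The wrapper class and method names here (`CastKinds`, `AsString`, `Widen`, `Box`, `Unbox`) are hypothetical and exist only to keep the four cases apart.

```csharp
using System;

public static class CastKinds
{
    // Reference cast: run-time type check, the same object comes back.
    public static string AsString(object o) => (string)o;

    // Value conversion: produces a new 64-bit value from a 32-bit one.
    public static long Widen(int i) => (long)i;

    // Boxing: copies the value into a new object on the heap.
    public static object Box(int i) => (object)i;

    // Unboxing: the boxed value must be exactly an int.
    public static int Unbox(object o) => (int)o;

    public static void Main()
    {
        object text = "hello";
        Console.WriteLine(ReferenceEquals(text, AsString(text))); // True
        Console.WriteLine(Widen(5));                              // 5
        Console.WriteLine(Box(5).GetType());                      // System.Int32
        Console.WriteLine(Unbox(Box(5)));                         // 5
    }
}
```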

These operations are semantically very different. It really makes more sense to think of them as four distinct operations which by historical accident happen to share the same (Foo)bar syntax. In particular they have different constraints on what information needs to be known at compile time:

  • an unboxing operation needs to know the type of the unboxed value
  • a value type conversion needs to know both the source and target types.

This is basically because the compiler needs to know at compile time how many bytes to allocate to the values. In your example, the information that the boxed value is an int is not available at compile time, which means neither the unboxing nor the conversion to a long can be compiled.

What is counter-intuitive here is that the same constraints do not apply to reference types. Indeed the whole point of casting reference types is that the compiler doesn't know the exact type at compile time. You use a cast when you know better than the compiler, and the compiler accepts that, and then at runtime performs a type check to ensure that the cast is valid.

This is possible due to some fundamental differences in reference types:

  • A reference type instance knows its own exact type at runtime. It is stored as part of the instance data.
  • Reference types are polymorphic, which means the compiler does not need to know the exact instance type. All references have the same size, so there is no ambiguity about how many bytes to allocate.
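A small sketch of the reference-type case described above: the downcast compiles regardless of the instance's actual type and is only checked at run time. `Animal`, `Dog`, `Cat` and `TryCastToDog` are hypothetical names used purely for illustration.

```csharp
using System;

public class Animal { }
public class Dog : Animal { }
public class Cat : Animal { }

public static class ReferenceCastDemo
{
    // The compiler accepts the downcast; the run-time type check decides the outcome.
    public static string TryCastToDog(Animal animal)
    {
        try
        {
            Dog dog = (Dog)animal;
            return ReferenceEquals(dog, animal) ? "same object" : "different object";
        }
        catch (InvalidCastException)
        {
            return "InvalidCastException";
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryCastToDog(new Dog())); // same object
        Console.WriteLine(TryCastToDog(new Cat())); // InvalidCastException
    }
}
```

A successful reference cast hands back the very same object, which is why no size or layout information is needed at compile time.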

These semantic differences between the different kinds of casts mean they cannot be merged without compromising safety.

Let's say C# supported unbox-and-convert in a single cast expression:

int x = 70000;
object y = x;
short z = (short)y;

Currently an unboxing cast indicates that you expect that the boxed value is of the given type. If this is not the case, an exception is thrown, so you discover the bug. But a value-type conversion using cast syntax indicates that you know the types are different and that the conversion may lead to data loss.

If the language automatically unboxed and converted, there would be no way to express that you wanted a safe unboxing without any risk of data loss.
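In today's C# the two intentions have to be spelled out separately, which can be sketched as follows. The method names (`UnboxThenTruncate`, `ConvertChecked`) are hypothetical; `Convert.ToInt16` is the run-time conversion route mentioned in the comments via `Convert.ToInt32`.

```csharp
using System;

public static class UnboxAndConvertDemo
{
    // Explicit two-step form: unbox to int, then knowingly truncate to 16 bits.
    public static short UnboxThenTruncate(object boxed) => unchecked((short)(int)boxed);

    // Run-time conversion: inspects the boxed type and throws OverflowException
    // when the value does not fit in a short.
    public static short ConvertChecked(object boxed) => Convert.ToInt16(boxed);

    public static void Main()
    {
        object y = 70000;
        Console.WriteLine(UnboxThenTruncate(y)); // 4464 (70000 - 65536)

        try
        {
            Console.WriteLine(ConvertChecked(y));
        }
        catch (OverflowException)
        {
            Console.WriteLine("OverflowException");
        }
    }
}
```

With the casts kept separate, `(short)y` alone still means "I expect a boxed short" and fails loudly on a boxed int, so the data-losing truncation only happens when it is written out on purpose.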

JacquesB
0

I don't know that I can summarise all of Eric Lippert's article on this issue succinctly, but a section of the article that I find specifically relevant to your question is the following:

"[there are] ... certain conversions that the C# compiler thinks of as representation-changing are actually seen by the CLR verifier as representation-preserving. For example, the conversion from int to uint is seen by the CLR as representation-preserving because the 32 bits of a signed integer can be reinterpreted as an unsigned integer without changing the bits. These cases can be subtle and complex, and often have an impact on covariance-related issues.

I’m also ignoring conversions involving generic type parameters which are not known at compile time to be reference or value types. There are special rules for classifying those which would be major digressions to get into.

Anyway, we can think of representation-preserving conversions on reference types as those conversions which preserve the identity of the object. When you cast a B to a D, you’re not doing anything to the existing object; you’re merely verifying that it is actually the type you say it is, and moving on. The identity of the object and the bits which represent the reference stay the same. But when you cast an int to a double, the resulting bits are very different."
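The distinction in the quote can be checked by looking at the raw bits. This is only a sketch; the class and method names (`RepresentationDemo`, `BitsOf`) are made up for illustration.

```csharp
using System;

public static class RepresentationDemo
{
    // Hex dump of the 32 bits of an int.
    public static string BitsOf(int value) => value.ToString("X8");

    // Hex dump of the 32 bits of a uint.
    public static string BitsOf(uint value) => value.ToString("X8");

    // Hex dump of the 64 bits of a double.
    public static string BitsOf(double value) =>
        BitConverter.DoubleToInt64Bits(value).ToString("X16");

    public static void Main()
    {
        int i = -1;
        uint u = unchecked((uint)i); // same 32 bits, reinterpreted as unsigned
        double d = i;                // different size and bit pattern

        Console.WriteLine(BitsOf(i)); // FFFFFFFF
        Console.WriteLine(BitsOf(u)); // FFFFFFFF
        Console.WriteLine(BitsOf(d)); // BFF0000000000000
    }
}
```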

Mr Moose
  • I would *personally* have quoted some of the section where he *specifically* addresses the issue of unboxing to mismatched types. Surely that goes more directly to the OPs question? – Damien_The_Unbeliever Sep 04 '19 at 17:00
  • No worries Eric. I know what you're saying Damien, and I'm not sure the excerpt I captured is representative of the "technical reason" as to why the unboxing to a type you can cast to is unsupported. However, if you feel there is a better (and hopefully more concise) answer, feel free to add it. I think there is enough information in the comments and answers here to guide future readers though. – Mr Moose Sep 04 '19 at 17:43