How is ValueType.GetType() able to determine the type of the struct?

Question

For a reference type, the object's memory layout is

| Type Object pointer|
|    Sync Block      |
|  Instance fields...|

For a value type, the object layout seems to be

|  Instance fields...|

For a reference type, GetType() just follows the 'Type Object pointer'. All objects of a given reference type point to the same type object (which also holds the method table).

For a value type, this pointer isn't available. So how does GetType() work?

I checked with Google and found this snippet, which is a bit hazy. Can someone elaborate?

The solution is that the location in which a value is stored may only store values of a certain type. This is guaranteed by the verifier. Source

+1 This is a *very* good question, by the way. – Andrew Hare May 29 '09 at 17:16

Andrew Hare · Accepted Answer · 2009-05-29T17:15:28.497

25

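Calling GetType() on a value type boxes the value. Boxing copies it into an object on the heap, and that object, like any reference type instance, carries a pointer to its type object.

If you wish to avoid boxing, you can call GetTypeCode(), which returns a TypeCode enumeration value identifying the type without boxing the value. Note that this is only available on types that provide GetTypeCode() (the built-in primitives and other IConvertible implementers), not on arbitrary structs.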
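A minimal sketch of the GetTypeCode() route, assuming built-in primitives such as int and long. The class and method names (TypeCodeDemo, CodeOf) are made up for illustration.

```csharp
using System;

static class TypeCodeDemo
{
    // Hypothetical helper: returns the TypeCode of an int
    // without going through Object.GetType().
    public static TypeCode CodeOf(int value)
    {
        // Int32 declares GetTypeCode() itself, so this is a direct call on the value.
        return value.GetTypeCode();
    }

    static void Main()
    {
        Console.WriteLine(CodeOf(34));          // Int32
        Console.WriteLine(34L.GetTypeCode());   // Int64
    }
}
```

TypeCode only distinguishes a fixed set of well-known types, so it is not a general replacement for GetType().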

Here is an example showing the boxing that takes place:

C#:

class Program
{
    static void Main()
    {
        34.GetType();
    }
}

IL for Main():

.method private hidebysig static void Main() cil managed
{
        .entrypoint
        .maxstack 8
        L_0000: ldc.i4.s 0x22
        L_0002: box int32
        L_0007: call instance class [mscorlib]System.Type [mscorlib]System.Object::GetType()
        L_000c: pop 
        L_000d: ret 
}

Edit: To show what the compiler is doing, let's change the type of the literal like this:

class Program
{
    static void Main()
    {
        34L.GetType();
    }
}

By adding the "L" suffix to the literal, I am telling the compiler to treat it as a System.Int64. The compiler sees this, and the box instruction it emits now looks like this:

.method private hidebysig static void Main() cil managed
{
        .entrypoint
        .maxstack 8
        L_0000: ldc.i4.s 0x22
        L_0002: conv.i8 
        L_0003: box int64
        L_0008: call instance class [mscorlib]System.Type [mscorlib]System.Object::GetType()
        L_000d: pop 
        L_000e: ret 
}

As you can see, the compiler has done the hard work of determining the correct instructions to emit. After that, it is up to the CLR to execute them.
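The same thing holds for user-defined structs, not just literals. Here is a small sketch with two hypothetical structs (StructA and StructB are made-up names) whose instance data is identical. The bytes cannot tell them apart, yet GetType() gives the right answer, because the compiler picks the type to box from the static type of the variable.

```csharp
using System;

// Two illustrative structs with the same layout: a single int field.
struct StructA { public int X; }
struct StructB { public int X; }

static class BoxingDemo
{
    public static Type TypeOfA()
    {
        StructA a = new StructA { X = 1 };
        return a.GetType();   // compiler boxes as StructA, then calls Object.GetType()
    }

    public static Type TypeOfB()
    {
        StructB b = new StructB { X = 1 };
        return b.GetType();   // same bytes, but boxed as StructB
    }

    static void Main()
    {
        Console.WriteLine(TypeOfA() == typeof(StructA));   // True
        Console.WriteLine(TypeOfB() == typeof(StructB));   // True

        // Once boxed, the heap object carries its own type pointer.
        object boxed = new StructA { X = 1 };
        Console.WriteLine(boxed.GetType() == typeof(StructA));   // True
    }
}
```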

edited May 29 '09 at 17:15

answered May 29 '09 at 14:49

Andrew Hare


1

But how does it know how to correctly set the Type Object pointer for the boxed object? All value types would look like a byte stream. e.g. a structA with 1 int followed by another structB with 2 ints would look like 3 ints. The struct variable points to the beginning of the first instance field.. so how does it determine the Type information? – Gishu May 29 '09 at 14:55
1

Please see the example I posted - notice that the "box" instruction is given the type that is being boxed (provided by the compiler's support for implicit boxing). This type is used to set the correct object type pointer. – Andrew Hare May 29 '09 at 14:58
1

Let's say int a = 34; now a refers to just an integer in memory. Nothing else. Now how does the CLR know that when I box a, I need to pass in [int32] as the parameter to box? I hope I am being clear. How does the CLR map a particular chunk of memory to a value type object? – Gishu May 29 '09 at 15:07
@Gishu: The CLR doesn't do this - the compiler does. When the compiler boxes the int, it puts the instruction in as "box int32". This is always known at compile time, since value types cannot be inherited. The CLR just processes "box int32" on the object in the stack. – Reed Copsey May 29 '09 at 15:28
The CLR knows to pass "int32" because the compiler emits the instruction to do so. The compiler is smart enough to know the type of the value type as "int32" and as such emits the "box int32" instruction. – Andrew Hare May 29 '09 at 15:30
1

You've answered the posted question as in the title - although it's a specific case which involves boxing.. my underlying question was more of how is the type and method table determined for a value type if the object instance itself doesn't contain any type-identifiers. Found my answers later.. posted in this thread. Thanks. – Gishu Jun 01 '09 at 08:32
Wicked, that is all I have to say. – John Leidegren Aug 17 '10 at 16:41
Why aren't calls like `int foo; foo.GetType();` simply replaced by `typeof(int)` at compile time (since struct inheritance is not possible)? – tigrou Oct 06 '15 at 11:08
Gishu, I quite agree with you. I asked just the same chain of questions and did not get a clear answer. – Anton Lyhin Feb 05 '16 at 22:31
@tigrou Interestingly, that optimization can't work for nullables, because of their boxing behavior. `int? a = 0; a.GetType()` yields `typeof(int)`, not `typeof(int?)`. `int? a = null; a.GetType()` is a runtime error. If such an optimization were implemented, it would specifically have to exclude `Nullable<>`. – cdhowie Sep 08 '17 at 14:19
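The nullable behavior described in the last comment can be checked with a short sketch. The names NullableDemo and TypeOfNullable are made up for illustration.

```csharp
using System;

static class NullableDemo
{
    // GetType() boxes the nullable: a value boxes as the underlying type,
    // and an empty nullable boxes to a null reference.
    public static Type TypeOfNullable(int? value)
    {
        return value.GetType();
    }

    static void Main()
    {
        Console.WriteLine(TypeOfNullable(0) == typeof(int));    // True
        Console.WriteLine(TypeOfNullable(0) == typeof(int?));   // False

        try
        {
            TypeOfNullable(null);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException");
        }
    }
}
```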

Gishu · Answer 2 · 2009-06-01T13:44:54.507

5

Maybe Andrew H. took this as obvious and tried hard to make me understand (+1). My light-bulb moment came from Jon Skeet, again, this time via his book, which I happened to be reading right around the region where the answers lay.

1. C# is statically typed. Each variable has a type, and it is known at compile time.
2. Value types can't be inherited. As a result, value type (VT) objects don't need to carry around extra type information (as opposed to reference type objects, each of which has an object type header, since the variable type and the value/object type may differ).

Consider the snippet below. Although the variable type is BaseRefType, it points to an object of a more specialized type. For value types, since inheritance is outlawed, the variable type is the object's type.

BaseRefType r = new DerivedRefType(); 
ValueType v = new ValueType();

My missing piece was bullet#1.
<Snipped after J.Skeet's comment since it seems to be wrong>. There seems to be some magic that lets the compiler/runtime know the 'type of the variable' given any arbitrary variable. So the runtime somehow knows that ob is of MyStruct type, even though the VT object itself has no type information.

MyStruct ob = new MyStruct();
ob.WhoAmI();                          // no box ; defined in MyStruct
Console.WriteLine(ob.GetHashCode());  // no box ; overridden in ValueType
Console.WriteLine( ob.GetType() );    // box ; implemented in Object

Due to this, I am able to invoke methods defined in MyStruct (and ValueType for some reason) without boxing to a RefType.
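The snippet above does not show MyStruct itself. Here is a minimal sketch, assuming a hypothetical definition with a WhoAmI() method and its own GetHashCode() override (the Value field and the CallDemo class are also made up). One caveat, as far as I understand ECMA-335: the constrained prefix only avoids the box at run time when the struct itself overrides the virtual method. If the implementation is inherited from ValueType or Object, the runtime still boxes the value before the call, even though no box instruction appears in the IL.

```csharp
using System;

struct MyStruct
{
    public int Value;

    // Plain instance method: called directly on the value, no boxing.
    public string WhoAmI() { return "MyStruct"; }

    // Overridden on the struct itself, so a constrained call can reach it
    // without boxing.
    public override int GetHashCode() { return Value; }
}

static class CallDemo
{
    public static string Describe()
    {
        MyStruct ob = new MyStruct { Value = 7 };
        string name = ob.WhoAmI();      // direct call
        int hash = ob.GetHashCode();    // constrained call to the override
        Type type = ob.GetType();       // box MyStruct, then Object.GetType()
        return name + " " + hash + " " + type.Name;
    }

    static void Main()
    {
        Console.WriteLine(Describe());  // MyStruct 7 MyStruct
    }
}
```

Compiling this and inspecting the IL for Describe() should show the difference between the three calls.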

edited Jun 01 '09 at 13:44

answered Jun 01 '09 at 08:23

Gishu


2

I can't say I'm an expert on the exact details here, but I suggest you look at the IL spec (ECMA-335) and in particular the "constrained" op code. That's what allows the virtual methods to be called without boxing, I believe. (Compile code which calls GetHashCode() vs GetType()). Having said that, your idea about it being two pointers is also wrong. The value of r is a reference to the object, and the object itself has a reference to its actual type. The value of v is just the value of the variable, but the compiler and runtime know the exact type anyway (due to lack of value type inheritance) – Jon Skeet Jun 01 '09 at 13:02
Yes, I did see a constrained instruction in the IL for the second call (GetHashCode). So where is the type of a variable stored, so that the compiler and runtime can determine it, given an arbitrary variable which just points to an RT object on the managed heap or to the beginning of a VT object? – Gishu Jun 01 '09 at 13:10
So did you find an answer to your question of where the type of the variable is stored? – waheed Feb 01 '10 at 22:12
1

@waheed: The type of the variable is known to the compiler (stored in whatever internal datastructure it used to house identifiers/symbols). The reason that RefType objects have to carry around their types is 'Base b = new Derived()'. Even though the var is of type Base, the object is of a diff type Derived. This is not the case for value types, since inheritance is disallowed. ValType v will always be an object of ValType. No need to carry that information around with every object. It is 'written in stone' at compile time. – Gishu Feb 02 '10 at 04:43
1

It may be somewhat confusing to use `ValueType` in the examples, because `System.ValueType` is actually a reference type. – Eldar Mar 17 '16 at 11:28
