Every non-array non-string object stored on the heap contains an 8- or 16-byte header (sizes for 32/64-bit systems), followed by the contents of that object's public and private fields. Arrays and strings have the above header, plus some more bytes defining the length of the array and size of each element (and possibly the number of dimensions, length of each extra dimension, etc.), followed by all of the fields of the first element, then all the fields of the second, etc. Given a reference to an object, the system can easily examine the header and determine what type it is.
Reference-type storage locations hold a four- or eight-byte value which uniquely identifies an object stored on the heap. In present implementations, that value is a pointer, but it's easier (and semantically equivalent) to think of it as an "object ID".
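As a small illustration of the "object ID" view, here is a sketch in C#. The class and method names (ReferenceDemo, SameObject, RuntimeTypeName) are made up for this example; only ReferenceEquals and GetType are real framework calls.

```csharp
using System;
using System.Text;

static class ReferenceDemo
{
    // True when both storage locations hold the same "object ID",
    // i.e. they identify the very same heap object.
    public static bool SameObject(object a, object b) => ReferenceEquals(a, b);

    // The variable is declared as 'object', so the concrete type has to be
    // recovered from the heap object itself (its header).
    public static string RuntimeTypeName(object o) => o.GetType().Name;
}
```

Given `var sb = new StringBuilder(); object alias = sb;`, SameObject(sb, alias) is true, and RuntimeTypeName(alias) reports StringBuilder even though alias is declared as object.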
Value-type storage locations hold the contents of the value type's fields, but do not have any associated header. If code declares a variable of type Int32, there's no need to store information with that Int32 saying what it is. The fact that the location holds an Int32 is effectively stored as part of the program, so it doesn't have to be stored in the location itself. This can represent a big savings if, e.g., one has a million objects, each of which has a field of type Int32. Each of the objects holding the Int32 has a header which identifies the class whose code can operate on it. Since one copy of that class code can operate on any of the million instances, having the fact that the field is an Int32 be part of the code is much more efficient than having the storage for every one of those fields include information about what it is.
Boxing is necessary when a request is made to pass the contents of a value-type storage location to code which doesn't know to expect that particular value type. Code which expects objects of unknown type can accept a reference to an object stored on the heap. Since every object stored on the heap has a header identifying what type of object it is, code can use that header whenever it's necessary to use an object in a way which would require knowing its type.
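A minimal C# sketch of that situation follows. BoxingDemo, Describe, and BoxesAreDistinct are hypothetical names chosen for illustration; the point is only that assigning an Int32 to an object location copies it into a new heap object that carries a header.

```csharp
using System;

static class BoxingDemo
{
    // Code that expects "some object" of unknown type. It can only learn
    // the type by asking the heap object, which is why a bare Int32 must
    // be boxed before it can be passed here.
    public static string Describe(object o) => o.GetType().Name + ": " + o;

    public static bool BoxesAreDistinct(int value)
    {
        object first = value;   // boxing: the value is copied into a new heap object
        object second = value;  // a second, separate box holding the same value
        return !ReferenceEquals(first, second);
    }

    // Unboxing copies the value back out of the heap object.
    public static int RoundTrip(int value)
    {
        object boxed = value;
        return (int)boxed;
    }
}
```

Calling Describe(42) boxes the 42 at the call site and yields "Int32: 42". BoxesAreDistinct returns true for any input, since each boxing operation creates its own heap object.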
Note that in .NET, it is possible to declare what are called generic classes and methods. Each such declaration automatically generates a family of classes or methods which are identical except for the type of object upon which they expect to act. If one passes an Int32 to a routine DoSomething<T>(T param), that will automatically generate a version of the routine in which every instance of type T is effectively replaced with Int32. That version of the routine will know that every storage location declared as type T holds an Int32, so just as in the case where a routine was hard-coded to use an Int32 storage location, it will not be necessary to store type information with those locations themselves.
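The difference can be sketched in C# as follows. The string return type on DoSomething, the DoSomethingBoxed counterpart, and the GenericDemo class are assumptions added for illustration; the original only names the signature DoSomething<T>(T param).

```csharp
using System;

static class GenericDemo
{
    // The type comes from the generic instantiation (typeof(T)), not from
    // any header stored alongside the value.
    public static string DoSomething<T>(T param) => typeof(T).Name;

    // The non-generic counterpart has to box a value type and then read
    // the type from the heap object.
    public static string DoSomethingBoxed(object param) => param.GetType().Name;
}
```

DoSomething(5) infers T as Int32 and returns "Int32" without boxing the argument. By contrast, DoSomething<object>(5) returns "Object": the storage location is now of type object, the 5 is boxed, and only the heap object's header (as read by DoSomethingBoxed(5), which returns "Int32") still records what it really is.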