12

Here is a fairly simple generic class. The generic parameter is constrained to be a reference type. IRepository and DbSet carry the same constraint.

public class Repository<TEntity> : IRepository<TEntity>
    where TEntity : class, IEntity
{
    protected readonly DbSet<TEntity> _dbSet;
    public void Insert(TEntity entity)
    {
        if (entity == null)
            throw new ArgumentNullException("entity", "Cannot add null entity.");
        _dbSet.Add(entity);
    }
}

The compiled IL contains a box instruction. Here is the release build (the debug build contains it as well).

.method public hidebysig newslot virtual final 
    instance void  Insert(!TEntity entity) cil managed
{
  // Code size       38 (0x26)
  .maxstack  8
  IL_0000:  ldarg.1
  >>>IL_0001:  box        !TEntity
  IL_0006:  brtrue.s   IL_0018
  IL_0008:  ldstr      "entity"
  IL_000d:  ldstr      "Cannot add null entity."
  IL_0012:  newobj     instance void [mscorlib]System.ArgumentNullException::.ctor(string,
                                           string)
  IL_0017:  throw
  IL_0018:  ldarg.0
  IL_0019:  ldfld      class [EntityFramework]System.Data.Entity.DbSet`1<!0> class Repository`1<!TEntity>::_dbSet
  IL_001e:  ldarg.1
  IL_001f:  callvirt   instance !0 class [EntityFramework]System.Data.Entity.DbSet`1<!TEntity>::Add(!0)
  IL_0024:  pop
  IL_0025:  ret
} // end of method Repository`1::Insert

UPDATE:

With object.Equals(entity, default(TEntity)) it looks even worse:

  .maxstack  2
  .locals init ([0] !TEntity CS$0$0000)
  IL_0000:  ldarg.1
  >>>IL_0001:  box        !TEntity
  IL_0006:  ldloca.s   CS$0$0000
  IL_0008:  initobj    !TEntity
  IL_000e:  ldloc.0
  >>>IL_000f:  box        !TEntity
  IL_0014:  call       bool [mscorlib]System.Object::Equals(object,
                                object)
  IL_0019:  brfalse.s  IL_002b

UPDATE2:

For those who are interested, here is the code compiled by the JIT, as shown in the debugger:

0cd5af28 55              push    ebp
0cd5af29 8bec            mov     ebp,esp
0cd5af2b 83ec18          sub     esp,18h
0cd5af2e 33c0            xor     eax,eax
0cd5af30 8945f0          mov     dword ptr [ebp-10h],eax
0cd5af33 8945ec          mov     dword ptr [ebp-14h],eax
0cd5af36 8945e8          mov     dword ptr [ebp-18h],eax
0cd5af39 894df8          mov     dword ptr [ebp-8],ecx
    //entity reference to [ebp-0Ch]
0cd5af3c 8955f4          mov     dword ptr [ebp-0Ch],edx
    //some debugger checks
0cd5af3f 833d9424760300  cmp     dword ptr ds:[3762494h],0
0cd5af46 7405            je      0cd5af4d  Branch
0cd5af48 e8e1cac25a      call    clr!JIT_DbgIsJustMyCode (67987a2e)
0cd5af4d c745fc00000000  mov     dword ptr [ebp-4],0
0cd5af54 90              nop

    //comparison of entity ref with zero
0cd5af55 837df400        cmp     dword ptr [ebp-0Ch],0
0cd5af59 0f95c0          setne   al
0cd5af5c 0fb6c0          movzx   eax,al
0cd5af5f 8945fc          mov     dword ptr [ebp-4],eax
0cd5af62 837dfc00        cmp     dword ptr [ebp-4],0
    //if not zero, jump further
0cd5af66 7542            jne     0cd5afaa  Branch
    //throwing exception here      

The reason for this question is that NDepend warns about boxing/unboxing. I was curious why it found boxing in some generic classes, and now it's clear.
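For anyone who wants to reproduce this without Entity Framework, here is a minimal sketch. The names IEntity, Customer and InMemoryRepository are hypothetical stand-ins, and a List<TEntity> replaces DbSet<TEntity>; only the null check is taken from the class above. Compiling it and opening the assembly in ildasm should let you inspect the IL of Insert.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the IEntity interface from the question.
public interface IEntity
{
    int Id { get; }
}

// Hypothetical entity type used only to close the generic.
public sealed class Customer : IEntity
{
    public int Id { get; set; }
}

public class InMemoryRepository<TEntity>
    where TEntity : class, IEntity
{
    // Assumption: a plain list stands in for DbSet<TEntity>.
    private readonly List<TEntity> _items = new List<TEntity>();

    public int Count
    {
        get { return _items.Count; }
    }

    public void Insert(TEntity entity)
    {
        // Same null check as in the question; this is the comparison
        // that showed up as `box !TEntity` in the IL above.
        if (entity == null)
            throw new ArgumentNullException("entity", "Cannot add null entity.");
        _items.Add(entity);
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new InMemoryRepository<Customer>();
        repository.Insert(new Customer { Id = 1 });
        Console.WriteLine(repository.Count); // 1

        try
        {
            repository.Insert(null);
        }
        catch (ArgumentNullException ex)
        {
            Console.WriteLine(ex.ParamName); // entity
        }
    }
}
```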

mikalai
  • Is it the same if you use object.Equals(entity, default(TEntity))? – Konrad Kokosa Dec 19 '13 at 14:06
  • I'm gonna take a stab in the dark here. IIRC, interfaces _can_ cause boxing of value types, so I wonder if that has something to do with it. Perhaps the compiler thinks it must box it in case a value-type is fed in for it to check against the `null` reference. I might be full of it though. :) EDIT: this is despite the constraint that it's a `class` (not a value type) but I'm not certain whether or not that's considered by the compiler and/or CLR and/or IL. – Chris Sinclair Dec 19 '13 at 14:10
  • I would also suspect that both `object.Equals` and the `==` operator are so _general_ that they box by default. Try `EqualityComparer<T>.Default.Equals(entity, default(T))`, as it should use a generic method. – Konrad Kokosa Dec 19 '13 at 14:13
  • Similar question here, though I'm not sure if it answers your question: http://stackoverflow.com/questions/1400414/why-does-generic-method-with-constaint-of-t-class-result-in-boxing – Chris Sinclair Dec 19 '13 at 14:19
  • @KonradKokosa - with `entity == default(T)` it's the same as the first case. With `object.Equals(entity, default(TEntity))` it looks even worse... – mikalai Dec 19 '13 at 14:21
  • Doing some simplified tests, I get the suspicion that this is a case of the `class` constraint not having a bearing on how the compiled IL is constructed. It assumes that `TEntity` _could_ be a value type, and when `== null` is applied (so really, `Object.ReferenceEquals`) it needs to box it to `object`. I _suspect_ that when the method is JITed based on the actual closed generic type, the box and/or null check might be optimized though I don't have the tools with me now to check this. Just a guess. – Chris Sinclair Dec 19 '13 at 14:24
  • @mikalai and what in case of EqualityComparer<>.Default? – Konrad Kokosa Dec 19 '13 at 14:24
  • @KonradKokosa: Using the `EqualityComparer<>.Default` produces similar code but without the boxing instruction. Not sure if this will _still_ box later on or not during the rest of its implementation; that is, buries the box in the implementation. – Chris Sinclair Dec 19 '13 at 14:26
  • @KonradKokosa - well... using EqualityComparer removed box. It's clear - no check for null is performed and entity just goes as an argument to Equals. But I also checked IL for GenericEqualityComparer, which (surprise) boxes both arguments. I'm totally lost now) – mikalai Dec 19 '13 at 14:26
  • @ChrisSinclair - thank you. Looks like constraint isn't respected, and that's strange. csc generates the code and should be aware of it. – mikalai Dec 19 '13 at 14:29
  • @mikalai, interesting findings then but all this seems to be irrelevant due to romkyns answer. So it simply seems that compiler boxes by default to not worry about this topic. – Konrad Kokosa Dec 19 '13 at 14:30
  • The box instruction is generated because the CLR *requires* that a box instruction be generated. – Eric Lippert Dec 19 '13 at 16:16

2 Answers

17

I ran into a very relevant comment when reviewing the C# compiler source code that generates BOX instructions. The fncbind.cpp source file has this comment, not otherwise directly related to this particular code:

// NOTE: for the flags, we have to use EXF_FORCE_UNBOX (not EXF_REFCHECK) even when
// we know that the type is a reference type. The verifier expects all code for
// type parameters to behave as if the type parameter is a value type.
// The jitter should be smart about it....

So it is there because the verifier requires it.

And yes, the jitter is smart about it. When the type argument is a reference type, it simply emits no code at all for the BOX instruction.

Hans Passant
  • That's quite interesting: so it was decided to keep the IL approach simple, since (probably) the JIT compiler already had a check for boxing a reference type. – mikalai Dec 19 '13 at 19:55
  • Ah, good find. I tried really hard to interpret the ECMA spec this way, but it just didn't look like the spec requires this. Relevant quote: _"the type tracked by verification is always “boxed” `typeTok` for generic parameters, regardless of whether the actual type at runtime is a value or reference type."_ – Roman Starkov Dec 20 '13 at 12:16
12

The ECMA spec states this about the box instruction:

Stack transition: ..., val -> ..., obj

...

If typeTok is a generic parameter, the behavior of box instruction depends on the actual type at runtime. If this type [...] is a reference type then val is not changed.

What it's saying is that the compiler can assume that it's safe to box a reference type. So with generics, the compiler has two choices: emit the code that is guaranteed to work regardless of how the generic type is constrained, or optimize the code and omit redundant instructions where it can prove them to be unnecessary.
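The three cases the spec describes can be observed from C#. This is only a sketch: BoxDemo and Box are hypothetical names, and it relies on the cast of a generic-typed value to object being compiled to a box instruction, as in the IL from the question.

```csharp
using System;

public static class BoxDemo
{
    // Converting a generic-typed value to object goes through `box`.
    public static object Box<T>(T value)
    {
        return (object)value;
    }

    public static void Main()
    {
        // Reference type: box leaves the reference unchanged.
        var text = new string('x', 3);
        Console.WriteLine(object.ReferenceEquals(text, Box(text))); // True

        // Value type: each box allocates a separate object.
        int number = 42;
        Console.WriteLine(object.ReferenceEquals(Box(number), Box(number))); // False

        // Nullable<T> without a value boxes to a null reference.
        int? missing = null;
        Console.WriteLine(Box(missing) == null); // True
    }
}
```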

The Microsoft C# compiler, in general, tends to choose the simpler approach and leave all optimization to the JIT stage. To me, it looks like your example is exactly that: not optimizing something because implementing an optimization takes time, and saving this box instruction probably has very little value in practice.

C# allows even an unconstrained generic-typed value to be compared to null, so the compiler must support this general case. The easiest way to implement it is to use the box instruction, which does all the heavy lifting of handling reference, value and nullable types, correctly pushing either a reference or a null value onto the stack. So the easiest thing for the compiler to do is to emit box regardless of the constraints, and then test the result for null (brtrue).
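As an illustration of that general case, here is a small sketch of a null comparison on an unconstrained type parameter. NullCheckDemo and IsNull are hypothetical names; the point is only that a single method body has to give the right answer for reference, value and nullable types.

```csharp
using System;

public static class NullCheckDemo
{
    // No constraint on T: the same comparison must work for every kind of type.
    public static bool IsNull<T>(T value)
    {
        return value == null;
    }

    public static void Main()
    {
        string noText = null;
        int? noNumber = null;

        Console.WriteLine(IsNull(noText));   // True  (null reference)
        Console.WriteLine(IsNull("text"));   // False (non-null reference)
        Console.WriteLine(IsNull(42));       // False (a value type is never null)
        Console.WriteLine(IsNull(noNumber)); // True  (nullable without a value)
        Console.WriteLine(IsNull((int?)5));  // False (nullable with a value)
    }
}
```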

Roman Starkov
  • Do note Hans's answer, which sounds like the ultimate reason for the instruction being there and should probably be the accepted answer. PEVerify is like the authority on what is correct IL, and if PEVerify rejects something for whatever reason then it's as good as invalid (important in e.g. low trust scenarios). – Roman Starkov Aug 12 '15 at 21:23