18

When you have code like the following:

static T GenericConstruct<T>() where T : new()
{
    return new T();
}

The C# compiler insists on emitting a call to Activator.CreateInstance, which is considerably slower than a direct constructor call.

I have the following workaround:

public static class ParameterlessConstructor<T>
    where T : new()
{
    public static T Create()
    {
        return _func();
    }

    private static Func<T> CreateFunc()
    {
        return Expression.Lambda<Func<T>>( Expression.New( typeof( T ) ) ).Compile();
    }

    private static Func<T> _func = CreateFunc();
}

// Example:
// Foo foo = ParameterlessConstructor<Foo>.Create();

But it doesn't make sense to me why this workaround should be necessary.
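To check the speed difference on your own machine, a rough timing harness along these lines may help. This is only a sketch: the class and method names (ConstructionBenchmark, ViaNew, CompileFactory, Measure) are hypothetical, and the results will depend heavily on the runtime version, build configuration and hardware.

```csharp
using System;
using System.Diagnostics;
using System.Linq.Expressions;

public static class ConstructionBenchmark
{
    // The plain generic form from the question.
    public static T ViaNew<T>() where T : new()
    {
        return new T();
    }

    // Builds the compiled-expression delegate the same way the workaround does.
    public static Func<T> CompileFactory<T>() where T : new()
    {
        return Expression.Lambda<Func<T>>(Expression.New(typeof(T))).Compile();
    }

    // Times `iterations` calls to `factory`, after one warm-up call
    // so that JIT compilation is not included in the measurement.
    public static TimeSpan Measure<T>(Func<T> factory, int iterations)
    {
        factory();
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            factory();
        }
        watch.Stop();
        return watch.Elapsed;
    }
}
```

Passing `ConstructionBenchmark.ViaNew<Foo>` and `ConstructionBenchmark.CompileFactory<Foo>()` to `Measure` with the same iteration count gives two timings to compare. Run it as a release build outside the debugger, otherwise the numbers mean little.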

Nathan W
user46267
  • I noticed the same thing... but I don't know why. – Chuck Conway Dec 15 '08 at 06:27
  • I am using snippet compiler & the compiler doesn't throw any error. Also, the constructor is called when new T() is called. – shahkalpesh Dec 15 '08 at 06:42
  • 1
    @shahkalpesh: No-one said there'd be an error. The point is that Activator.CreateInstance is slower than the delegate form. – Jon Skeet Dec 15 '08 at 06:56
  • @Jon: Is it at the IL level, the call to Activator.CreateInstance inserted? If so, I did not get it from the question. – shahkalpesh Dec 15 '08 at 07:07
  • @shahkalpesh: Yes. Run Reflector or ildasm over code using new T() (with a new() constraint, not a struct constraint) and you'll see it. – Jon Skeet Dec 15 '08 at 08:18
  • BTW All VB.NET compilers I can test always produce the `Activator::CreateInstance` call for general, `class` and `structure` constraints. – Mark Hurd May 16 '14 at 01:22

5 Answers

9

I suspect it's a JITting problem. Currently, the JIT reuses the same generated code for all reference type arguments - so a List<string>'s vtable points to the same machine code as that of List<Stream>. That wouldn't work if each new T() call had to be resolved in the JITted code.

Just a guess, but it makes a certain amount of sense.

One interesting little point: in neither case does the parameterless constructor of a value type get called, if there is one (which is vanishingly rare). See my recent blog post for details. I don't know whether there's any way of forcing it in expression trees.

nawfal
Jon Skeet
8

This is likely because it is not clear whether T is a value type or a reference type. Creating these two kinds of type in a non-generic scenario produces very different IL. In the face of this ambiguity, C# is forced to use a universal method of type creation. Activator.CreateInstance fits the bill.

Quick experimentation appears to support this idea. If you type in the following code and examine the IL, it will use initobj instead of CreateInstance because there is no ambiguity on the type.

static void Create<T>()
    where T : struct
{
    var x = new T();
    Console.WriteLine(x.ToString());
}

Switching it to a class and new() constraint though still forces an Activator.CreateInstance.
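For comparison, here is a minimal sketch of the reference-type variant described above. The names ConstraintDemo and CreateRef are hypothetical. Which instruction a given compiler emits for it is best confirmed by inspecting the output with ildasm or a similar tool.

```csharp
using System;

public static class ConstraintDemo
{
    // Hypothetical counterpart to Create<T>: the same "new T()" body, but T is
    // constrained to reference types with a public parameterless constructor.
    public static T CreateRef<T>()
        where T : class, new()
    {
        return new T();
    }
}
```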

JaredPar
  • 6
    I guess the immediate followup question would be "why isn't there an appropriate IL instruction for creating an instance of a generic type with an appropriate constraint?" It's not like they couldn't have built that in from the start :) – Jon Skeet Dec 15 '08 at 08:18
  • Agreed, it really seems like they implemented an API instead of an IL instruction. The comment on the MSDN doc page for Activator.CreateInstance specifically says that it should be called for this scenario. Odd choice, though I'm sure there's a good reason. – JaredPar Dec 15 '08 at 08:27
  • I suspect the reason is to increase JIT'd code sharing. If you had a direct call to a type's constructor in the JIT'd code, then you couldn't share that JIT'd code with another instantiation for a different type, e.g. 'T Create<T>() where T : new() {return new T();}' would share machine code for Create<string>() and Create<ArrayList>(). – jonp Jun 10 '09 at 19:24
  • 1
    @JonSkeet Looking back at this five years later, it seems as though this is a growing trend: using static methods to mark places where JIT should take over, as opposed to creating new instructions. A good example would be CER. – Zenexer Jun 15 '13 at 11:02
  • 2
    Just a quick note that this is sadly not true anymore. Regardless of constraint, Roslyn outputs `Activator.CreateInstance`. – nawfal Jun 30 '16 at 10:38
3

Why is this workaround necessary?

Because the new() generic constraint was added to C# 2.0 in .NET 2.0.

Expression<T> and friends, meanwhile, were added to .NET 3.5.

So your workaround is necessary because the expression-tree approach wasn't possible in .NET 2.0. Meanwhile, (1) using Activator.CreateInstance() was possible, and (2) IL lacks a way to implement 'new T()', so Activator.CreateInstance() was used to implement that behavior.

jonp
2

This is a little bit faster, since the expression is only compiled once:

public class Foo<T> where T : new()
{
    static Expression<Func<T>> x = () => new T();
    static Func<T> f = x.Compile();

    public static T build()
    {
        return f();
    }
}

In my performance tests, this method is just as fast as the more verbose compiled expression and much, much faster than new T() (160 times faster on my test PC).

For a tiny bit better performance, the build method call can be eliminated and the functor can be returned instead, which the client could cache and call directly.

public static Func<T> BuildFn { get { return f; } }
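As a sketch of how a caller might use such a property, the snippet below fetches the delegate once and then invokes it directly in a loop. The names Factory, FactoryClient and CreateMany are hypothetical, and the factory class is restated only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public static class Factory<T> where T : new()
{
    // Static field initializers run in textual order, so x is set before f.
    // The expression is compiled once per closed type T.
    private static readonly Expression<Func<T>> x = () => new T();
    private static readonly Func<T> f = x.Compile();

    public static Func<T> BuildFn { get { return f; } }
}

public static class FactoryClient
{
    // Hypothetical caller: caches the delegate in a local variable,
    // then calls it directly instead of going through a wrapper method.
    public static List<T> CreateMany<T>(int count) where T : new()
    {
        Func<T> create = Factory<T>.BuildFn;
        List<T> items = new List<T>(count);
        for (int i = 0; i < count; i++)
        {
            items.Add(create());
        }
        return items;
    }
}
```

Whether skipping the extra method call makes a measurable difference is something to verify in your own workload, since the JIT may inline a trivial wrapper anyway.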
nawfal
2

Interesting observation :)

Here is a simpler variation on your solution:

static T Create<T>() where T : new()
{
  Expression<Func<T>> e = () => new T();
  return e.Compile()();
}

Obviously naive (and possibly slow) :)

leppie
  • 2
    I don't think that will work, because it's specifically "new T()" that his workaround is trying to avoid. – Joel Mueller Jun 10 '09 at 19:22
  • 1
    @Joel Mueller Actually it does work. Expression tree contains NewExpression here. – ghord Jun 06 '13 at 15:25
  • 1
    Yes, it's an Expression of Func, not a Func. The "() => new T()" is not compiled to IL (which would produce Activator.CreateInstance()) but to an expression tree, which in turn is compiled at runtime, when T is known. The only problem is that every call to this function recompiles the expression. – Thanasis Ioannidis Dec 05 '13 at 13:33
  • 1
    This is brilliant, I didn't know this could work. For the uninformed, the compiled IL will contain calls to `Expression.New` and not `Activator.CreateInstance`. Feels like cheating though; it's quite unintuitive and not obvious to me. – nawfal Jul 01 '16 at 01:03