
I've noticed that the startup time for Roslyn parsing/compilation is a fairly significant one-time cost. EDIT: I am using the Roslyn CTP MSI (the assembly is in the GAC). Is this expected? Is there any workaround?

Running the code below takes almost the same amount of time with 1 iteration (~3 seconds) as with 300 iterations (~3 seconds).

[Test]
public void Test()
{
    var iters = 300;
    foreach (var i in Enumerable.Range(0, iters))
    {
        // Parse the source file using Roslyn
        SyntaxTree syntaxTree = SyntaxTree.ParseText(@"public class Foo" + i + @" { public void Exec() { } }");

        // Add all the references we need for the compilation
        var references = new List<MetadataReference>();
        references.Add(new MetadataFileReference(typeof(int).Assembly.Location));

        var compilationOptions = new CompilationOptions(outputKind: OutputKind.DynamicallyLinkedLibrary);

        // Note: using a fixed assembly name, which doesn't matter as long as we don't expect cross references of generated assemblies
        var compilation = Compilation.Create("SomeAssemblyName", compilationOptions, new[] {syntaxTree}, references);

        // Generate the assembly into a memory stream
        var memStream = new MemoryStream();

        // if we comment out from this line and down, the runtime drops to ~.5 seconds
        EmitResult emitResult = compilation.Emit(memStream);

        var asm = Assembly.Load(memStream.GetBuffer());
        var type = asm.GetTypes().Single(t => t.Name == "Foo" + i);
    }
}
Bob Albright
  • I'm not entirely surprised - bear in mind that it's got to load all the Roslyn assemblies and there's probably a lot of JIT compilation going on too. You might want to try profiling it to see what happens. – Jon Skeet Mar 12 '13 at 15:33
  • Indeed, a profiler is your friend. Also, what happens if you take out that Assembly.Load? That won't be helping perf at all. – Jason Malinowski Mar 12 '13 at 15:55
  • How are you getting the Roslyn binaries on the system? If you are getting the NuGet package, then this is likely JIT time. If you have installed the CTP MSI, it will GAC and NGen the assemblies, and you will likely see better startup performance. – Kevin Pilch Mar 12 '13 at 16:16
  • @KevinPilch-Bisson I was running via the NuGet package but I've changed my code to reference the CTP MSI and I'm still getting ~3 second startup times. – Bob Albright Mar 12 '13 at 18:40
  • @JasonMalinowski from profiling this code almost all of the time is from the call to "compilation.Emit". The Assembly.Load is not significant. – Bob Albright Mar 12 '13 at 18:41
  • Are you able to upload a profiler trace somewhere? Otherwise all we can do is speculate. If the iteration count doesn't really matter, then that would imply the first one is the expensive one. The first time we emit we'd have to go import metadata from mscorlib, which would be cached for subsequent loops. – Jason Malinowski Mar 13 '13 at 05:48
  • @JasonMalinowski Here's my DotTrace snapshot: http://filebin.ca/a3ims2tUa7V/roslyn_dotTrace.dtp I'm sure it's related to the first iteration; I don't have enough experience with .NET to know if there's any way around this time. – Bob Albright Mar 13 '13 at 13:44
  • @BobAlbright, did you ever figure out a way to address the slow first iteration when calling compilation.Emit()? I'm having the same issue you did. Peter's answer below is no longer an option, as that overload of Emit has been removed from Roslyn (http://stackoverflow.com/a/22977158/2962475) – Jordan Kohl Apr 10 '14 at 22:06
  • @JordanKohl Unfortunately, no. We ended up not using Roslyn much beyond some initial investigation. Good luck! – Bob Albright Apr 10 '14 at 22:24

4 Answers


I think one issue is the use of a memory stream; try emitting into a dynamic module via ModuleBuilder instead. Overall the code executes faster, but the first iteration is still the heaviest. I'm pretty new to Roslyn myself, so I'm not sure why this is faster, but here is the changed code.

        var iters = 300;
        foreach (var i in Enumerable.Range(0, iters))
        {
            // Parse the source file using Roslyn
            SyntaxTree syntaxTree = SyntaxTree.ParseText(@"public class Foo" + i + @" { public void Exec() { } }");

            // Add all the references we need for the compilation
            var references = new List<MetadataReference>();
            references.Add(new MetadataFileReference(typeof(int).Assembly.Location));

            var compilationOptions = new CompilationOptions(outputKind: OutputKind.DynamicallyLinkedLibrary);

            // Note: using a fixed assembly name, which doesn't matter as long as we don't expect cross references of generated assemblies
            var compilation = Compilation.Create("SomeAssemblyName", compilationOptions, new[] { syntaxTree }, references);

            var assemblyBuilder = AppDomain.CurrentDomain.DefineDynamicAssembly(new System.Reflection.AssemblyName("CustomerA"),
            System.Reflection.Emit.AssemblyBuilderAccess.RunAndCollect);

            var moduleBuilder = assemblyBuilder.DefineDynamicModule("MyModule");

            System.Diagnostics.Stopwatch watch = new System.Diagnostics.Stopwatch();
            watch.Start();

            // if we comment out from this line and down, the runtime drops to ~.5 seconds
            var emitResult = compilation.Emit(moduleBuilder);

            watch.Stop();

            System.Diagnostics.Debug.WriteLine(watch.ElapsedMilliseconds);

            if (emitResult.Diagnostics.LongCount() == 0)
            {
                var type = moduleBuilder.GetTypes().Single(t => t.Name == "Foo" + i);

                System.Diagnostics.Debug.Write(type != null);
            }
        }

With this technique the first compilation took just 96 milliseconds, and subsequent iterations take around 3-15 ms. So I think you could be right that the first-load scenario adds some overhead.

Sorry I can't explain why it's faster! I'm just researching Roslyn myself and will do more digging later tonight to see if I can find any more evidence of what ModuleBuilder provides over MemoryStream.

Peter
  • This answer is now deprecated: https://stackoverflow.com/questions/22974473/using-roslyn-emit-method-with-a-modulebuilder-instead-of-a-memorystream – Mel O'Hagan May 02 '20 at 21:54

I came across the same issue using the Microsoft.CodeDom.Providers.DotNetCompilerPlatform package for ASP.NET. It turns out this package launches csc.exe, which uses VBCSCompiler.exe as a compilation server. By default the VBCSCompiler.exe server stays alive for 10 seconds, and it takes about 3 seconds to start. This explains why running your code once takes about the same time as running it multiple times. It seems Microsoft uses this server in Visual Studio as well, to avoid paying the startup cost on every compilation.

With this package, you can monitor your processes and will find a command line that looks like csc.exe /keepalive:10

The nice part is that if this server stays alive (even between two sessions of your application), you get a fast compilation every time.

Unfortunately, the Roslyn package is not really customizable, and the easiest way I found to change this keepalive constant is to use reflection to set the value of non-public fields. On my side, I set it to a full day, so the same server is reused even if I close and restart my application.

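    /// <summary>
    /// Sets how long the compiler server stays alive (for example, an entire day) to avoid paying for the boot time of the compiler.
    /// Relies on private fields of the package, so it may break between package versions.
    /// </summary>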
    private static void SetCompilerServerTimeToLive(CSharpCodeProvider codeProvider, TimeSpan timeToLive)
    {
        const BindingFlags privateField = BindingFlags.NonPublic | BindingFlags.Instance;

        var compilerSettingField = typeof(CSharpCodeProvider).GetField("_compilerSettings", privateField);
        var compilerSettings = compilerSettingField.GetValue(codeProvider);

        var timeToLiveField = compilerSettings.GetType().GetField("_compilerServerTimeToLive", privateField);
        timeToLiveField.SetValue(compilerSettings, (int)timeToLive.TotalSeconds);
    }

When you call Compilation.Emit(), it is the first time you actually need metadata, so that is when the metadata file access occurs. After that, it's cached. That alone should not account for 3 seconds just for mscorlib, though.
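
To see how much of the total is a one-time cost, it helps to time every iteration separately instead of the whole loop. The following is only a minimal sketch: `TimeEachCall` and `fakeEmit` are hypothetical names, and the 200 ms sleep is a stand-in for whatever one-time work (metadata import, JIT) the first real `compilation.Emit` call pays. Replace the body of `fakeEmit` with the actual parse/compile/emit code to measure it.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

public static class FirstCallTiming
{
    // Runs body(i) for i in [0, iterations) and returns the elapsed
    // milliseconds of each individual call.
    public static long[] TimeEachCall(Action<int> body, int iterations)
    {
        var timings = new long[iterations];
        for (var i = 0; i < iterations; i++)
        {
            var watch = Stopwatch.StartNew();
            body(i);
            watch.Stop();
            timings[i] = watch.ElapsedMilliseconds;
        }
        return timings;
    }

    public static void Main()
    {
        // Hypothetical stand-in for the real workload: only the first call
        // pays a one-time cost, simulated here with a sleep.
        var warmedUp = false;
        Action<int> fakeEmit = i =>
        {
            if (!warmedUp)
            {
                Thread.Sleep(200);
                warmedUp = true;
            }
        };

        var timings = TimeEachCall(fakeEmit, 5);
        Console.WriteLine("first call: {0} ms", timings[0]);
        Console.WriteLine("slowest later call: {0} ms", timings.Skip(1).Max());
    }
}
```

If the first entry dominates and the rest are small, the cost is startup overhead rather than per-compilation work, which matches the 1-versus-300-iterations observation in the question.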

Matt Warren

tl;dr: NGEN-ing the Roslyn DLLs shaves about 1.5s off the initial compilation/execution time (in my case from ~2s to ~0.5s)


Investigated this just now.

With a brand new console application and a NuGet reference to Microsoft.CodeAnalysis.Scripting, the initial execution of a small snippet ("1+2") took about 2s, while subsequent ones were a lot faster, around 80ms (still a bit high for my taste, but that's a different topic).

PerfView revealed that the delay was predominantly due to jitting:

  • Microsoft.CodeAnalysis.CSharp.dll: 941ms (3,205 methods jitted)
  • Microsoft.CodeAnalysis.dll: 426ms (1,600 methods jitted)

I used ngen on Microsoft.CodeAnalysis.CSharp.dll (making sure to specify /ExeConfig:MyApplication.exe because of the binding redirects in app.config) and got a nice performance improvement: the first-execution time fell to ~580ms.

This of course would need to be done on end-user machines. In my case, I'm using WiX as the installer for my software, and it has support for NGEN-ing files at install time.
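
For reference, here is a rough sketch of what the ngen invocation described above looks like when assembled from code. `BuildInstallArguments` is a hypothetical helper, and the file names are the ones from this answer; real paths will differ. It assumes .NET Framework, where ngen.exe sits in the runtime directory, and it only prints the command line rather than running it, since `ngen install` needs an elevated prompt.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class NgenCommand
{
    // Builds the argument string for "ngen install", passing the host
    // executable via /ExeConfig so its app.config binding redirects apply.
    public static string BuildInstallArguments(string assemblyPath, string exePath)
    {
        return string.Format("install \"{0}\" \"/ExeConfig:{1}\"", assemblyPath, exePath);
    }

    public static void Main()
    {
        // Assumption: on .NET Framework, ngen.exe lives in the runtime directory.
        var ngenPath = Path.Combine(RuntimeEnvironment.GetRuntimeDirectory(), "ngen.exe");

        // Assumption: illustrative file names taken from the answer text.
        var arguments = BuildInstallArguments(
            "Microsoft.CodeAnalysis.CSharp.dll",
            "MyApplication.exe");

        // Print the command instead of executing it.
        Console.WriteLine("\"{0}\" {1}", ngenPath, arguments);
    }
}
```

The same arguments can be typed directly at an elevated command prompt; an installer would normally do the equivalent step for you.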

anakic