20

(I know, this is a ridiculously long question. I tried to separate the question from my investigation so far, so it's slightly easier to read.)

I'm running my unit tests using MSTest.exe. Occasionally, I see this test error:

On the individual unit test method: "The agent process was stopped while the test was running."

On the entire test run:

One of the background threads threw exception: 
System.NullReferenceException: Object reference not set to an instance of an object.
   at System.Runtime.InteropServices.Marshal.ReleaseComObject(Object o)
   at System.Management.Instrumentation.MetaDataInfo.Dispose()
   at System.Management.Instrumentation.MetaDataInfo.Finalize()

So, here's what I think I need to do: I need to track down what is causing the error in MetaDataInfo, but I'm drawing a blank. My unit test suite takes over half an hour to run, and the error doesn't happen every time, so it's hard to get it to reproduce.

Has anyone else seen this type of failure in running unit tests? Were you able to track it down to a specific component?

Edit:

The code under test is a mix of C#, C++/CLI, and a little bit of unmanaged C++ code. The unmanaged C++ is used only from the C++/CLI, never directly from the unit tests. The unit tests are all C#.

The code under test will be running in a standalone Windows Service, so there's no complication from ASP.net or anything like that. In the code under test, there's threads starting & stopping, network communication, and file I/O to the local hard drive.


My investigation so far:

I spent some time digging around the multiple versions of the System.Management assembly on my Windows 7 machine, and I found the MetaDataInfo class in System.Management that's in my Windows directory. (The version that's under Program Files\Reference Assemblies is much smaller, and doesn't have the MetaDataInfo class.)

Using Reflector to inspect this assembly, I found what seems to be an obvious bug in MetaDataInfo.Dispose():

// From class System.Management.Instrumentation.MetaDataInfo:
public void Dispose()
{
    if (this.importInterface == null) // <---- Should be "!="
    {
        Marshal.ReleaseComObject(this.importInterface);
    }
    this.importInterface = null;
    GC.SuppressFinalize(this);
}

With this 'if' statement backwards, MetaDataInfo will leak the COM object if present, or throw a NullReferenceException if not. I've reported this on Microsoft Connect: https://connect.microsoft.com/VisualStudio/feedback/details/779328/

Using reflector, I was able to find all uses of the MetaDataInfo class. (It's an internal class, so just searching the assembly should be a complete list.) There is only one place it is used:

public static Guid GetMvid(Assembly assembly)
{
    using (MetaDataInfo info = new MetaDataInfo(assembly))
    {
        return info.Mvid;
    }
}

Since all uses of MetaDataInfo are being properly Disposed, here's what's happening:

  • If MetaDataInfo.importInterface is not null:
    • static method GetMvid returns MetaDataInfo.Mvid
    • The using calls MetaDataInfo.Dispose
      • Dispose leaks the COM object
      • Dispose sets importInterface to null
      • Dispose calls GC.SuppressFinalize
    • Later, when the GC collects the MetaDataInfo, the finalizer is skipped.
  • .
  • If MetaDataInfo.importInterface is null:
    • static method GetMvid gets a NullReferenceException calling MetaDataInfo.Mvid.
    • Before the exception propagates up, the using calls MetaDataInfo.Dispose
      • Dispose calls Marshal.ReleaseComObject
        • Marshal.ReleaseComObject throws a NullReferenceException.
      • Because an exception is thrown, Dispose doesn't call GC.SuppressFinalize
    • The exception propagates up to GetMvid's caller.
    • Later, when the GC collects the MetaDataInfo, it runs the Finalizer
      • Finalize calls Dispose
        • Dispose calls Marshal.ReleaseComObject
          • Marshal.ReleaseComObject throws a NullReferenceException, which propagates all the way up to the GC, and the application is terminated.

For what it's worth, here's the rest of the relevant code from MetaDataInfo:

public MetaDataInfo(string assemblyName)
{
    Guid riid = new Guid(((GuidAttribute) Attribute.GetCustomAttribute(typeof(IMetaDataImportInternalOnly), typeof(GuidAttribute), false)).Value);
    // The above line retrieves this Guid: "7DAC8207-D3AE-4c75-9B67-92801A497D44"
    IMetaDataDispenser o = (IMetaDataDispenser) new CorMetaDataDispenser();
    this.importInterface = (IMetaDataImportInternalOnly) o.OpenScope(assemblyName, 0, ref riid);
    Marshal.ReleaseComObject(o);
}

private void InitNameAndMvid()
{
    if (this.name == null)
    {
        uint num;
        StringBuilder szName = new StringBuilder {
            Capacity = 0
        };
        this.importInterface.GetScopeProps(szName, (uint) szName.Capacity, out num, out this.mvid);
        szName.Capacity = (int) num;
        this.importInterface.GetScopeProps(szName, (uint) szName.Capacity, out num, out this.mvid);
        this.name = szName.ToString();
    }
}

public Guid Mvid
{
    get
    {
        this.InitNameAndMvid();
        return this.mvid;
    }
}

Edit 2:

I was able to reproduce the bug in the MetaDataInfo class for Microsoft. However, my reproduction is slightly different from the issue I'm seeing here.

  • Reproduction: I try to create a MetaDataInfo object on a file that isn't a managed assembly. This throws an exception from the constructor before importInterface is initialized.
  • My issue with MSTest: MetaDataInfo is constructed on some managed assembly, and something happens to make importInterface null, or to exit the constructor before importInterface is initialized.
    • I know that MetaDataInfo is created on a managed assembly, because MetaDataInfo is an internal class, and the only API that calls it does so by passing the result of Assembly.Location.

However, re-creating the issue in Visual Studio meant that it downloaded the source to MetaDataInfo for me. Here's the actual code, with the original developer's comments.

public void Dispose()
{ 
    // We implement IDisposable on this class because the IMetaDataImport
    // can be an expensive object to keep in memory. 
    if(importInterface == null) 
        Marshal.ReleaseComObject(importInterface);
    importInterface = null; 
    GC.SuppressFinalize(this);
}

~MetaDataInfo() 
{
    Dispose(); 
} 

The original code confirms what was seen in reflector: The if statement is backwards, and they shouldn't be accessing the managed object from the Finalizer.

I said before that because it was never calling ReleaseComObject, that it was leaking the COM object. I read up more on the use of COM objects in .Net, and if I understand it properly, that was incorrect: The COM object isn't released when Dispose() is called, but it is released when the garbage collector gets around to collecting the Runtime Callable Wrapper, which is a managed object. Even though it's a wrapper for an unmanaged COM object, the RCW is still a managed object, and the rule about "don't access managed objects from the finalizer" should still apply.

David Yaw
  • 27,383
  • 4
  • 60
  • 93
  • 1
    If your analysis is correct, it should be reported on "connect" – Marc Gravell Feb 15 '13 at 21:15
  • It would be interesting to know what your tests are doing, and what the code being tested is doing. – John Saunders Feb 15 '13 at 21:16
  • Yes, that is a bug. Calling ReleaseComObject() in the finalizer is a bug as well. This just never got diagnosed before, it only crashes in very select circumstances, importInterface isn't normally null. You'd better avoid that circumstance ;) – Hans Passant Feb 15 '13 at 21:18
  • 4
    @MarcGravell: Already done: https://connect.microsoft.com/VisualStudio/feedback/details/779328/ – David Yaw Feb 15 '13 at 21:21
  • See edit. The code is C++, C++/CLI, and C#. The unit tests are C#. No ASP.Net or anything like that. The code is starting & stopping threads, doing network communication, and disk I/O to the local disk. – David Yaw Feb 15 '13 at 21:43
  • I was able to reproduce the immediate bug, as MS requested on Connect, though the repro & the original have different root causes. – David Yaw Feb 23 '13 at 00:01

3 Answers3

1

Try to add the following code to your class definition:

bool _disposing = false  // class property

public void Dispose()
{
    if( !disposing ) 
        Marshal.ReleaseComObject(importInterface);
    importInterface = null; 
    GC.SuppressFinalize(this);

    disposing = true;
}
Carsten
  • 11,287
  • 7
  • 39
  • 62
  • It's not my class definition. This would throw an exception anyway, if importInterface were null after the constructor ran. Switching the if statement from `== null` to `!= null` and not calling it from the Finalizer is the proper fix. – David Yaw Feb 25 '13 at 13:36
0

If MetaDataInfo uses the IDisposable pattern, then there should also be a finalizer (~MetaDataInfo() in C#). The using statement will make sure to call Dispose(), which sets the importInterface to null. Then when the GC is ready to finalize, the ~MetaDataInfo() is called, which would normally call Dispose (or rather the overload taking a bool disposing: Dispose(false)).

I would say that this error should turn up quite often.

AroglDarthu
  • 1,021
  • 8
  • 17
  • Most of the time, `importinterface` is not null, so the exception never happens, and the only consequence is that the COM object is cleaned up when the garbage collector gets to it, not when Dispose is run. – David Yaw Feb 22 '13 at 23:35
  • Weird :-) Anyway, what I wanted to point out is that the finalizer calls Dispose, not the other way around. – AroglDarthu Feb 23 '13 at 18:12
0

Are you trying to fix this for your tests? If so, rewrite your using. Don't dispose of it yourself but write some code to use reflection to access the private fields and dispose of them correctly and then call GC.SuppressFinalize to prevent the finalizer from running.

As a brief aside (loved your investigation btw) you state that Dispose calls Finalize. It's the other way round, Finalize when invoked by the GC calls Dispose.

Quibblesome
  • 25,225
  • 10
  • 61
  • 100
  • I don't use this class directly (it's an internal class within System.Management that I can only get to through reflection), nor do I make any calls from my code that would call this. I suspect that it's being called from MSTest.exe, in order to report on the assemblies being used, or something like that. I suppose I should attach a debugger to MSTest.exe and watch for MetaDataInfo to be instantiated. – David Yaw Feb 23 '13 at 00:17
  • Good catch on which method calls which. Fixed. End result is the same bug & same crash, though. – David Yaw Feb 23 '13 at 00:18
  • Any chance you can't just port this test to NUnit for the time being? As far as I remember you just need to change the name of the attributes to make it compatible and link to different assemblies. – Quibblesome Feb 23 '13 at 00:22
  • This doesn't happen on every test run, maybe about 10% of test runs, so it is something I can live with. My company has build servers set up to run MSTest, not NUnit, so it'd be a hassle to switch, but I'll bite the bullet if it starts failing more often. – David Yaw Feb 23 '13 at 00:31