I'm currently working on some .NET-based software (.NET Framework 3.5 SP1) that integrates with HP Quality Center 10.0 through its COM client API (often referred to as TDApiOle80 or TDApiOle80.TDConnection).

We are using XUnit 1.6.1.1521 and Gallio 3.1.397.0 (invoked from an msbuild file).

We go through a process of:

  • Creating a connection
  • Running a test
  • Closing connection
  • Disposing
  • Forcing a GC.Collect() / GC.WaitForPendingFinalizers()

We do this for each integration test, and each integration test is run with a timeout configured in its Fact.
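The per-test lifecycle described above can be sketched roughly as follows. This is only an illustration: `FakeQcSession` and `Lifecycle.RunOne` are hypothetical names standing in for our real wrapper around TDConnection, which is not shown in this post.

```csharp
using System;

// Hypothetical stand-in for a wrapper around TDConnection (not the real QC API).
sealed class FakeQcSession : IDisposable
{
    public bool IsOpen { get; private set; }
    public void Open() { IsOpen = true; }
    public void Close() { IsOpen = false; }
    public void Dispose() { if (IsOpen) Close(); }
}

static class Lifecycle
{
    // Runs a single test body through the create/run/close/dispose/collect steps.
    public static void RunOne(Action<FakeQcSession> test)
    {
        var session = new FakeQcSession();
        try
        {
            session.Open();    // 1. create the connection
            test(session);     // 2. run the test
            session.Close();   // 3. close the connection
        }
        finally
        {
            session.Dispose(); // 4. dispose
            session = null;
            GC.Collect();                  // 5. force a collection...
            GC.WaitForPendingFinalizers(); // ...and wait for finalizers to run
        }
    }
}

static class Program
{
    static void Main()
    {
        Lifecycle.RunOne(s => Console.WriteLine("session open during test: " + s.IsOpen));
    }
}
```

The forced collection is there so that finalizers for any unreferenced COM wrappers get a chance to run before the next test starts.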

The problem we have is that it appears after a few tests (say about 10 or so) Quality Center blocks indefinitely when called - and the whole of Gallio freezes and will no longer respond.

Originally we discovered that xunit.net only applies its timeout to the code within the Fact, so it would wait indefinitely for the constructor or Dispose methods to complete. We moved that logic into the body of the tests to confirm this, but it has not solved the problem (the run still hangs after a certain number of tests).

The same thing happens when using TestDriven.Net: we can run one or a few tests interactively, but with more than about 10 tests the whole run freezes, and our only choice is to kill the ProcessInvocation86.exe process used by TD.Net.

Does anyone have any tips/tricks on how to stop this happening altogether, or at least how to insulate my integration tests from these kinds of problems, so that when the QC API blocks indefinitely the test fails with a timeout and Gallio moves on to the next test?

Update

The hint towards using an STA thread has helped move the issue forward a bit: via a custom XUnit.Net attribute we now launch each test in its own STA thread. This has stopped Gallio/TestDriven.Net from locking up entirely, so we can include the integration tests in the run on our Hudson build server.

    using System;
    using System.Linq;
    using System.Threading;
    using Xunit;
    using Xunit.Sdk;

    public class StaThreadFactAttribute : FactAttribute
    {
        const int DefaultTime = 30000; // 30 seconds

        public StaThreadFactAttribute()
        {
            Timeout = DefaultTime;
        }

        protected override System.Collections.Generic.IEnumerable<Xunit.Sdk.ITestCommand> EnumerateTestCommands(Xunit.Sdk.IMethodInfo method)
        {
            int timeout = Timeout;

            Timeout = 0;

            var commands = base.EnumerateTestCommands(method).ToList();

            Timeout = timeout;

            return commands.Select(command => new StaThreadTimeoutCommand(command, Timeout, method)).Cast<ITestCommand>();
        }
    }

    public class StaThreadTimeoutCommand : DelegatingTestCommand
    {
        readonly int _timeout;
        readonly IMethodInfo _testMethod;

        public StaThreadTimeoutCommand(ITestCommand innerCommand, int timeout, IMethodInfo testMethod)
            : base(innerCommand)
        {
            _timeout = timeout;
            _testMethod = testMethod;
        }

        public override MethodResult Execute(object testClass)
        {
            MethodResult result = null;

            ThreadStart work = delegate
            {
                try
                {
                    result = InnerCommand.Execute(testClass);
                    var disposable = testClass as IDisposable;
                    if (disposable != null) disposable.Dispose();
                }
                catch (Exception ex)
                {
                    result = new FailedResult(_testMethod, ex, this.DisplayName);
                }
            };

            var thread = new Thread(work);

            thread.SetApartmentState(ApartmentState.STA); //Set the thread to STA

            thread.Start();

            if (!thread.Join(_timeout))
            {
                return new FailedResult(_testMethod, new Xunit.Sdk.TimeoutException((long)_timeout), base.DisplayName);
            }

            return result;
        }
    }

Instead, we now see output like the following when running the tests with TestDriven.Net. Running the same suite several times results either in all tests passing or, more usually, in just one or two of the tests failing. After the first failure, the second failure results in this "Error while unloading appdomain" issue.

Test 'IntegrationTests.Execute_Test1' failed: Test execution time exceeded: 30000ms

Test 'T:IntegrationTests.Execute_Test2' failed: Error while unloading appdomain. (Exception from HRESULT: 0x80131015)
System.CannotUnloadAppDomainException: Error while unloading appdomain. (Exception from HRESULT: 0x80131015)
   at System.AppDomain.Unload(AppDomain domain)
   at Xunit.ExecutorWrapper.Dispose()
   at Xunit.Runner.TdNet.TdNetRunner.TestDriven.Framework.ITestRunner.RunMember(ITestListener listener, Assembly assembly, MemberInfo member)
   at TestDriven.TestRunner.AdaptorTestRunner.Run(ITestListener testListener, ITraceListener traceListener, String assemblyPath, String testPath)
   at TestDriven.TestRunner.ThreadTestRunner.Runner.Run()

4 passed, 2 failed, 0 skipped, took 50.42 seconds (xunit).

I have yet to establish why the Quality Center API hangs indefinitely at random. I will investigate this further shortly.

Update 27/07/2010

I've finally established the cause of the hanging - here's the problematic code:

    connection = new TDConnection();
    connection.InitConnectionEx(credentials.Host);
    connection.Login(credentials.User, credentials.Password);
    connection.Connect(credentials.Domain, credentials.Project);
    connection.ConnectProjectEx(credentials.Domain, credentials.Project, credentials.User, credentials.Password);

It appears that calling Connect followed by ConnectProjectEx has a chance of blocking (though it is non-deterministic). Removing the redundant connection calls seems to have increased the stability of the tests dramatically. The corrected connection code:

    connection = new TDConnection();
    connection.InitConnectionEx(credentials.Host);
    connection.ConnectProjectEx(credentials.Domain, credentials.Project, credentials.User, credentials.Password);

Having inherited the codebase I didn't give the connection code much thought.

One thing I have yet to figure out is why, even with the timeout code included above, the Thread.Join(timeout) call never returns. You can attach a debugger and it just shows that the test thread is in a joining/wait operation. Perhaps it is something to do with executing in an STA thread?
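One way to narrow this down might be to reproduce the same STA-thread-plus-Join pattern in a plain console app, outside any test runner. A minimal sketch, assuming nothing about the QC API: `StaRunner` and `TryRun` are hypothetical names, the sleeping delegates merely stand in for QC calls, and `IsBackground = true` is an addition that the attribute code above does not have. Apartment state can only be set on Windows.

```csharp
using System;
using System.Threading;

static class StaRunner
{
    // Hypothetical helper (not part of xUnit or the QC API).
    // Returns true if 'work' finished within timeoutMs, false if the wait timed out.
    public static bool TryRun(ThreadStart work, int timeoutMs)
    {
        var thread = new Thread(work);
        thread.SetApartmentState(ApartmentState.STA); // must be set before Start()
        thread.IsBackground = true; // assumption: a hung worker should not keep the process alive
        thread.Start();
        return thread.Join(timeoutMs);
    }
}

static class Program
{
    static void Main()
    {
        // Stand-ins for a QC call that returns promptly and one that blocks too long.
        bool fast = StaRunner.TryRun(delegate { Thread.Sleep(100); }, 2000);
        bool slow = StaRunner.TryRun(delegate { Thread.Sleep(5000); }, 500);

        Console.WriteLine("fast completed: " + fast);
        Console.WriteLine("slow completed: " + slow);
    }
}
```

If Join times out as expected here but not under Gallio or TestDriven.Net with real QC calls in the delegate, that would suggest the difference lies in the runner or the COM call rather than in the pattern itself.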

Bittercoder
  • So when they called it "Quality Center", they were just being ironic? – Gabe Jul 19 '10 at 02:44
  • Let's just say I'm not a fan of the product - oh for a soap or REST service into the product :) – Bittercoder Jul 19 '10 at 04:09
  • We have hundreds of integration tests executing on OTA / TDApiOle80 for QC 10, with no problems or blocking. So in theory you should be able to do the same. Have you tried resolving the underlying problem, the blocking itself? It may be better to understand why it blocks, rather than attempting to work around the problem. – Alex Shnayder Jul 19 '10 at 20:21
  • Have you tried looking at the blocked threads, where they block, and maybe even why? – Alex Shnayder Jul 19 '10 at 20:27
  • A question regarding your testing process: why do you close the connection, dispose and run the GC for each test? Why not have all tests run on the same connection? – Alex Shnayder Jul 21 '10 at 19:47
  • Is there a chance that your test actually takes more than 30 seconds? It is an integration test, so 30+ seconds may be just what it takes, depending on what you are actually doing. – Alex Shnayder Jul 21 '10 at 19:50
  • The test takes less than 30 seconds; the hang was caused by the way the connection to QC was established, and was actually hanging on the call to ConnectProjectEx. I did eventually try pooling the QC connection between tests, but this caused other issues, i.e. "COM object that has been separated from its underlying RCW" errors. I believe this was because the COM object was created on one thread and then used on multiple threads for subsequent tests. – Bittercoder Jul 27 '10 at 08:29
  • Don't know if this helps, but in our tests we run everything on the same thread, not multi-threaded at all, and we don't have timeouts; we just wait for each test to finish (and in most cases it does, we have almost never had a test that hung). – Alex Shnayder Jul 29 '10 at 07:11
  • The change in the way we connect fixed the problem for us, so life is good here once more. For us the ability to set a timeout per test is important, as the product performs integration between Quality Center and other products with less-than-great COM-based APIs, and we've found they do have hiccups every now and then. – Bittercoder Jul 29 '10 at 08:16

1 Answer

You could try running your code on a separate thread, then calling Join on the new thread with a timeout and aborting it if it hits the timeout.

For example:

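    using System;
    using System.Threading;
    using Xunit;

    public static class TimeoutHelper { // wrapper class; the name is arbitrary
        static readonly TimeSpan Timeout = TimeSpan.FromSeconds(10);

        public static void RunWithTimeout(ThreadStart method) {
            var thread = new Thread(method);
            thread.Start();
            if (!thread.Join(Timeout)) {
                thread.Abort(); // best effort: cannot interrupt a thread blocked in unmanaged code
                Assert.False(true, "Timeout!");
            }
        }
    }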
SLaks
  • I'm afraid that doesn't work - this is what the timeout value in the XUnit Fact attribute is already doing. Manually doing this as well doesn't appear to fix the problem. It appears that when a call to the Quality Center library blocks, it stops everything: attaching a debugger to the process shows all threads are blocked/awaiting join, so you can't get a stack trace either (though I haven't tried IntelliTrace yet to see if that helps). – Bittercoder Jul 19 '10 at 04:02
  • @Bittercoder: I don't understand how that's possible. Try setting the thread to STA or MTA. – SLaks Jul 19 '10 at 12:45
  • Make sure your tests are executed on an STA thread; TDConnection does not react well to an MTA thread executing it. – Alex Shnayder Jul 19 '10 at 20:18
  • Thanks for the hint Alex/SLaks, we are now running in an STA thread and it's behaving itself a lot better - though yet to resolve why the tests sometimes hang. – Bittercoder Jul 21 '10 at 02:12