We have a long-running process that is initiated by a web request. To give the process time to complete, we spin it off on a new thread and use a Mutex to ensure that only one instance of the process can run at a time. This code runs as intended in our development and staging environments, but fails in our production environment with a NullReferenceException. Our application logging does not capture anything, and our operations staff report that it is crashing the AppPool. (It would seem to be an environmental problem, but we have to proceed on the assumption that the environments are configured identically.) So far we have been unable to determine where the null reference is.

Here is the exception from the Application Event Log:

Exception: System.NullReferenceException
Message: Object reference not set to an instance of an object.
StackTrace:    at Jobs.LongRunningJob.DoWork()
   at System.Threading.ExecutionContext.runTryCode(Object userData)
   at System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode code, CleanupCode backoutCode, Object userData)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean ignoreSyncCtx)
   at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.ThreadHelper.ThreadStart()

And here is the code (slightly sanitized):

public class LongRunningJob: Job
{
    private static Mutex mutex = new Mutex();

    protected override void PerformRunJob()
    {
        var ts = new ThreadStart(LongRunningJob.DoWork);
        var thd = new Thread(ts);
        thd.IsBackground = true;
        thd.Start();
    }

    private static void DoWork()
    {
        var commandTimeOut = 180;

        var from = DateTime.Now.AddHours(-24);
        var to = DateTime.Now;

        if (mutex.WaitOne(TimeSpan.Zero))
        {
            try
            {
               DoSomethingExternal(); // from what we can tell, this is never called
            }
            catch (SqlException sqlEx)
            {
                if (sqlEx.InnerException.Message.Contains("timeout period elapsed"))
                {
                    Logging.LogException(String.Format("Command timeout in LongRunningJob: CommandTimeout: {0}", commandTimeOut), sqlEx);
                }
                else
                {
                    Logging.LogException(String.Format("SQL exception in LongRunningJob: {0}", sqlEx.InnerException.Message), sqlEx);
                }
            }
            catch (Exception ex)
            {
                Logging.LogException(String.Format("Error processing data in LongRunningJob: {0}", ex.InnerException.Message), ex);
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
        else
        {
            Logging.LogMessage("LongRunningJob is already running.");
        }
    }
}
The Rover
  • Please don't do long running processes in an environment not suited to them, this is only the beginning of what will eventually be tearjerking: http://stackoverflow.com/a/5553048/263681 – Grant Thomas Nov 05 '12 at 15:33
  • Thanks Grant. I couldn't agree more. – The Rover Nov 05 '12 at 15:43
  • I've seen ASP.NET using the .NET threadpool successfully before. Here's how to start a request and catch any exceptions it may throw: http://stackoverflow.com/a/753855/1429439 – C.M. Nov 05 '12 at 15:48
  • It turns out the Null Reference was in our exception logging, and it was masking the underlying error. The first exception was in the DoSomethingExternal() method, where we were attempting to truncate a replicated table. This is why it succeeded in all but the production environment. – The Rover Nov 05 '12 at 19:24
  • Why not just log ex.ToString()? – John Saunders Nov 05 '12 at 19:58

1 Answer

In order to find a NullReferenceException you basically examine every dereference operation. I can see only the following suspicious one:

ex.InnerException.Message

You can't assume ex.InnerException is not null.
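
As a minimal sketch of that point, the helper below builds a log message without assuming InnerException is set. ExceptionText and SafeMessage are hypothetical names made up for illustration, not part of the question's Logging class. The sketch falls back to the outer message when there is no inner exception, and it also shows ex.ToString(), which already includes the type, message, inner exceptions and stack trace. If the catch block itself throws on a thread started with new Thread, nothing is left to catch it, and an unhandled exception on such a thread terminates the process in .NET 2.0 and later, which would be consistent with the AppPool crash described in the question.

```csharp
using System;

public static class ExceptionText
{
    // Hypothetical helper: returns a message for logging without
    // assuming that InnerException is non-null.
    public static string SafeMessage(Exception ex)
    {
        if (ex == null)
        {
            return "(no exception)";
        }

        // Prefer the inner message when there is one, otherwise
        // fall back to the outer exception's own message.
        return ex.InnerException != null ? ex.InnerException.Message : ex.Message;
    }
}

public static class Program
{
    public static void Main()
    {
        var plain = new InvalidOperationException("outer only");
        var wrapped = new InvalidOperationException("outer", new TimeoutException("inner"));

        Console.WriteLine(ExceptionText.SafeMessage(plain));   // prints "outer only"
        Console.WriteLine(ExceptionText.SafeMessage(wrapped)); // prints "inner"

        // ToString() never dereferences a null InnerException and
        // includes the inner exception text when one is present.
        Console.WriteLine(wrapped.ToString());
    }
}
```

In the catch blocks from the question, a call such as ExceptionText.SafeMessage(ex) would stand in for ex.InnerException.Message, so the handler cannot fail when the exception has no inner exception.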

usr
  • usr - I believe you may be onto something. Possibly our exception logging is throwing its own exception, which is masking the original issue. We will be exploring this option. – The Rover Nov 05 '12 at 17:24