My workflows are hosted in IIS, and each workflow inherits from AsyncCodeActivity. In BeginExecute I call Beginxxx, and in EndExecute I call the matching Endxxx. I'm using the Enterprise Library Data Access Application Block (DAAB).

    protected override IAsyncResult BeginExecute(AsyncCodeActivityContext context, AsyncCallback callback, object state)
    {
        DbCommand command = null;
        DbConnection dbConnection = null;
        entlib.Database database;

        try
        {
            database = EnterpriseLibraryContainer.Current.GetInstance<entlib.Database>(DatabaseName.Get(context));
            dbConnection = database.CreateConnection();
            command = dbConnection.CreateCommand();
            command.CommandText = CommandText.Get(context);
            command.CommandType = CommandType.Get(context);

            //have removed few assignments here

            context.UserState = new AsyncDbState(database, command);
        }
        catch (Exception e)
        {
            if (command != null)
                command.Dispose();
            if (dbConnection != null)
                dbConnection.Dispose();

            throw e;
        }

        return (database.Beginxxx(command, callback, state));
    }


    protected override TResult EndExecute(AsyncCodeActivityContext context, IAsyncResult iResult)
    {
        TResult result = default(TResult);

        var userState = context.UserState as AsyncDbState;

        try
        {
            result = (TResult)userState.Database.Endxxx(iResult);
        }
        finally
        {
            if (null != userState && null != userState.Command)
                userState.Command.Dispose();
        }

        return result;
    }

Sporadically, this logs an error in the event log and terminates the entire app pool. After the comments by @Will, I trapped the inner exception and noticed that the actual error happens in BeginExecute of a different activity, which inherits from AsyncNativeActivity. There I have:

    var task = AsyncFactory<IDataReader>.Action(() => ExecuteMdxQuery(connectionStringSettings, mdxQuery, commandTimeout, cancellationToken), cancellationToken);

    return AsyncFactory<IDataReader>.ToBegin(task, callback, state);

and AsyncFactory looks like this:

    public static Task<TResult> Action(Func<TResult> actionMethod, CancellationToken token)
    {
        TaskFactory factory = new TaskFactory();
        //TaskFactory factory = new TaskFactory(scheduler);
        return factory.StartNew<TResult>(() => actionMethod(), token);
    }

    public static IAsyncResult ToBegin(Task<TResult> task, AsyncCallback callback, object state)
    {
        var tcs = new TaskCompletionSource<TResult>(state);
        var continuationTask = task.ContinueWith(t =>
        {
            if (task.IsFaulted)
            {
                tcs.TrySetException(task.Exception.InnerExceptions);
            }
            else if (task.IsCanceled)
            {
                tcs.TrySetCanceled();
            }
            else
            {
                tcs.TrySetResult(task.Result);
            }

An unhandled exception occurred and the process was terminated.

Application ID: /LM/W3SVC/1/ROOT/workflowservice

Process ID: 7140

Exception: System.AggregateException

Message: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread.

StackTrace: at System.Threading.Tasks.TaskExceptionHolder.Finalize()

InnerException: Microsoft.AnalysisServices.AdomdClient.AdomdErrorResponseException

Message: Server: The current operation was cancelled because another operation in the transaction failed.

StackTrace:
   at Microsoft.AnalysisServices.AdomdClient.AdomdConnection.XmlaClientProvider.Microsoft.AnalysisServices.AdomdClient.IExecuteProvider.ExecuteTabular(CommandBehavior behavior, ICommandContentProvider contentProvider, AdomdPropertyCollection commandProperties, IDataParameterCollection parameters)
   at Microsoft.AnalysisServices.AdomdClient.AdomdCommand.ExecuteReader(CommandBehavior behavior)
   at WorkflowActivity.AsyncExecuteSafeReader.ExecuteMdxQuery(String connectionStringName, String mdxQuery, Nullable`1 commandTimeout, CancellationToken cancellationToken) in d:\B\69\Sources\Infrastructure\WorkflowActivity\AsyncExecuteSafeReader.cs:line 222
   at AsyncExecuteSafeReader.ExecuteMdxQuery(String connectionStringName, String mdxQuery, Nullable`1 commandTimeout, CancellationToken cancellationToken) in d:\B\69\Sources\Infrastructure\WorkflowActivity\AsyncExecuteSafeReader.cs:line 239
   at WorkflowActivity.AsyncExecuteSafeReader.<>c__DisplayClassd.b__a() in d:\B\69\Sources\Infrastructure\WorkflowActivity\AsyncExecuteSafeReader.cs:line 180
   at System.Threading.Tasks.Task`1.InvokeFuture(Object futureAsObj)
   at System.Threading.Tasks.Task.Execute()

KaSh
  • You're using thread-unsafe code across different threads. I'd suspect it is centered here `EnterpriseLibraryContainer.Current.GetInstance` if that doesn't store the instance per-thread. You'd have to check the code or the docs. Anyhow, within the scope of the method you should use only `new` instances rather than class-scoped variables. –  Aug 28 '14 at 15:49
  • Thank you. I'll check GetInstance. But within BeginExecute I'm using new instances of the variables (the DbCommand, connection, and database objects), right? Or am I misunderstanding? – KaSh Aug 28 '14 at 16:11
  • As long as you're using the `new` keyword, yes. –  Aug 28 '14 at 16:45
  • Which I'm not. I'll change that and test. Thank you. – KaSh Aug 28 '14 at 19:35
  • EnterpriseLibraryContainer.Current.GetInstance is thread-safe, @Will. The objects created from it apparently are not. I'm using dbConnection = database.CreateConnection(); all the variables inside BeginExecute are abstract types, to which I assign new instances (via CreateCommand, for instance). Can you help me? – KaSh Aug 29 '14 at 07:59
  • Just to add: I checked the Database code. It has parameter-cache code, but I'm not clearing the cache, so I presume I'm OK? – KaSh Aug 29 '14 at 08:09
  • Hmmm, looking at the call stack, the object being shared appears to be a Task. But the call stack is truncated. You need to get better info on the exception thrown. Calling ToString() on the object within the catch block will probably give you your answer. Or, perhaps, just delete that catch block and move your disposables into a [`using()` statement](http://msdn.microsoft.com/en-us/library/yh598w02.aspx) or a finally block. –  Aug 29 '14 at 13:47
  • Thanks @Will. I can't use a finally block in BeginExecute, can I? I need EndExecute to be called, and I'm trying to dispose in the finally block of EndExecute. However, I didn't understand "calling ToString()". You mean on the exception? I've changed the throw e to just throw now, and I'm testing whether I can get more exception details. – KaSh Aug 29 '14 at 15:45
  • Calling ToString on the exception gets every bit of useful information out of it, including *inner exceptions*. You can absolutely use a finally block in there: simply remove the catch and add a finally block. Your change will help prevent stack trace truncation. –  Aug 29 '14 at 15:48
  • Yup! Just before I saw your reply, I realised what you meant. I've done that and am now waiting with bated breath! Thank you for being on top of this; I really appreciate it. As soon as I get the error, I'll post it here. – KaSh Aug 29 '14 at 16:08
  • @Will, I managed to reproduce it. I tried copying the error here but exceeded the character limit. The reader is passed to SqlBulkCopy, and this threw the error: Exception: System.AggregateException Message: A Task's exception(s) were not observed either by Waiting on the Task or accessing its Exception property. As a result, the unobserved exception was rethrown by the finalizer thread. StackTrace: at System.Threading.Tasks.TaskExceptionHolder.Finalize(). Will fix it now. A big thank you. – KaSh Aug 30 '14 at 05:03
  • @Will, I'm trying to mark your comment as the answer but can't find the tick. – KaSh Aug 30 '14 at 05:04
  • Composing an answer. –  Sep 01 '14 at 16:32

2 Answers

The first hint is that this is happening in IIS. While it's usually clear in an application when you're possibly hitting issues caused by multithreading, it isn't so in IIS.

Every request in IIS is serviced by a different thread. Any shared instances are going to be hit by multiple threads. That is often bad news if you're not expecting it.

So my first guess (I had to guess because your exception's call stack was cut off; more on that later) was that you're using thread-unsafe code across different threads. I suspected it was centered on EnterpriseLibraryContainer.Current.GetInstance because, if that doesn't store the instance per thread, it will share the same instance between threads. You'd have to check the code or the docs. The easiest way to test that is to use "Make Object ID" in the Watch window, then compare the results from EnterpriseLibraryContainer.Current.GetInstance on two different threads.

What was clear was that your exception's details were getting lost because you were re-throwing the exception with throw e, which resets its stack trace, rather than letting it propagate. For more information on best practices in this situation, see this answer.
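
As a minimal sketch of the difference (the class and method names below are hypothetical, not taken from the question): `throw e;` restarts the stack trace at the rethrow site, while a bare `throw;` keeps the frame where the exception was originally raised.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RethrowDemo
{
    // Hypothetical stand-in for a call that fails inside BeginExecute.
    [MethodImpl(MethodImplOptions.NoInlining)]
    static void Fail()
    {
        throw new InvalidOperationException("boom");
    }

    // "throw e;" resets the stack trace, so the Fail() frame is lost.
    public static string TraceAfterThrowE()
    {
        try
        {
            try { Fail(); }
            catch (Exception e) { throw e; }
        }
        catch (Exception outer) { return outer.StackTrace; }
        return string.Empty; // not reached at run time; required by the compiler
    }

    // "throw;" preserves the original stack trace, including Fail().
    public static string TraceAfterBareThrow()
    {
        try
        {
            try { Fail(); }
            catch (Exception) { throw; }
        }
        catch (Exception outer) { return outer.StackTrace; }
        return string.Empty; // not reached at run time; required by the compiler
    }

    public static void Main()
    {
        Console.WriteLine("throw e keeps Fail(): " + TraceAfterThrowE().Contains("Fail()"));
        Console.WriteLine("throw   keeps Fail(): " + TraceAfterBareThrow().Contains("Fail()"));
    }
}
```

If the catch block exists only to dispose resources, a finally block avoids the rethrow altogether.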

Re-examining the call stack, it still appeared to be a multithreading bug; however, the problem now appeared to be that multiple threads were attempting to complete execution of two different Tasks.

Message: The operation completed.
StackTrace: at System.Activities.AsyncOperationContext.ShouldComplete()
(snip)

Something somewhere is attempting to complete execution of the Task but it's already complete. As in, one thread beat another completing the asynchronous operation.

At this point, it was not possible to tell what the actual problem was, or where it was happening, without the full stack trace of the exception. The best way to get this information is to catch the exception at the first chance within your code and call ToString() on it, or use the "copy to clipboard" link on the exception helper dialog (which does the same thing and copies the result to the clipboard). This is important because you get the following information:

  1. Exception type
  2. Exception message
  3. Stack trace

NOT ONLY for the exception you caught, but for every .InnerException wrapped by this exception! Oftentimes that's where your real problem is hidden.
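
A small sketch of what that looks like, using made-up exception messages and a hypothetical class name: the string returned by ToString() on the outer exception also carries the type, message, and stack trace of the wrapped inner exception.

```csharp
using System;

public static class ExceptionDump
{
    // Wraps one exception in another and returns ToString() of the outer one.
    public static string Describe()
    {
        try
        {
            try
            {
                throw new InvalidOperationException("inner failure");
            }
            catch (Exception inner)
            {
                // The original exception becomes the InnerException.
                throw new ApplicationException("outer failure", inner);
            }
        }
        catch (Exception e)
        {
            // Includes both exceptions: type, message, and stack trace.
            return e.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

Logging only e.Message in the same situation would record "outer failure" and nothing about the inner exception.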

And, in this case, when you did that you were able to identify where your code was experiencing reentrancy issues.
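
The event log entry in the question says that the Task's exception was never observed. In .NET 4.0 such an exception is rethrown on the finalizer thread and takes the process down. The following is only a sketch of one way to observe it, not the asker's AsyncFactory: ObserveFaults is a hypothetical helper, while ContinueWith, TaskContinuationOptions.OnlyOnFaulted, and TaskScheduler.UnobservedTaskException are standard Task Parallel Library members.

```csharp
using System;
using System.Threading.Tasks;

public static class ObservedFaults
{
    // Hypothetical helper: the continuation reads t.Exception, which marks
    // the fault as observed so the finalizer has nothing left to rethrow.
    public static Task<T> ObserveFaults<T>(Task<T> task, Action<AggregateException> log)
    {
        task.ContinueWith(
            t => log(t.Exception),
            TaskContinuationOptions.OnlyOnFaulted);
        return task;
    }

    public static void Main()
    {
        // Last-resort safety net for any task that nobody observed.
        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            Console.Error.WriteLine(e.Exception);
            e.SetObserved();
        };

        // Stand-in for a database call that fails on a worker thread.
        Task<int> failing = Task.Factory.StartNew<int>(() =>
        {
            throw new InvalidOperationException("query failed");
        });

        ObserveFaults(failing, ex => Console.Error.WriteLine(ex.InnerException.Message));

        try { failing.Wait(); }
        catch (AggregateException) { /* Wait() also observes the exception */ }
    }
}
```

Observing the exception keeps the process alive, but the workflow still has to decide what to do with the failed operation.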

Community
  • Thanks Will. I wasn't able to paste the entire error. I've done both the ToString and the throw, and I have the errors. You're right about another task trying to complete something that is already complete. Finding this out has become a nightmare. I'm using the AsyncNativeActivity from CodePlex and inheriting from it to do async operations against the DB. A lot of the errors seem to come from the class that inherits from AsyncNativeActivity. My problem is that I can't paste more than a certain number of characters here, but I'll try your advice again and comment here ASAP. – KaSh Sep 03 '14 at 07:50
  • @kavya you can edit your question to replace the exception you currently have with the entire thing. –  Sep 03 '14 at 13:53
  • Super. I've done that. There are multiple places where I get the error; one of them is what I changed above. ToBegin has a ContinueWith, so to my understanding it shouldn't have given the error I just edited in? – KaSh Sep 03 '14 at 14:28
  • It looks like everywhere we have used AsyncFactory.Action followed by AsyncFactory.ToBegin has the problem. – KaSh Sep 03 '14 at 15:41
  • @Kavya the exception message makes it appear that it's the operation being performed against the database that is the problem. You should use Profiler to watch execution against the database and see what queries are erroring, then troubleshoot them outside of the workflow. –  Sep 04 '14 at 12:43
  • Yes @Will, that's correct. But the point is, if there's an error while executing a DB command, it should kill that workflow instance, right? Why the app pool? The entire app pool is torn down because there's an unhandled exception. This is our key problem: handling this so that the app pool is not torn down. Any suggestions? – KaSh Sep 04 '14 at 14:10
  • @Kavya Because the exception isn't caught. If an exception isn't caught, the entire application is exited. That's standard behavior, intentionally that way, to ensure execution doesn't continue with the application in an invalid state. The only way to prevent this is to find somewhere in code where you can capture that exception and either dump it or log it and recover if you know how to do so. –  Sep 04 '14 at 15:38
  • You're absolutely right, but I haven't been able to figure out what exactly was causing the unhandled exception. My colleague helped with this, and I have some more data. At 20:01:35, an InvalidOperationException is thrown (this was caught when executing the callback). At 20:01:37, tearing down of the app pool begins. Now the interesting part: at 20:01:33 (5s prior to exception), the activity that caused this exception is already closed (AppFabric logging). So why is EndExecute called before the callback? I hope I'm not being confusing. It was taking time to reproduce and log the error, hence I couldn't respond sooner. – KaSh Sep 08 '14 at 13:30
  • @Kavya huh... That is really weird. Looks like the issue is that your callbacks are liable to execute after execution has completed, which results in the exception. Can you rip out the async stuff and do your database work synchronously? It reminds me of the famous situ where people playing with async stuff wonder why it doesn't execute before the console app closes. If you can't go synchronous you may have to block the execution thread with a Mutex or similar until the callback completes... –  Sep 08 '14 at 17:10
  • I won't be able to go synchronous, as these are really, really long-running workflows. But I now have two options. 1. In the catch block of the callback (for the InvalidOperationException), just terminate that particular workflow. That way, because the exception is handled, the app pool won't tear down (I think). 2. I came across another article that says to set MARS = false in the connection string. We don't need MARS. Although theoretically this doesn't make sense, ever since I set it to false I haven't been able to replicate the behavior. The last 5 workflows have thrown the error in BeginExecuteNonQuery. Returns rows affected. – KaSh Sep 08 '14 at 21:20
  • Although I'm a bit obsessed with finding out why the callback is called after EndExecute. It just doesn't make sense. – KaSh Sep 08 '14 at 21:22
  • I also got rid of EntLib and am now just using SqlCommand. Still not there. – KaSh Sep 08 '14 at 21:23

@Will, it looks like MARS = false in the connection string has resolved it; I haven't been able to replicate the problem since. The theory is that a query returned a result and EndExecute was called, but perhaps the query then returned another result set, which is why the callback was invoked even though EndExecute had already been called. But then again, the failure was sporadic; if this theory is true, my understanding is that it should have failed every time. So far I haven't been able to crash it at all. There were also a few procedures which had rowcount on. I'm grateful you took the time to comment and share your theory on this. I learnt a lot.
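
For reference, a minimal sketch of switching MARS off in code, assuming the SQL Server provider in System.Data.SqlClient. The class name and the connection string are made-up placeholders. In the connection string itself the keyword is MultipleActiveResultSets.

```csharp
using System;
using System.Data.SqlClient;

public static class MarsSetting
{
    // Returns a copy of the connection string with MARS switched off.
    public static string DisableMars(string connectionString)
    {
        var builder = new SqlConnectionStringBuilder(connectionString)
        {
            MultipleActiveResultSets = false
        };
        return builder.ConnectionString;
    }

    public static void Main()
    {
        // Placeholder connection string, for illustration only.
        string original =
            "Data Source=.;Initial Catalog=Demo;Integrated Security=True;MultipleActiveResultSets=True";
        Console.WriteLine(DisableMars(original));
    }
}
```

The same change can be made by editing the connection string in the configuration file directly.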

KaSh