I'm improving an application (Win64, C++) by making it more asynchronous. I'm using the Concurrency Runtime and it's worked great for me so far.

The application basically executes a number of 'jobs' that transform data. To track what each job does, certain subsystems are instrumented with code that records the operations the job performs. Previously this used a single global variable representing the currently executing job, so that tracking information could be registered without passing context information all the way down the call chain. Each job may in turn use ConcRT to parallelize the job itself. This all works quite well.

Now though, I am refactoring the application so that we can execute the top-level jobs in parallel. Each job is executed as a ConcRT task, and this works well for all jobs except those which need tracking.

What I basically need is a way to associate some context information with a Task, and have that flow to any other tasks spawned by that task. Basically, I need "Task Local" variables.

With ConcRT we can't simply use thread locals to store the context information, since the job may spawn other jobs using ConcRT and these will execute on any number of threads.

My current approach is to create a number of Scheduler instances at startup and spawn each job in a scheduler dedicated to that job. I can then call Concurrency::CurrentScheduler::Id() to retrieve an integer ID, which I use as a key to look up the context. This works, but single-stepping through Concurrency::CurrentScheduler::Id() in assembly makes me wince: it performs multiple virtual function calls and safety checks, which add quite a lot of overhead. That is a problem because in some cases this lookup needs to be done at an extremely high rate.

So - is there some better way to accomplish this? I would have loved a first-class TaskLocal/userdata mechanism that let me associate a single context pointer with the current Scheduler/ScheduleGroup/Task and retrieve it with very little overhead.

A hook which is called whenever a ConcRT thread grabs a new task would be my ideal, as I could then retrieve the Scheduler/ScheduleGroup ID and store it in a thread local for minimal access overhead. Alas, I can't see any way to register such a hook and it doesn't seem to be possible to implement custom Scheduler classes for PPL/agents (see this article).

1 Answer

Is there some reason that you can't pass some sort of context object to these tasks that gives them an interface for updating their status? Because from where I'm standing, it sounds like you have a really bad problem with Singletons (aka global variables), one that should be solved with dependency injection.

If dependency injection isn't an option, there is another strategy for dealing with Singletons. That strategy is basically allowing the Singleton to be a 'stack'. You can 'push' a new value to the Singleton, and then everybody who accesses it gets this new value. And then you can 'pop' the value back off and the value before pushing is restored. This does not have to be directly modeled with an actual stack, which is why I put the words 'push', 'pop' and 'stack' in quotes.
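A minimal sketch of the 'stack' idea, assuming standard C++ `thread_local` storage and a scope guard rather than an actual stack container. The names `JobContext`, `ScopedContext`, `current_context` and `current_job_id` are hypothetical and only serve the illustration; none of them come from ConcRT.

```cpp
#include <cassert>

// Hypothetical context type; the real tracking data would live here.
struct JobContext {
    int job_id;
};

// The "top of the stack": the context instrumented code sees on this thread.
thread_local const JobContext* g_current_context = nullptr;

// Cheap accessor intended for instrumented getters.
const JobContext* current_context() { return g_current_context; }

// Scope guard: the constructor "pushes" a new value, the destructor "pops" it
// by restoring whatever value was current before the push.
class ScopedContext {
public:
    explicit ScopedContext(const JobContext* ctx)
        : previous_(g_current_context) {
        g_current_context = ctx;
    }
    ~ScopedContext() { g_current_context = previous_; }

    ScopedContext(const ScopedContext&) = delete;
    ScopedContext& operator=(const ScopedContext&) = delete;

private:
    const JobContext* previous_;  // value to restore on "pop"
};

// Returns the job id visible to instrumented code, or -1 outside any job.
int current_job_id() {
    const JobContext* ctx = current_context();
    return ctx ? ctx->job_id : -1;
}
```

Each guard remembers the previous value itself, so nested scopes form the 'stack' implicitly and an exception unwinding through a scope still restores the old context.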

You can adapt this model to your circumstance by having a thread local Singleton that is initialized with the value (not the whole stack of values, just the top value) of the parent thread's version of this variable. Then, if a new context is required for this thread and its children you can push a new value onto the thread-local Singleton.
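The inheritance step could look roughly like the following. This is a sketch that uses `std::thread` in place of ConcRT tasks; `TrackingContext`, `t_tracking`, `spawn_with_inherited_context` and `job_id_seen_by_child` are hypothetical names introduced for illustration. The parent's top value is read on the parent thread and installed as the child's initial thread-local value before the child's work runs.

```cpp
#include <cassert>
#include <thread>
#include <utility>

// Hypothetical context type standing in for the job-tracking state.
struct TrackingContext {
    int job_id;
};

// Per-thread "top value" of the stackable singleton.
thread_local const TrackingContext* t_tracking = nullptr;

// Starts a thread whose thread-local context begins as the parent's top value.
template <typename Fn>
std::thread spawn_with_inherited_context(Fn fn) {
    const TrackingContext* inherited = t_tracking;  // read on the parent thread
    return std::thread([inherited, fn = std::move(fn)]() mutable {
        t_tracking = inherited;  // seed the child's thread-local copy
        fn();
    });
}

// Illustration only: reports which job id a freshly spawned child observes.
int job_id_seen_by_child() {
    int seen = -1;
    std::thread child = spawn_with_inherited_context([&seen] {
        seen = t_tracking ? t_tracking->job_id : -1;
    });
    child.join();  // join before reading 'seen'
    return seen;
}
```

With a task runtime that reuses worker threads, a wrapper like this would presumably also need to restore the worker's previous value when the task finishes, since the thread outlives the task.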

Omnifarious
  • Yes that's the obvious solution, and that's basically what we do for most of our code. For a subset of code however, it's not an option because it would lead to excessive verbosity. Basically, we have code-generated data object classes with setters/getters. Getters can optionally be instrumented to register that they have been used, to track data dependencies. It would be very awkward to have to pass context objects into all these getters and funnel the context down all call chains. Using the Scheduler ID to basically implement scheduler-local variables works though, with good performance. – Stefan Boberg Oct 17 '12 at 21:21
  • @StefanBoberg: There is an alternate model for a Singleton that's not really a Singleton. It allows you to 'stack' values. You can push a new value on to the stack and that becomes the value of the Singleton until the value is popped and the old value restored. You could adapt this to your environment by having a thread local variable that was initialized with the value of the parent thread's instance of this variable. – Omnifarious Oct 18 '12 at 20:52