In our software we extensively use MDC to track things like session IDs and user names for web requests. This works fine while running in the original thread.
However, there's a lot of things that need to be processed in the background. For that we use the java.concurrent.ThreadPoolExecutor
and java.util.Timer
classes along with some self-rolled async execution services. All these services manage their own thread pool.
This is what Logback's manual has to say about using MDC in such an environment:
A copy of the mapped diagnostic context can not always be inherited by worker threads from the initiating thread. This is the case when java.util.concurrent.Executors is used for thread management. For instance, newCachedThreadPool method creates a ThreadPoolExecutor and like other thread pooling code, it has intricate thread creation logic.
In such cases, it is recommended that MDC.getCopyOfContextMap() is invoked on the original (master) thread before submitting a task to the executor. When the task runs, as its first action, it should invoke MDC.setContextMapValues() to associate the stored copy of the original MDC values with the new Executor managed thread.
This would be fine, but it is a very easy to forget adding those calls, and there is no easy way to recognize the problem until it is too late. The only sign with Log4j is that you get missing MDC info in the logs, and with Logback you get stale MDC info (since the thread in the tread pool inherits its MDC from the first task that was ran on it). Both are serious problems in a production system.
I don't see our situation special in any way, yet I could not find much about this problem on the web. Apparently, this is not something that many people bump up against, so there must be a way to avoid it. What are we doing wrong here?