I have a number of worker threads that process tasks, each an instance of class Task. I'm using C++ on x86_64 Windows/Mac/Linux. While working on each task, I can update a global Task* activeTasks[] array so that the application knows which task each worker is currently processing.
activeTasks[threadId] = &task;
task.work();
activeTasks[threadId] = NULL;
I would like to write a simple in-application profiler so that the user can see how many microseconds have been spent on each task.
An additional complication is that tasks may call sleep() inside their work() function. The profiler should only sample a task when the thread is active, not sleeping or suspended by the scheduler.
How could a profiler like this be implemented?
It seems this is exactly how profilers like perf work, except they inspect the current call stack rather than an activeTasks array.
Attempts
A naive idea is to launch a separate profiler thread that checks activeTasks[threadId] every few microseconds and increments a counter for the task found there if the worker thread is in a running state. On Windows this can be checked with thread.ExecutionState == 3, and there may be some equivalent with pthreads.
But the issue is that on a single-core machine the profiler thread and a worker thread never run at the same time, so the profiler would always see the worker as "suspended".
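To make the naive sampling idea concrete, here is a minimal sketch of it. Everything beyond Task, work() and activeTasks is an assumption made for illustration: kMaxWorkers, the samples counter, runTask(), samplerLoop(), sampleOneTask(), and in particular workerIsRunning(), which is a hypothetical placeholder for the platform-specific running-state check and simply returns true here. As written, the sketch therefore also counts time spent sleeping, and it inherits the single-core problem described above.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for the real Task class.
struct Task {
    std::atomic<std::uint64_t> samples{0};  // times the sampler saw this task active

    void work() {
        // Placeholder workload: spin for roughly 50 ms.
        auto end = std::chrono::steady_clock::now() + std::chrono::milliseconds(50);
        while (std::chrono::steady_clock::now() < end) {}
    }
};

constexpr int kMaxWorkers = 4;  // assumption: fixed upper bound on worker threads

// Atomic slots so the profiler thread can read them without a data race.
std::array<std::atomic<Task*>, kMaxWorkers> activeTasks{};

// Hypothetical hook: a real version would ask the OS whether the worker is
// currently running. That check is platform-specific and is not shown here.
bool workerIsRunning(int /*threadId*/) { return true; }

// Worker side: same bookkeeping as in the question, with atomic stores.
void runTask(int threadId, Task& task) {
    assert(threadId >= 0 && threadId < kMaxWorkers);
    activeTasks[threadId].store(&task, std::memory_order_release);
    task.work();
    activeTasks[threadId].store(nullptr, std::memory_order_release);
}

// Profiler side: wake up once per period and credit one sample to each task
// that is currently published. Assumes every Task outlives the sampler.
void samplerLoop(const std::atomic<bool>& stop, std::chrono::microseconds period) {
    while (!stop.load(std::memory_order_relaxed)) {
        for (int id = 0; id < kMaxWorkers; ++id) {
            Task* t = activeTasks[id].load(std::memory_order_acquire);
            if (t != nullptr && workerIsRunning(id)) {
                t->samples.fetch_add(1, std::memory_order_relaxed);
            }
        }
        std::this_thread::sleep_for(period);
    }
}

// Small driver: profile a single task on worker slot 0 and return its sample count.
std::uint64_t sampleOneTask() {
    Task task;
    std::atomic<bool> stop{false};
    std::thread sampler([&stop] { samplerLoop(stop, std::chrono::microseconds(200)); });
    std::thread worker([&task] { runTask(0, task); });
    worker.join();
    stop.store(true);
    sampler.join();  // sampler is joined before task goes out of scope
    return task.samples.load();
}
```

Multiplying samples by the sampling period gives only a rough estimate of the time per task, since sleep_for() may sleep longer than requested and the achievable period is usually far coarser than a few microseconds.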
Another idea is to trigger some sort of interrupt, but I have no idea how this is done.