I'm writing a profiler that queries a timer whenever a function enters or exits, so the timer may be queried thousands of times a second.
Initially I used QueryPerformanceCounter. Although it is high resolution, it turned out to be quite slow. The question "What happens when QueryPerformanceCounter is called?" reports a similar slowdown, and I also see a noticeable one when I use QPC in the profiler, though probably not as bad as that 1-2 ms figure. If I replace it with GetTickCount I don't notice any slowdown, but that function is too inaccurate for profiling.
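To put rough numbers on the difference, something like the following micro-benchmark could be used. It is only a sketch: the helper names averageCallCostNs, readHighResolution and readCoarse are made up for illustration, and the non-Windows branches are stand-ins so the code compiles elsewhere; they do not measure QPC or GetTickCount.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: average cost in nanoseconds of one call to readTimer.
template <typename F>
double averageCallCostNs(F readTimer, int iterations) {
    assert(iterations > 0);
    volatile std::uint64_t sink = 0;  // keeps the calls from being optimized away
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = sink + readTimer();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / iterations;
}

// High-resolution timer: QueryPerformanceCounter on Windows.
std::uint64_t readHighResolution() {
#ifdef _WIN32
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    return static_cast<std::uint64_t>(t.QuadPart);
#else
    // Assumption: stand-in for platforms without QPC.
    return static_cast<std::uint64_t>(
        std::chrono::steady_clock::now().time_since_epoch().count());
#endif
}

// Coarse timer: GetTickCount on Windows (millisecond units).
std::uint64_t readCoarse() {
#ifdef _WIN32
    return GetTickCount();
#else
    // Assumption: stand-in that truncates a steady clock to milliseconds.
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::milliseconds>(
            std::chrono::steady_clock::now().time_since_epoch()).count());
#endif
}
```

Calling averageCallCostNs(readHighResolution, 1000000) and averageCallCostNs(readCoarse, 1000000) and printing both results should show how much each timer call costs on a given machine, including inside the VM.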
That question mentions affinity masks. I tried calling SetProcessAffinityMask(GetCurrentProcess(), 1) to bind the process to a single core, but it doesn't improve the performance at all.
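For reference, the affinity call can be wrapped so that its return value is checked, which at least rules out the call silently failing. This is a minimal sketch: pinProcessToFirstCpu is a made-up helper name, and the non-Windows branch is only a placeholder that reports failure.

```cpp
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: binds the current process to logical CPU 0.
// Returns true only if the affinity mask was actually applied.
bool pinProcessToFirstCpu() {
#ifdef _WIN32
    // Mask bit 0 selects the first logical processor.
    return SetProcessAffinityMask(GetCurrentProcess(), 1) != 0;
#else
    // Assumption: no equivalent is attempted on other platforms in this sketch.
    return false;
#endif
}
```

If the function returns false on Windows, GetLastError() gives the reason the mask was rejected.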
I don't know whether it matters, but so far I have only tested this on Windows running in VirtualBox on a Linux host. Could that be the problem?