
I'm writing a profiler that queries a timer whenever a function enters or exits, so the timer may be queried thousands of times per second.

Initially I used QueryPerformanceCounter. Although it is high resolution, it turned out to be quite slow. As in the question "What happens when QueryPerformanceCounter is called?", I also get a noticeable slowdown when I use QPC in the profiler, though probably not as bad as the 1-2 ms figure mentioned there. If I replace it with GetTickCount I don't notice any slowdown, but that function is too inaccurate for profiling.

The question mentioned above brings up affinity masks. I tried SetProcessAffinityMask(GetCurrentProcess(), 1) to bind the process to one CPU, but it doesn't improve the performance at all.

I don't know whether it matters or not, but so far I tested it on Windows that runs in VirtualBox on a Linux host. Could it be the problem?
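
A rough way to put a number on the slowdown described above is to time a large batch of back-to-back timer calls. The following is only a minimal sketch, not the profiler's actual code: the names `timer_fn`, `clock_ticks`, `qpc_ticks`, `tick_count`, `ns_per_call` and `report_timer_costs` are hypothetical, and it assumes that the standard `clock()` function is good enough as the reference clock for a long batch of calls.

```c
#include <stdio.h>
#include <time.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Every timer under test is wrapped into this common shape. */
typedef unsigned long long (*timer_fn)(void);

/* Portable baseline so the sketch also builds outside Windows. */
static unsigned long long clock_ticks(void)
{
    return (unsigned long long)clock();
}

#ifdef _WIN32
static unsigned long long qpc_ticks(void)
{
    LARGE_INTEGER t;
    QueryPerformanceCounter(&t);
    return (unsigned long long)t.QuadPart;
}

static unsigned long long tick_count(void)
{
    return (unsigned long long)GetTickCount();
}
#endif

/* Average cost of one call to fn in nanoseconds, measured with clock(). */
static double ns_per_call(timer_fn fn, unsigned long calls)
{
    volatile unsigned long long sink = 0; /* keeps the calls from being optimized away */
    unsigned long i;
    clock_t start, end;

    if (calls == 0)
        return 0.0;

    start = clock();
    for (i = 0; i < calls; ++i)
        sink += fn();
    end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC * 1e9 / (double)calls;
}

/* Prints one line per available timer; meant to be called from main(). */
static void report_timer_costs(unsigned long calls)
{
    printf("clock():                 %.1f ns/call\n", ns_per_call(clock_ticks, calls));
#ifdef _WIN32
    printf("QueryPerformanceCounter: %.1f ns/call\n", ns_per_call(qpc_ticks, calls));
    printf("GetTickCount:            %.1f ns/call\n", ns_per_call(tick_count, calls));
#endif
}
```

Calling `report_timer_costs(1000000)` on the real machine and again inside the VM would show whether the virtualized environment is what makes QueryPerformanceCounter expensive.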

Calmarius
  • 1
    [This article](http://msdn.microsoft.com/en-us/library/windows/desktop/dn553408.aspx) looks interesting – Lucas Trzesniewski Dec 31 '14 at 01:38
  • Oh, and yes, virtualization can be a problem for code that uses low level features such as... timers that use CPU registers for instance ;-) – Lucas Trzesniewski Dec 31 '14 at 01:43
  • 2
    The RDTSC instruction is usable again on modern processors. It is also what QueryThreadCycleTime() uses. Keep in mind that converting "cycles" to time isn't straightforward. – Hans Passant Dec 31 '14 at 12:08
  • @HansPassant That QueryThreadCycleTime is available from Vista and higher. My VM is still XP for performance reasons, it's not available here. So I cannot test it. I ended up invoking the rdtsc instruction directly, it worked. I don't need to convert cycles to time. It's enough if I can find the function that burns the most cycles. – Calmarius Dec 31 '14 at 15:55
  • 1
    Targeting a VM for a profiler is a bit like building a submarine in your basement. You'll probably get it done, but there isn't any way to get it up the stairs to make it useful. – Hans Passant Dec 31 '14 at 16:30

2 Answers

0

The highest-resolution timers I'm aware of on Windows are the multimedia timers available via winmm.dll.

Here's a class I've had lying around for my own perf testing needs - give 'er a whirl:

using System;
using System.Runtime.InteropServices;

public class HighResTimer
{
    // Mirrors the native TIMECALLBACK; dw1/dw2 are DWORD_PTR, so they must be pointer-sized.
    private delegate void TimerEventHandler(int id, int msg, IntPtr user, IntPtr dw1, IntPtr dw2);

    private const int TIME_PERIODIC = 1;
    private const int EVENT_TYPE = TIME_PERIODIC;

    [DllImport("winmm.dll")]
    private static extern int timeSetEvent(int delay, int resolution, TimerEventHandler handler, IntPtr user, int eventType);
    [DllImport("winmm.dll")]
    private static extern int timeKillEvent(int id);
    [DllImport("winmm.dll")]
    private static extern int timeBeginPeriod(int msec);
    [DllImport("winmm.dll")]
    private static extern int timeEndPeriod(int msec);

    private int _timerId;
    // Kept in a field so the delegate is not garbage-collected while the native timer still calls it.
    private readonly TimerEventHandler _handler;

    public event EventHandler OnTick;

    public HighResTimer(int delayInMs)
    {
        // Request 1 ms system timer resolution; undone in Stop().
        timeBeginPeriod(1);
        _handler = new TimerEventHandler(timerElapsed);
        _timerId = timeSetEvent(delayInMs, 0, _handler, IntPtr.Zero, EVENT_TYPE);
    }

    public void Stop()
    {
        timeKillEvent(_timerId);
        timeEndPeriod(1);
        _timerId = 0;
    }

    private void timerElapsed(int id, int msg, IntPtr user, IntPtr dw1, IntPtr dw2)
    {
        // Copy to a local first: the event is null when nobody has subscribed.
        EventHandler handler = OnTick;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }
}
JerKimball
  • 3
    Just keep in mind that a multimedia timer runs in its own thread, and there are [limitations on what you are allowed to use in the timer's callback](http://msdn.microsoft.com/en-us/library/dd757631.aspx). That being said, `timeSetEvent()` has been replaced with [`CreateTimerQueueTimer()`](http://msdn.microsoft.com/en-us/library/ms682485.aspx). – Remy Lebeau Dec 31 '14 at 01:38
  • 1
    Thread pool timers, timer queue timers, and the Waitable timer all have the same millisecond resolution as multimedia timers. As the previous comment stated, you should use one of those rather than the multimedia timers, which have been considered obsolete for several years. More importantly, your timer gives at best 1 ms resolution. `QueryPerformanceCounter` can provide close to *micro*second resolution. It's not a periodic timer, of course, but the OP is using it for counting elapsed time rather than for executing code on a periodic basis. – Jim Mischel Dec 31 '14 at 15:09
  • Knew about the deprecated status of `winmm`, did not know about the alternatives - cheers @JimMischel @RemyLebeau – JerKimball Dec 31 '14 at 16:30
0

I ended up using the RDTSC instruction directly, so I wrote a wrapper for it using GCC inline assembly:

static inline unsigned long long rdtsc(void)
{
    unsigned hi, lo;
    asm volatile ("rdtsc" : "=a"(lo), "=d"(hi));
    return ( (unsigned long long)lo)|( ((unsigned long long)hi)<<32 );
}

There is no noticeable slowdown, and it apparently has a much higher resolution than QueryPerformanceCounter.

The code is based on this answer.
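
Since only raw cycle counts are needed to find the function that burns the most cycles, the counter can feed a small per-function table directly. The following is a minimal sketch under stated assumptions, not the actual profiler: `read_cycles`, `func_stats`, `prof_enter`, `prof_exit`, `prof_hottest` and `MAX_FUNCS` are hypothetical names, it is single-threaded, it does not handle recursion, it counts inclusive cycles (callees included), and it assumes the counter stays consistent for the thread being measured. On non-x86 targets it falls back to `clock()` purely so that it still compiles.

```c
#include <time.h>

/* Cycle counter: RDTSC on x86 with GCC-style inline asm. Elsewhere fall back
   to clock() only so the sketch still builds (that is not a cycle count). */
#if defined(__x86_64__) || defined(__i386__)
static inline unsigned long long read_cycles(void)
{
    unsigned hi, lo;
    __asm__ volatile ("rdtsc" : "=a"(lo), "=d"(hi));
    return ((unsigned long long)lo) | (((unsigned long long)hi) << 32);
}
#else
static inline unsigned long long read_cycles(void)
{
    return (unsigned long long)clock();
}
#endif

#define MAX_FUNCS 64 /* hypothetical fixed table size */

struct func_stats {
    const char *name;               /* label supplied at entry */
    unsigned long long total_cycles;/* inclusive cycles over all calls */
    unsigned long long entered_at;  /* counter value at the last entry */
    unsigned long calls;            /* number of completed calls */
};

static struct func_stats stats[MAX_FUNCS];

/* Record the counter when the function with slot `id` is entered. */
static void prof_enter(int id, const char *name)
{
    stats[id].name = name;
    stats[id].entered_at = read_cycles();
}

/* Add the cycles spent since the matching prof_enter() call. */
static void prof_exit(int id)
{
    stats[id].total_cycles += read_cycles() - stats[id].entered_at;
    stats[id].calls++;
}

/* Index of the slot with the most cycles among the first `count` slots,
   or -1 if `count` is not positive. */
static int prof_hottest(int count)
{
    int i, best = -1;
    for (i = 0; i < count && i < MAX_FUNCS; ++i)
        if (best < 0 || stats[i].total_cycles > stats[best].total_cycles)
            best = i;
    return best;
}
```

Instrumented code would call `prof_enter(id, "name")` at function entry and `prof_exit(id)` before each return, then read `stats[prof_hottest(n)].name` at the end of the run.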

Calmarius
  • I don't think that will work very well on a system with multiple CPUs. Your thread may be scheduled to a different core or CPU, cores or CPUs may be powered down to save power, etc. – Willem Hengeveld Aug 27 '22 at 08:45