0

I'm trying to profile lag in my framework; since I use MinGW, gprof doesn't work with DLLs (for me it even gave me garbage information like an initialization function running thousands of times instead of just once), and gperftools's profiler isn't supported on Windows (yet), I tried rolling my own profiling code inspired by Cygwin's:

// 10 may 2015
#include "uipriv_windows.h"

static FILE *fprof = NULL;

static DWORD WINAPI profilerThread(LPVOID th)
{
    HANDLE thread = (HANDLE) th;
    LARGE_INTEGER counter;
    CONTEXT ctxt;

    // TODO check for errors
    if (SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL) == 0)
        complain("error setting thread priority in profilerThread()");
    for (;;) {
        if (SuspendThread(thread) == (DWORD) (-1))
            complain("error suspending thread in profilerThread()");
        QueryPerformanceCounter(&counter);
        ZeroMemory(&ctxt, sizeof (CONTEXT));
        ctxt.ContextFlags = CONTEXT_CONTROL;
        if (GetThreadContext(thread, &ctxt) == 0)
            complain("error getting thread context in profilerThread()");
        fprintf(fprof, "%I64X %I64d\n",
            (DWORD64) (ctxt.Eip),
            counter.QuadPart);
        fflush(fprof);
        if (ResumeThread(thread) == (DWORD) (-1))
            complain("error resuming thread in profilerThread()");
        Sleep(100);
    }
    return 0;
}

void initprofiler(HANDLE thread)
{
    fprof = fopen("profiler.out", "w");
    if (fprof == NULL) {
        fprintf(stderr, "error opening profiler output file\n");
        abort();
    }
    if (CreateThread(NULL, 0, profilerThread, thread, 0, NULL) == NULL)
        complain("error creating profiler thread");
}

However, the profiles this returns are useless:

F77B0C03 3571425665428
F77B0C03 3571426671982
F77B0C03 3571427677119
F77B0C03 3571428683227
F77B0C03 3571429689442
F77B0C03 3571430696476
F77B0C03 3571431702590
F77B0C03 3571432708622

This particular value is on wine, which redirects to __kernel_vsyscall+0x3. Real Windows has something like 7C90E514 instead, which redirects to ntdll!KeFastSystemCallRet.

I'm guessing (given a wine stack trace) this is because it's stuck in GetMessage().

If I change the sleep duration from 100 to 1, I get more meaningful values occasionally.

Is there something I'm missing? Is there a better profiling option, or do I have this fundamentally wrong in some way?

Thanks.

Community
  • 1
  • 1
andlabs
  • 11,290
  • 1
  • 31
  • 52
  • Don't. Don't. Don't. Do [*this*](http://stackoverflow.com/a/378024/23771) instead. First do it with -O0 so you can see what's going on. When you've cleaned out the crud, then set -O3. – Mike Dunlavey May 11 '15 at 12:31
  • Thanks for that suggestion. Unfortunately the lag happens in places where I'm interacting with the GUI: best example is dragging the window borders. I'm not sure how I can stop the program repeatedly while doing this, if that's even possible. The lag is only two or three seconds on top of that. If you have any suggestions, that would be great. All of the profilers listed on that page are Linux-only. If there's a good profiler that works on Windows and isn't confused by DLLs, I'd like to know. I will say I'm not compiling with any -O options. – andlabs May 11 '15 at 13:00
  • 1
    You are just discovering a simple fact, most GUI programs spend 99.9% of their lifetime waiting for Windows to tell it that something interesting happened. You can't effectively profile them with sampling, you need an "instrumentation" mode that detects when a function is entered and returns. Much harder to implement. – Hans Passant May 11 '15 at 13:33
  • @andlabs: What I do is run the UI under the IDE debugger, with the UI in partial-screen mode so I can quickly reach the IDE's pause button. What I need to do is pause it while it's in the process of being subjectively slow. In your case I suspect the problem is in DoPaint. I would go in that routine and give it an outer loop so that instead of painting just once, it would do it 100 times. Then my pause would be sure to hit it. Do this several times, find the problem, fix it, and then remove the loop. Good luck! – Mike Dunlavey May 11 '15 at 14:02
  • @andlabs: Since you're running MinGW, I use plain-old GDB. To pause, I hit Ctl-C in the debug window to interrupt it, say `thread 1` to get to the main thread, and then `bt` to show the stack. It's a little clumsy, but it finds the problem in just a few samples. – Mike Dunlavey May 11 '15 at 14:42
  • You could make your life easier, and use Visual Studio instead. In particular, the [Concurrency Visualizer](https://msdn.microsoft.com/en-us/library/dd537632.aspx) is extremely helpful in investigating lags. It makes it almost impossible to miss the root cause. – IInspectable May 11 '15 at 15:22
  • 1
    All right, thanks everyone. I've started performance tuning with instrumented call graphs for now. As for Visual Studio, that is on the table, but not at present. Eventually I want to have this library built with both toolchains, so it will be used. – andlabs May 11 '15 at 17:26

0 Answers0