3

I am trying to write a simple game loop for MacOS. The event polling solution I found on libraries like GLFW, SDL, and MacOS game ports for DOOM, Quake, Handmade Hero is to use the nextEventMatchingMask API. I am interested in the latency of event polling:

t0 = mach_absolute_time();
NSEvent *event = [NSApp nextEventMatchingMask:NSEventMaskAny
                                    untilDate:nil
                                       inMode:NSDefaultRunLoopMode
                                      dequeue:YES];
t1 = mach_absolute_time();
latency=t1-t0

The experiment I run (source code) consists of opening a Cocoa MacOS window and "randomly" generating mouse and keyboard events. The typical latency numbers I get for the first 1000 events in such experiment looks like this

typical latency

# latency percentiles in milliseconds
:       0%        5%       10%       15%       20%       25%       30%       35%
: 0.033830  0.047655  0.060302  0.071263  0.073450  0.075871  0.079204  0.083330
:      40%       45%       50%       55%       60%       65%       70%       75%
: 0.093038  0.107813  0.134466  0.143095  0.180434  0.334789  0.448111  0.500118
:      80%       85%       90%       95%      100%
: 0.524535  0.583159  0.648065  0.719931 36.567810

Note that the top 25% latencies are more that half a millisecond. Even when there is no event available (zeros on the chart) we can get a latency of more than a millisecond (see zero floating around near the 600th event). To make things worse, this is the typical case I observe when my Macbook has not much going on: just a Terminal and the test program. It gets worse when more applications are running.

I wonder if there is a more efficient way to the get the next (mouse/keyboard) event if available in a MacOS application. Is there a trick I am missing that makes nextEventMatchingMask calls more efficient?

The source code to run this test and generate a plot like the one above can be found here: https://github.com/laurolins/cocoa_poll_events_latency

(update) Following an idea Dad suggested in the comments, I ran the test without any mouse/keyboard movement. Here are the latency percentiles:

# latency without any keyboard/mouse event in milliseconds
:         0%          5%         10%         15%         20%         25%
: 0.04029500  0.07617675  0.07908070  0.08556100  0.10188940  0.12956500
:        30%         35%         40%         45%         50%         55%
: 0.13832200  0.14023150  0.14802140  0.15003550  0.15045250  0.15088950
:        60%         65%         70%         75%         80%         85%
: 0.15125000  0.15162535  0.15190300  0.15235550  0.15313200  0.15972645
:        90%         95%        100%
: 0.16802250  0.21194335 35.93352900

The idea of waiting .15 milliseconds to find out that no event was generated does not feel right!

(update 2) Just timed all SDL_EventPoll and MessagePump calls on gzdoom running on ubuntu and the latency distribution I am getting looks much better:

 #       0%         5%        10%        15%        20%        25%        30%
 :0.00095100 0.00186075 0.00215500 0.00226400 0.00234180 0.00241500 0.00255900
 :       35%        40%        45%        50%        55%        60%        65%
 :0.00289495 0.00652280 0.00712510 0.00749900 0.00770290 0.00807660 0.00888870
 :       70%        75%        80%        85%        90%        95%       100%
 :0.00897200 0.00905000 0.00931060 0.01374225 0.01728800 0.03256290 0.23016000

while on MacOS gzdoom also struggles with the slowness of nextEventMatchingMask (latency numbers on the hundreds of microseconds). Conclusion: it is not only my test code that is slow to get the next keyboard/mouse event on MacOS.

llins
  • 75
  • 4
  • 1
    You might be able to eke out a tiny bit of extra performance by caching the `IMP` for the `nextEventMatchingMask:untilDate:inMode:dequeue:` method and calling that directly, eliminating an `objc_msgSend` call. – Charles Srstka Nov 26 '18 at 23:11
  • 2
    You might want to use a custom run loop mode (just a string that's not likely to be used by the system frameworks) instead of `NSDefaultRunLoopMode`. The latter allows other input sources and timers to fire. And, just to be sure, see if [disabling App Nap](https://stackoverflow.com/questions/50512279/prevent-nstimer-firing-delays-in-background-app/50531545#50531545) makes a difference. – Ken Thomases Nov 27 '18 at 00:09
  • 1
    Tried setting the mask to just the events you care about? – Dad Nov 27 '18 at 00:54
  • Updated the tests [link](https://github.com/laurolins/cocoa_poll_events_latency/blob/master/main.m) to mask events of interest, set a custom run loop mode, and disable AppNap. The latency distribution seem to be at the same ballpark. My expectation might be off here, but I was expecting much smaller latencies. I was thinking in terms of the `PeekMessage` API call on windows ([PeekMessage Profiling by Casey Muratori on Handmade Hero](https://youtu.be/qYGNxUu6tjI?t=4189)). – llins Nov 27 '18 at 18:52
  • Might consider not calling `sendEvent:` for events you'll be handling. It does a bunch of dispatching (though you don't seem to have any windows to dispatch to so not sure if it's an issue). Run this under Instruments in the time profiler to see where time is being spent in more detail. – Dad Nov 28 '18 at 00:19
  • I'm getting system beeps when I hit random keys during your test app runs. Be funny if playing that beep was what was making it so slow… :) – Dad Nov 28 '18 at 00:22
  • 1
    I show 119ms total spent in `nextEventMatchingEventMask:until:Date:inMode:dequeue` when I don't hit that many key strokes but do click and move the mouse. If I do zero events (no mouse moves, clicks, or keyboard activity) I see: – Dad Nov 28 '18 at 00:28
  • > 108.00 ms 100.0% 0 s main (16053) > 104.00 ms 96.2% 0 s Main Thread 0x254514 > 88.00 ms 81.4% 0 s start > 88.00 ms 81.4% 0 s main > 36.00 ms 33.3% 2.00 ms -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] > 18.00 ms 16.6% 0 s +[NSApplication sharedApplication] > 17.00 ms 15.7% 1.00 ms usleep > 10.00 ms 9.2% 0 s -[NSApplication _setPresentationOptions:instance:flags:] – Dad Nov 28 '18 at 00:29
  • @Dad, I liked your idea of testing without generating any keyboard/mouse events. I will update the main text with those numbers. – llins Nov 28 '18 at 03:53
  • @Dad, notice that the latency I am interested is only with respect to the `nextEventMatchingMask` (see the `latency=t1-t0` in the main text above) the handling of the event `sendEvent:` is outside the scope I am measuring. Why what happening inside `nextEventMatchingMask` is taking so long? .15ms even when `nil` is returned... – llins Nov 28 '18 at 16:51
  • Have you profiled where the time is being spent inside `nextEventMatchingMask:...`? That might give some clue to a means to minimize it. – Ken Thomases Jan 08 '19 at 19:18
  • Here is a call stack that shows where nextEventMatchingMask is spending time: [call stack](https://github.com/laurolins/cocoa_poll_events_latency/blob/master/trace_nextEventMatchingMask.txt). I wonder why the call for `SLSGetNextEventRecordInternal` called from `SLEventCreateNextEvent` is so expensive... – llins Jan 10 '19 at 15:15

0 Answers0