The Problem
I'm trying to test a system that communicates over UDP at a predetermined packet rate. I want to drive it from a Python test harness at a configurable rate; sample rates might be 20 packets/sec, 4500 packets/sec, etc.
In some simple tests I've determined that my Windows machine can pass upwards of 150,000 UDP packets per second over localhost, so I can treat that as an upper limit for the sake of the experiment.
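A rough sketch of the kind of throughput check I mean is below; the port (9999), the 64-byte payload, and the one-second window are arbitrary choices, and it assumes a simple receiver is draining 127.0.0.1:9999 on the other end:

import socket, timeit

# Blast UDP packets at localhost for one second and count the sendto() calls.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = b'x' * 64                 # arbitrary payload size
dest = ('127.0.0.1', 9999)          # arbitrary port; run a simple receiver here
count = 0
start = timeit.default_timer()
while timeit.default_timer() - start < 1.0:
    sock.sendto(payload, dest)
    count += 1
print('packets sent in 1 second:', count)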
Let's start with the following skeleton for a rate limiter. The code is inspired mostly by code in this thread.
Approach 1
import time, timeit

class RateLimiter:
    def __init__(self, rate_limit):
        self.min_interval = 1.0 / float(rate_limit)
        self.last_time_called = None

    def execute(self, func, *args, **kwargs):
        if self.last_time_called is not None:
            # Sleep until we should wake up
            while True:
                now = timeit.default_timer()
                elapsed = now - self.last_time_called
                left_to_wait = self.min_interval - elapsed
                if left_to_wait <= 0:
                    break
                time.sleep(left_to_wait)
        self.last_time_called = timeit.default_timer()
        return func(*args, **kwargs)
You can use this helper class like so:
self._limiter = RateLimiter(4500)  # 4500 executions/sec
while True:
    self._limiter.execute(do_stuff, param1, param2)
The call to timeit.default_timer() is a shortcut in Python that gives you the highest-accuracy timer for your platform, with an accuracy of about 1e-6 seconds on both Windows and Linux, which we will need.
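If you want to confirm what that timer resolves to on your own machine, Python can report it directly (this assumes Python 3.3+, where timeit.default_timer is an alias for time.perf_counter):

import time

# timeit.default_timer is time.perf_counter on Python 3.3+; this prints the
# implementation and resolution of the clock behind it.
print(time.get_clock_info('perf_counter'))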
Performance of Approach 1
In this approach, sleep() can buy you time without eating CPU cycles, but it can hurt the accuracy of your delay. This comment shows the differences between Windows and Linux regarding sleep() for periods of less than 10 ms. To summarize that comment: Windows' sleep() only works for values of 1 ms or more (anything less is treated as zero) but generally sleeps for less than the requested time, while in Linux sleep() is more precise but generally sleeps for slightly more than the requested time.
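You can see this behaviour on your own machine by timing short sleeps directly; a quick probe along these lines (the requested durations below are arbitrary) shows the over- or undershoot:

import time, timeit

# Probe how long time.sleep() actually takes for short requests.
for requested in (0.0005, 0.001, 0.005, 0.010):
    samples = []
    for _ in range(20):
        start = timeit.default_timer()
        time.sleep(requested)
        samples.append(timeit.default_timer() - start)
    samples.sort()
    median = samples[len(samples) // 2]
    print('requested %.4f s -> median actual %.4f s' % (requested, median))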
The code above is accurate on my Windows machine, but it is inefficient for faster rates. When I requested a rate of 4500 packets/sec in my tests, I got a median of 4466 packets/sec (0.75% error). However, for rates faster than 1000 Hz the requested sleeps are under 1 ms, so the calls to sleep() return immediately and the RateLimiter burns CPU cycles until the wait time has elapsed. Unfortunately we have no other choice, since we can't use non-zero sleep times of less than 1 ms on Windows.
In Linux, the calls to sleep() took longer than requested, yielding a median of 3470 packets/sec (22.8% error). While sleep() in Linux overshoots the requested time, asking for higher rates like 6000 Hz produces a true rate above 4500, so we know the machine is capable of the goal rate. The problem is in our sleep() value, which must be corrected to be lower than we might have expected. I performed another test, using the following (bad) approach.
Approach 2
In this approach, we never sleep. We chew up CPU cycles until the time elapses, which leads Python to use 100% of the core it's running on:
def execute(self, func, *args, **kwargs):
    if self.last_time_called is not None:
        # Busy-wait until the interval has elapsed
        while True:
            now = timeit.default_timer()
            elapsed = now - self.last_time_called
            left_to_wait = self.min_interval - elapsed
            if left_to_wait <= 0:
                break
            # (sleep removed from here)
    self.last_time_called = timeit.default_timer()
    return func(*args, **kwargs)
Performance of Approach 2
In Linux, this yields a median rate of 4488 packets/sec (0.26% error), which is on par with Windows but eats the CPU the same way, so it's really inefficient.
The Question
Here's what I'm getting at. How do we use sleep() in Linux to keep our CPU usage reasonable while still having decent timing accuracy?
I figure this would have to involve some sort of monitoring and compensation process but I'm not really sure how to go about implementing such a thing. Is there a standard way to approach this kind of error-correction problem?
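For concreteness, the vague picture I have is something like the sketch below: schedule each call against a running deadline, sleep() through most of the remaining interval, and only busy-wait the last stretch (the 1 ms spin threshold is just a guess). I don't know whether this is a sound approach or whether there's a standard technique for this kind of correction.

import time, timeit

class CompensatingRateLimiter:
    """Rough sketch only: sleep for most of each interval, spin for the rest."""
    def __init__(self, rate_limit, spin_threshold=0.001):  # 1 ms threshold is a guess
        self.min_interval = 1.0 / float(rate_limit)
        self.spin_threshold = spin_threshold
        self.next_time = None

    def execute(self, func, *args, **kwargs):
        if self.next_time is None:
            self.next_time = timeit.default_timer()
        # Sleep while far from the deadline, then busy-wait the final stretch.
        while True:
            left = self.next_time - timeit.default_timer()
            if left <= 0:
                break
            if left > self.spin_threshold:
                time.sleep(left - self.spin_threshold)
        # Advance the deadline from the previous one so errors don't accumulate.
        self.next_time += self.min_interval
        return func(*args, **kwargs)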