
Desired behaviour: run a multi-threaded Linux program on a set of cores which have been isolated using isolcpus.

Here's a small program we can use as an example multi-threaded program:

#include <stdio.h>
#include <pthread.h>
#include <err.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>

#define NTHR    16          /* number of worker threads */
#define TIME    (60 * 5)    /* run time in seconds */

void *
do_stuff(void *arg)
{
    int i = 0;

    (void) arg;
    while (1) {
        i += i; /* trivial busy work */
        usleep(10000); /* don't dominate the CPU */
    }
}

int
main(void)
{
    pthread_t   threads[NTHR];
    int     rv, i;

    for (i = 0; i < NTHR; i++) {
        rv = pthread_create(&threads[i], NULL, do_stuff, NULL);
        if (rv) {
            errno = rv; /* pthread functions return the error, not errno */
            perror("pthread_create");
            return (EXIT_FAILURE);
        }
    }
    sleep(TIME);
    exit(EXIT_SUCCESS);
}

If I compile this (linking with -pthread) and run it on a kernel with no isolated CPUs, the threads are spread out over my 4 CPUs. Good!

Now if I add isolcpus=2,3 to the kernel command line and reboot:

  • Running the program without taskset distributes the threads over cores 0 and 1. This is expected, since the default affinity mask now excludes cores 2 and 3 (the sketch after this list shows one way to verify the mask).
  • Running with taskset -c 0,1 has the same effect. Good.
  • Running with taskset -c 2,3 causes all threads to go onto the same core (either core 2 or 3). This is undesired. The threads should be distributed over cores 2 and 3. Right?
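For reference, here is a quick way to inspect the affinity mask a process actually inherits. This is a minimal sketch of my own (not part of the original experiment) using sched_getaffinity(2):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <err.h>

int
main(void)
{
    cpu_set_t set;
    int i;

    /* pid 0 means the calling process */
    if (sched_getaffinity(0, sizeof(set), &set) == -1)
        err(EXIT_FAILURE, "sched_getaffinity");

    for (i = 0; i < CPU_SETSIZE; i++)
        if (CPU_ISSET(i, &set))
            printf("cpu%d\n", i);

    return (EXIT_SUCCESS);
}

Run under taskset -c 2,3, this should print cpu2 and cpu3, confirming that the mask itself is as expected and the clumping is a scheduler placement decision.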

This post describes a similar issue (although the example given is further from the pthreads API). The OP was happy to work around this by using a different scheduler. I'm not certain that is ideal for my use case, however.

Is there a way to have the threads distributed over the isolated cores using the default scheduler?

Is this a kernel bug which I should report?

EDIT:

The right thing does indeed happen if you use a real-time scheduler, e.g. SCHED_FIFO. See man sched and man chrt for details.
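You can either launch the whole program under chrt (e.g. chrt -f 1 taskset -c 2,3 ./prog) or request the policy per thread. A sketch of the latter (my own addition, not from the original experiment; requires CAP_SYS_NICE, error handling elided):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Create a thread under SCHED_FIFO instead of the default policy. */
static int
create_fifo_thread(pthread_t *thr, void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    struct sched_param param;

    pthread_attr_init(&attr);
    /* don't inherit the creating thread's (CFS) policy */
    pthread_attr_setinheritsched(&attr, PTHREAD_EXPLICIT_SCHED);
    pthread_attr_setschedpolicy(&attr, SCHED_FIFO);
    param.sched_priority = 1; /* minimum FIFO priority */
    pthread_attr_setschedparam(&attr, &param);

    return (pthread_create(thr, &attr, fn, arg));
}

In the example program, calling create_fifo_thread(&threads[i], do_stuff, NULL) in place of the plain pthread_create call should give the spread-out behaviour on the isolated cores.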

Edd Barrett
  • Personally, I wouldn't have expected such behaviour. Send an email to LKML to ask why the default scheduler is unable to migrate threads across isolated cores that have been assigned via `taskset`. – Claudio Apr 14 '16 at 10:03

1 Answer


From the Linux Kernel Parameter Doc:

This option can be used to specify one or more CPUs to isolate from the general SMP balancing and scheduling algorithms.

So this option effectively prevents the scheduler from migrating threads from one core to another, less contended core (SMP balancing). Typically, isolcpus is used together with pthread affinity control: threads are pinned to specific cores, with knowledge of the CPU layout, to gain predictable performance.

https://www.kernel.org/doc/Documentation/kernel-parameters.txt
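The pinning typically looks something like the following rough sketch (mine, error handling elided; pthread_setaffinity_np is a GNU extension, and the core numbers assume the isolcpus=2,3 setup from the question):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin an already-created thread to a single CPU. */
static int
pin_thread(pthread_t thr, int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return (pthread_setaffinity_np(thr, sizeof(set), &set));
}

In the question's example, one could call pin_thread(threads[i], 2 + (i % 2)) after each pthread_create to spread the workers over cores 2 and 3 by hand.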

--Edit--

OK, I see why you are confused. Personally, I would also have expected consistent behaviour from this option. The problem lies in two functions, select_task_rq_fair and select_task_rq_rt, which are responsible for selecting a new run queue (essentially, selecting which CPU the task runs on next). I did a quick trace (with SystemTap) of both functions: for CFS it always returned the same first core in the mask, while for RT it returned other cores too. I haven't had a chance to look into the logic of each selection algorithm, but you could send an email to the maintainer on the Linux kernel development mailing list for a fix.

Wei Shen
  • OK, but I don't understand why the threads spread out when forced onto isolated cores and a real-time scheduler (e.g. fifo) is used instead of the default scheduler. – Edd Barrett Apr 14 '16 at 09:01
  • @EddBarrett Trace how `isolcpus` parameter is interpreted in the default and RT schedulers to get the exact implementation within the Linux kernel. This would be `sched.c` vs. `sched_rt.c` (for older versions) or `sched/core.c` and `sched/rt.c` in the more recent versions of the Linux kernel. – TheCodeArtist Apr 14 '16 at 10:41
  • I've just raised a bug for this: https://bugzilla.kernel.org/show_bug.cgi?id=116701 – Edd Barrett Apr 19 '16 at 14:21
  • 1
    @thecodeartist, @wei-shen I've put some effort into tracing `select_task_rq_fair`. Outcome here: https://bugzilla.kernel.org/show_bug.cgi?id=116701 – Edd Barrett May 09 '16 at 10:00