I'm using the libx264 library to compress video data to H.264.

I kept the default, which lets the library create as many (or as few) threads as it wants:

param.i_threads = X264_THREADS_AUTO;

This works great on my server, which has 64 processors (2 CPUs with 16 cores each, plus Intel Hyper-Threading). There, it actually uses about 5 threads.

However, on the embedded computer running the software, I only have 4 CPUs. It is a Xeon, so there are not many issues there, but somehow it prevents the USB port from functioning. We're receiving data from that USB port, and when the 4 CPUs are used at about 100%, the libx264 code takes over the whole computer pretty badly.

I'm thinking of two solutions. Either use 3 as the maximum number of threads:

param.i_threads = 3;

or have those libx264 threads run with a (much) higher nice value so the other things running on that computer don't get blocked (i.e. the CPU is better shared; the other things do not use much CPU, usually well under 10%).

However, I do not have control of how the libx264 library creates the threads and was wondering whether it would work for me to change the nice value before calling the libx264 functions that create the threads and as a result have those threads use that nice value, something like this:

nice(10);
...call libx264 functions...
nice(0);

Then will that make those threads use a nice value of +10? From what I can see in the pthread_create() man page, it doesn't clearly say that a thread inherits the parent thread's nice value...


Note 1: I'm aware that the real issue may well be that the USB port is fighting for DMA against the video capture card... If that is the case, we obviously won't resolve the problem just by changing the priority of processes. I'd like to try that soft solution first.

Although I can move the USB port to another computer, the data would come through the network which could very well have a similar hardware conflict issue.


Note 2: I don't want to have to recompile libx264 and change its code. That's way outside my project scope.

Peter Cordes
Alexis Wilke
  • 4
    "*wondering whether it would work*". Surely that can be answered or at least investigated by just trying it. Run the code and look at the nice value of the threads? – kaylum Oct 10 '21 at 21:38
  • `man 7 pthreads` on my Ubuntu 18 system says *Threads do not share a common nice value.* fwiw. I'm not sure if that just means that individual threads can have different values, or if they also don't inherit one from the creating thread. – Shawn Oct 10 '21 at 21:44
  • 1
    Niceness is inherited from the parent process/thread, AFAIK. You can always `renice(2)` threads after they're started, raising their niceness, given their TIDs (which you can use as PIDs in a system call). You can even do that from outside the process entirely, manually with `top` or `htop`, or `ps` + `renice(1)`, as a one-off experiment without changing any code. – Peter Cordes Oct 10 '21 at 21:49
  • 1
    Keep in mind that without root permission, you can only ever make things *more* nice, not reduce nice back down to 0 (lower niceness, higher priority). So to have some threads be more nice, you could write a wrapper function that uses nice(2) and then tail-calls into an x264 function, so the parent thread doesn't have to change its own niceness. Any more threads x264 starts from that thread will be at least as nice as that. – Peter Cordes Oct 10 '21 at 21:50
  • 1
    Re: your actual problem; I'd guess that keeping all hyperthreads busy might be costing a lot of memory bandwidth / cache footprint. You might not see much throughput drop from only letting x264 start one thread per *physical* core, but leaving more bandwidth for DMA. As well as leaving cores free for more CPU-intensive USB stuff if necessary. (I doubt that bottom-half interrupt handlers are lower priority than even `nice(-19)`, but if there are other user-space processes involved then maybe `nice` matters..) – Peter Cordes Oct 10 '21 at 21:53
  • @kaylum As a matter of fact, no, I can't easily test it. The system's some 600 miles away. – Alexis Wilke Oct 10 '21 at 22:22
  • 1
    @AlexisWilke So not even a small locally built/run test program? – kaylum Oct 10 '21 at 22:23
  • @kaylum In regard to individual threads having their own nice value, [this is the case](https://stackoverflow.com/questions/7684404/is-nice-used-to-change-the-thread-priority-or-the-process-priority). – Alexis Wilke Oct 10 '21 at 22:27
  • 3
    The answer to your question: *"Then will that make those threads use a nice value of +10?"* is ***yes***. You could confirm this by writing a simple test program and running it, using `htop` or any other tool to check the niceness. Not sure what else there is to say here. Niceness of processes/threads is something that has been talked about ad nauseam all over the internet... – Marco Bonelli Oct 10 '21 at 22:28
  • @MarcoBonelli Yes. I think that was the answer I was looking for: that the new thread inherits the parent's nice priority. – Alexis Wilke Oct 10 '21 at 23:33

1 Answer

First, the x264 library creates its threads at the time you open the handle with x264_encoder_open(). That is the only call that needs to run with the adjusted nice value.

As pointed out in a comment by Peter Cordes, the following only works if you are running as root (which is generally not recommended, even for daemons):

nice(10);
...call libx264 functions...
nice(0);

That snippet was also wrong: nice() takes an increment, not an absolute value, so nice(0) changes nothing and simply returns the current nice value. The correct call to undo nice(10) is nice(-10).

One solution is to give the process permission to become root, then become root just before calling nice(-10) and drop the permission again just after.

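// needs <cerrno>, <cstdio>, <unistd.h> and <x264.h>

// raise the nice value; threads created from here on inherit it
int const inc_nice = 10;
errno = 0;
if(nice(inc_nice) == -1 && errno != 0)
    perror("nice");

// create the x264 threads
x264_t * encoder = x264_encoder_open(&param);

// restore the nice value (lowering it requires root)
uid_t const user(getuid());
if(seteuid(0) == 0)
{
    errno = 0;
    if(nice(-inc_nice) == -1 && errno != 0)
        perror("nice");
    if(seteuid(user) != 0)
        perror("seteuid");
}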

Be careful: if you are running with multiple threads, this code is not safe. From the time seteuid(0) returns until the privileges are dropped again, your other threads have root permissions too. Assuming your application is safe, it should not be a big issue, but it is still something to keep in mind.

For a systemd service to be allowed to become root, you need the following:

  1. This assumes that you are using the User and Group options to drop privileges on startup. If your daemon runs as root, you don't need any special handling: just call nice(-10) and it will work.

    The following is used by web servers on Debian:

     User=www-data
     Group=www-data
    
  2. Give the process the privilege:

    Set NoNewPrivileges to false, or leave it out since that's the default:

     NoNewPrivileges=false
    
  3. Give the process the option to become root:

    You need to make sure that the executable is owned by root and that the 's' flag (a.k.a. set-user-ID on execution) is set:

     chown root /usr/sbin/my-app
     chmod u+s /usr/sbin/my-app
    

    In most cases, if you use the standard packager of your Linux system, files installed under /usr/sbin will already be owned by root. However, the 's' flag is not set by default. When creating a Debian package, you can add a debian/rules file with the following:

     override_dh_fixperms:
         dh_fixperms
         chmod u+s debian/project-name/usr/sbin/my-app
    

    Note: the rules file is a makefile so the indentation is expected to be a tab.

Alexis Wilke