
If I have 4 worker threads and 1 I/O thread running on a quad-core, one of the threads will be overlapped with another. How do I make sure that it is the input thread that is always overlapped with another, so that I can sched_yield() to give up its current time slice to the other thread? If two worker threads are overlapped, a yield on the input thread will not have any effect, right? Will sched_yield() bring in another thread from a different core anyway?

#include <sched.h>
#include <pthread.h>

static void *test(void *arg) {   /* worker thread: spins forever */
    (void)arg;
    while (1) {}
    return NULL;
}

int main(void) {
    pthread_t t;
    for (int i = 0; i < 4; i++)
        pthread_create(&t, NULL, test, NULL);   /* workers */
    while (1) {
        sched_yield();                          /* input thread */
    }
    return 0;
}

Edit The input thread needs to poll for incoming messages. The library I am using (MPI) is not interrupt driven, and condition variables are useless in this context. What I want to do in the input thread is check for a condition once, then give up its time slice. If there are enough cores to run all threads, the input thread will run on its own core. If not, it will run the minimum number of checks, i.e. once per time slice. I hope I am clear enough.
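For illustration, a minimal sketch of the loop I have in mind (MPI_Iprobe is what I would use as the non-blocking check; handle_message() is just a placeholder for the receive/bookkeeping work):

#include <mpi.h>
#include <sched.h>

/* Sketch only: one non-blocking check per time slice, then yield.
   handle_message() is a placeholder, not a real function. */
void input_thread_loop(MPI_Comm comm) {
    while (1) {
        int flag = 0;
        MPI_Status status;
        MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm, &flag, &status);
        if (flag)
            handle_message(&status);   /* receive and process the message */
        sched_yield();                 /* give up the rest of the time slice */
    }
}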

danny
  • Should the I/O thread do nothing while the workers are running? Then you could simply count the number of workers finished and use a condition variable in order to block the I/O thread. – Zeta Mar 09 '13 at 00:18
  • You need to read an OS principles book. Sooner, rather than later. All modern OSes are preemptive, and 99% of the reason for that is optimizing I/O performance. – Martin James Mar 09 '13 at 00:19
  • @Zeta The input-output thread is not blocked on input. It needs to poll for messages coming through network and also do other bookkeeping stuff. – danny Mar 09 '13 at 00:50
  • `sched_yield` is not the way you manage this. Proper use of synchronization primitives such as condition variables will ensure that the threads that can make forward progress are executing and the ones that can't are blocked. – R.. GitHub STOP HELPING ICE Mar 09 '13 at 00:52
  • Please see my edit and give me feedback. Thanks. – danny Mar 09 '13 at 00:58
  • Hmm, I would not try to run your sample programme on my computer, it looks like it could burn the CPU down to ashes. – didierc Mar 09 '13 at 03:16

2 Answers


The Googleable phrase you are looking for is "CPU affinity".

See for example this SO question.

Ensuring that each of the worker threads is running on a different core will achieve your stated goal.
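As a rough, Linux-specific sketch (pthread_setaffinity_np is a glibc extension, and the thread handles here are assumed to come from your creation loop):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Sketch: pin one worker thread to one core (Linux/glibc only). */
static void pin_to_core(pthread_t thread, int core) {
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);                               /* allow only this core */
    pthread_setaffinity_np(thread, sizeof(set), &set);
}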

A number of commenters have raised legitimate concerns about the design of your application; you might want to continue those conversations to make sure the design you have in mind will actually accomplish your end goal.

Nathan

Hmmm, MPI_Recv claims to be blocking unless you do something specific to change that. MPI's underlying comms infrastructure is elaborate, and I don't know if 'blocking' extends as far as waiting on a network socket with a call to select(). You're sort of stating that it doesn't, which I can well believe given MPI's complexity.

MPI's Internals

So if MPI_Recv in blocking mode inevitably involves polling, one needs to work out exactly what the library underneath is doing. Hopefully it's a sensible poll (i.e. one involving a call to nanosleep()). You could look at the Open MPI source code for that (eek), or use this and GTKWave to see what its scheduling behaviour is like in a nice graphical way (I'm assuming you're on Linux).

If it is sleeping in the polling loop then the version of the Linux kernel matters. More modern kernels (possibly requiring the PREEMPT_RT patch set - I'm afraid I can't remember) do a proper timer-driven, de-scheduled sleep even for short periods, so taking no CPU time. Older implementations would just go into a busy loop for short sleeps, which is no good to you.

If it's not sleeping at all then it's going to be harder. You'd have to use MPI in a non-blocking mode and do the polling / sleeping yourself.
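A rough sketch of doing the polling and sleeping yourself (MPI_Iprobe as the non-blocking probe is an assumption on my part, and the 100 µs interval is purely illustrative):

#include <mpi.h>
#include <time.h>

/* Sketch only: non-blocking probe plus a real timed sleep, so the
   worker threads get the core back between checks. */
void poll_with_sleep(MPI_Comm comm) {
    struct timespec ts = { 0, 100 * 1000 };   /* 100 microseconds */
    int flag = 0;
    MPI_Status status;
    while (!flag) {
        MPI_Iprobe(MPI_ANY_SOURCE, MPI_ANY_TAG, comm, &flag, &status);
        if (!flag)
            nanosleep(&ts, NULL);             /* de-scheduled sleep */
    }
    /* the matching MPI_Recv would go here once something is pending */
}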

Thread Priorities

Once you've got either your code or MPI's polling with a sleep, you can then rely on thread priorities and the OS scheduler to sort things out. In general, putting the I/O thread at a higher priority than the worker threads is a good idea: it prevents the process at the other end of the I/O from being held up by your worker threads pre-empting your I/O thread. For this reason sched_yield() isn't a good idea, because the scheduler won't actually put your thread to sleep.
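A rough sketch of raising the I/O thread's priority on Linux (SCHED_RR and the priority value of 10 are assumptions on my part; real-time policies usually need root or CAP_SYS_NICE):

#include <pthread.h>
#include <sched.h>

/* Sketch: raise the calling (I/O) thread's priority so it pre-empts the
   workers when there is work for it to do. */
static int raise_io_priority(void) {
    struct sched_param sp = { .sched_priority = 10 };
    return pthread_setschedparam(pthread_self(), SCHED_RR, &sp);
}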

Thread Affinity

In general I wouldn't bother with that, at least not yet. You've 5 threads and 4 cores; one of those threads will always be disappointed. If you let the kernel sort things out as best it can then provided you've got control of the polling (as described above) you should be fine.

--EDIT--

I've gone and had another look at MPI and threads, and re-discovered why I didn't like it. MPI intercommunicates between processes, each of which has a 'rank'. Whilst MPI is/can be thread-safe, a thread doesn't in itself have its own rank, so MPI is not capable of intercommunicating between threads. That's a bit of a weakness in MPI in these days of multi-core devices.

However you could have 4 separate processes and no I/O thread. That's likely to be less than optimal in terms of how much data is copied, moved and stored (it'll be 4x the network traffic, 4x the memory used, etc). However, if you've a large enough compute-time:I/O-time ratio you might be able to stand that inefficiency for the sake of simple source code.

bazza
  • BTW I backed off from using MPI in my application (a large real time signal processing problem on some delightfully exotic hardware). I wasn't able to discover enough about what MPI itself does to allow me to be sure of meeting my real time requirements without it getting in the way, and I had a nagging doubt about its use in multi-threaded programs. So I wrote my own data transfer API layered on top of the physical transport (serial RapidIO in my case) just to be sure. MPI struck me as being aimed at the HPC guys, not the real time community. – bazza Mar 09 '13 at 08:49