
I have a multithreaded program in which one thread (Thread A) sleeps unconditionally for an indefinite time. When an event happens in another thread (Thread B), it wakes up Thread A by signaling it. I know there are multiple ways to do this. When my program runs in a Windows environment, I use WaitForSingleObject in Thread A and SetEvent in Thread B, and it works without any issues. I could also use a file-descriptor-based model with poll or select. However, I am trying to find out which way is the most efficient: I want Thread A to wake up as soon as possible whenever Thread B signals. What do you think is the best option? I am OK with exploring a driver-based option.

Thanks

agent.smith
  • Is signaling with an event or semaphore insufficiently efficient? Thread A should usually be made ready immediately, with a priority boost from its previous state of waiting on the signal. If there are no cores free, it will probably preempt the signaler if their base priority is the same. Do you have an actual problem? I can't see how a driver would help at all. – Martin James Jul 31 '12 at 21:24
  • I may not have a problem. But as I understand it, each mechanism works quite differently. For example, if I use an event, I ask the kernel to schedule the event. If I use poll/select, I write to a file that the system understands, and the system then wakes up my sleeping thread. So the two have different latencies. From Thread B, I could also issue an IOCTL and then signal Thread A. So I want to find out whether any particular method is significantly faster than the others. – agent.smith Jul 31 '12 at 21:52
  • Why don't you try benchmarking the various methods? Since your performance counter is common to all threads and processes, create a memory area where Thread B can stamp its signal time and Thread A can stamp its wakeup time (using `QueryPerformanceCounter`). Do this for thousands of repeats and average the latencies. It might be crazy, but you could also try busy-waiting on a volatile flag in Thread A, and have Thread B set the flag and immediately put itself to sleep. – paddy Jul 31 '12 at 23:53
  • I was working on the same question for a while. Result: Event! Benchmarking will show that it takes less than 2 microseconds from the time the event is set to the time the wait function returns. But the priority setting, the processor affinity, and the overall system load all influence the result. – Arno Aug 01 '12 at 06:46
  • @Arno - yes. If both threads are bound to the same core, the OS does not have to perform any inter-core comms to make thread A run on another core; it can just directly preempt thread B. Of course, this does mean that B is left with no execution, something that could be avoided if the affinity was not twiddled with, and so is likely to be detrimental to overall performance :( – Martin James Aug 01 '12 at 11:14
  • @Arno & Martin: Thanks. My software threads are running on different hardware threads, so the system has to go through some pain to wake up the first thread. – agent.smith Aug 01 '12 at 17:21
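
The benchmarking idea from the comments above (Thread B stamps its signal time, Thread A stamps its wakeup time, repeated many times) can be sketched roughly as follows. This is a portable Python stand-in, not Win32 code: `threading.Event` takes the place of `SetEvent`/`WaitForSingleObject`, `time.perf_counter_ns` takes the place of `QueryPerformanceCounter`, and the function name `measure_wakeup_latency` is made up for illustration. Absolute numbers from Python will be far higher than the native figures quoted in this thread, so treat it only as an outline of the measurement loop.

```python
import threading
import time


def measure_wakeup_latency(repeats=1000):
    """Return a list of signal-to-wakeup latencies in nanoseconds."""
    signal = threading.Event()   # set by thread B, awaited by thread A
    ready = threading.Event()    # set by thread A just before it waits again
    stamps = {"signal": 0}       # shared memory area for B's timestamp
    latencies = []

    def thread_a():
        for _ in range(repeats):
            ready.set()                    # tell B that A is about to wait
            signal.wait()                  # sleep until B signals
            woke = time.perf_counter_ns()  # A stamps its wakeup time
            signal.clear()                 # Event is manual-reset in Python
            latencies.append(woke - stamps["signal"])

    def thread_b():
        for _ in range(repeats):
            ready.wait()                   # signal only once A is about to wait
            ready.clear()
            stamps["signal"] = time.perf_counter_ns()  # B stamps its signal time
            signal.set()

    a = threading.Thread(target=thread_a)
    b = threading.Thread(target=thread_b)
    a.start()
    b.start()
    a.join()
    b.join()
    return latencies
```

Averaging the returned list (for example with `statistics.mean`) gives the mean latency. Note that `ready` is set just before A blocks, so some samples may catch A before it is actually asleep.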

1 Answer


As said, calling SetEvent in thread B while thread A waits in WaitForSingleObject is fast. However, some conditions have to be taken into account:

  • Single core/processor: As Martin says, the waiting thread will preempt the signalling thread. With such a scheme you should take care that the signalling thread (B) goes idle right after the SetEvent. This can be done with a Sleep(0), for example.

  • Multi core/processor: One might think there is an advantage in putting the two threads onto different cores/processors, but this is not really such a good idea. If both threads are on the same core/processor, the time span between calling SetEvent and the return of WaitForSingleObject is much shorter.

  • Keeping both threads on one core (SetThreadAffinityMask) also lets you control their behavior through their priority settings (SetThreadPriority). Either run the waiting thread at a higher priority, or ensure that the signalling thread really does nothing after it has set the event.

  • You have to deal with another synchronization matter: When is the next event going to happen? Will thread A have completed its task? The most effective solution is a second event: when thread A is done, it sets an event to indicate that thread B is allowed to set its event again. Thread B then first sets its event and waits for the feedback event, which also meets the requirement to go idle immediately.

  • If you want to allow thread B to set the event even when thread A is not finished and not yet in a wait state, you should consider using semaphores instead of events. This way the number of "calls/events" from thread B is kept, and the wait function in thread A can catch up because it returns once for each time the semaphore has been released. Semaphore objects are about as fast as events.
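
The two-event handshake described in the list above might look roughly like this. It is a minimal sketch in portable Python rather than Win32: `threading.Event` stands in for the event objects, and the names `run_handshake`, `work_ready`, and `work_done` are hypothetical. Python events are manual-reset, so each side clears its event explicitly; a Win32 auto-reset event would do that on its own.

```python
import threading


def run_handshake(items):
    """Pass each item from thread B to thread A, one at a time."""
    items = list(items)
    work_ready = threading.Event()   # B -> A: "an item is available"
    work_done = threading.Event()    # A -> B: "finished, you may signal again"
    slot = []                        # single shared slot for the current item
    processed = []

    def thread_a():
        for _ in items:
            work_ready.wait()        # sleep until B signals
            work_ready.clear()
            processed.append(slot.pop())
            work_done.set()          # feedback event: B may signal again

    def thread_b():
        for item in items:
            slot.append(item)
            work_ready.set()         # wake A ...
            work_done.wait()         # ... then go idle until A reports back
            work_done.clear()

    a = threading.Thread(target=thread_a)
    b = threading.Thread(target=thread_b)
    a.start()
    b.start()
    a.join()
    b.join()
    return processed
```

Because B blocks on the feedback event right after signalling, it never overwrites the slot while A is still working, and it goes idle immediately as the answer recommends.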

Summary:

  • Have both threads on the same core/CPU by means of SetThreadAffinityMask.

  • Extend the SetEvent/WaitForSingleObject pair with a second event to establish a handshake.

  • Depending on the details of the processing you may also consider semaphore objects.
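
To illustrate why a semaphore keeps the count of signals while an event does not, here is a small sketch, again in portable Python and not Win32: `threading.Semaphore` stands in for a semaphore object released with `ReleaseSemaphore` and awaited with `WaitForSingleObject`. The function name `run_with_semaphore` and the `all_posted` helper event are assumptions made for the example.

```python
import threading


def run_with_semaphore(signal_count):
    """B releases the semaphore signal_count times; return A's wakeup count."""
    sem = threading.Semaphore(0)     # count starts at zero: nothing pending
    all_posted = threading.Event()   # test helper: B has finished signalling
    wakeups = 0

    def thread_b():
        for _ in range(signal_count):
            sem.release()            # never blocks; adds one pending signal
        all_posted.set()

    def thread_a():
        nonlocal wakeups
        all_posted.wait()            # simulate A being busy while B signals
        for _ in range(signal_count):
            sem.acquire()            # returns once per earlier release
            wakeups += 1

    a = threading.Thread(target=thread_a)
    b = threading.Thread(target=thread_b)
    a.start()
    b.start()
    a.join()
    b.join()
    return wakeups
```

Even though A does not start waiting until B has posted every signal, none of them is lost. With a single event, the repeated sets would have collapsed into one wakeup.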

Arno
  • Thanks a lot!!! My threads are two different hardware threads and I want them to run that way. So, I need to figure out how I can use signaling across the cores efficiently. – agent.smith Aug 01 '12 at 17:24
  • I doubt that you will do much better than the OS inter-core comms driver that already exists. It makes 'A' run by issuing a hardware interrupt to the target core to force it to enter the OS and so 'pick up' thread 'A'. There may well be some gruesome polling method to improve on the latency of this operation only, but to the detriment of everything else :( – Martin James Aug 01 '12 at 19:41
  • @agent.smith: I guess you're not talking about different processors but about different cores of the same processor. My measurements have shown that there is a bit more overhead in that case: I measured about 3.5 microseconds between the call to `SetEvent()` and the return of `WaitForSingleObject()`. The semaphore approach took some 5 microseconds, though. I found that this extra overhead depends on the hardware, while I could not see much hardware dependence when doing this on the same core. However, the purpose of multiple cores is to split work across them. – Arno Aug 02 '12 at 10:33
  • @Arno: Thanks a lot!!! I'm running the threads on different cores. I think I need to stick to SetEvent() and WaitForSingleObject(). I will also try the file-descriptor-based poll/select approach and will let you know the latency. I don't think it is going to be faster than SetEvent. – agent.smith Aug 03 '12 at 17:40