
I am doing a performance evaluation between Windows CE and Linux on an ARM i.MX27 board. The code has already been written for CE and measures the time it takes to make different kernel calls: using OS primitives such as mutexes and semaphores, opening and closing files, and networking.

While porting this application to Linux (pthreads) I stumbled upon a problem which I cannot explain. Almost all tests showed a performance increase of 5 to 10 times, but not my version of Win32 events (SetEvent and WaitForSingleObject): CE actually "won" this test.

To emulate the behaviour I used pthreads condition variables (I know that my implementation doesn't fully emulate the CE version, but it's enough for the evaluation).

The test code uses two threads that "ping-pong" each other using events.


Windows code:

Thread 1: (the thread I measure)

HANDLE hEvt1, hEvt2;
hEvt1 = CreateEvent(NULL, FALSE, FALSE, TEXT("MyLocEvt1"));
hEvt2 = CreateEvent(NULL, FALSE, FALSE, TEXT("MyLocEvt2"));

ResetEvent(hEvt1);
ResetEvent(hEvt2);

for (i = 0; i < 10000; i++)
{
    SetEvent (hEvt1);
    WaitForSingleObject(hEvt2, INFINITE);
}        

Thread 2: (just "responding")

while (1)
{
    WaitForSingleObject(hEvt1, INFINITE);
    SetEvent(hEvt2);
}

Linux code:

Thread 1: (the thread I measure)

struct event_flag *event1, *event2;
event1 = eventflag_create();
event2 = eventflag_create();

for (i = 0; i < 10000; i++)
{
    eventflag_set(event1);
    eventflag_wait(event2);
}

Thread 2: (just "responding")

while (1)
{
    eventflag_wait(event1);
    eventflag_set(event2);
}

My implementation of eventflag_*:

struct event_flag* eventflag_create()
{
    struct event_flag* ev;
    ev = (struct event_flag*) malloc(sizeof(struct event_flag));

    pthread_mutex_init(&ev->mutex, NULL);
    pthread_cond_init(&ev->condition, NULL);
    ev->flag = 0;

    return ev;
}

void eventflag_wait(struct event_flag* ev)
{
    pthread_mutex_lock(&ev->mutex);

    while (!ev->flag)
        pthread_cond_wait(&ev->condition, &ev->mutex);

    ev->flag = 0;

    pthread_mutex_unlock(&ev->mutex);
}

void eventflag_set(struct event_flag* ev)
{
    pthread_mutex_lock(&ev->mutex);

    ev->flag = 1;
    pthread_cond_signal(&ev->condition);

    pthread_mutex_unlock(&ev->mutex);
}

And the struct:

struct event_flag
{
    pthread_mutex_t mutex;
    pthread_cond_t  condition;
    unsigned int    flag;
};

Questions:

  • Why don't I see the performance boost here?
  • What can be done to improve performance (e.g. are there faster ways to implement CE's behaviour)?
  • I'm not used to coding with pthreads; are there bugs in my implementation that might result in performance loss?
  • Are there any alternative libraries for this?
dacwe
  • You might check this implementation: http://stackoverflow.com/questions/178114/pthread-like-windows-manual-reset-event. But don't expect any increase in performance. Linux simply doesn't support something like events natively, and emulation will never be as fast. It is the same story as with condition variables on Windows: old Windows versions don't support them, and emulating them is also quite complex. – Zuljin Jan 16 '12 at 18:04
  • @Zuljin: The implementation there is almost identical to my version. :-( – dacwe Jan 16 '12 at 18:18
  • Can't it be implemented with some other primitive then? – dacwe Jan 16 '12 at 18:25
  • All the testing I've done or seen has CE winning handily for synchronization objects, so your results are what I'd expect. You'll likely find that Linux's network stack, however, is significantly faster than CE's (last I did testing it definitely was). – ctacke Jan 16 '12 at 18:34

2 Answers


Note that you don't need to be holding the mutex when calling pthread_cond_signal(), so you might be able to increase the performance of your condition variable 'event' implementation by releasing the mutex before signaling the condition:

void eventflag_set(struct event_flag* ev)
{
    pthread_mutex_lock(&ev->mutex);

    ev->flag = 1;

    pthread_mutex_unlock(&ev->mutex);

    pthread_cond_signal(&ev->condition);
}

This might prevent the awakened thread from immediately blocking on the mutex.

Michael Burr
  • Shouldn't it be cond_signal first and mutex_unlock following? With them in this order, there is a small window in which some other actor can lock the mutex before you call cond_signal, so the waiter will not atomically lock the mutex right after receiving the signal. – Ethouris May 13 '15 at 06:51

This type of implementation only works if you can afford to miss an event. I just tested it and ran into many deadlocks. The main reason is that a condition variable only wakes up threads that are already waiting; signals issued before that are lost.

No counter is associated with a condition variable that would allow a waiting thread to simply continue if the condition has already been signalled. Windows events support this type of use.
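A hypothetical interleaving (assuming the set happens before the worker reaches its wait, and no latched flag is checked) makes this concrete:

/* dispatcher: pthread_cond_signal(&cond);        no thread is waiting
 *                                                yet, so this is a no-op
 * worker:     pthread_mutex_lock(&mutex);
 * worker:     pthread_cond_wait(&cond, &mutex);  blocks until the NEXT
 *                                                signal, which may
 *                                                never come
 *
 * A Windows event would stay signalled, so a later
 * WaitForSingleObject() would return immediately instead of blocking.
 */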

I can think of no better solution than using a semaphore (the POSIX version is very easy to use) initialized to zero, with sem_post() for set() and sem_wait() for wait(). You can surely think of a way to cap the semaphore count at a maximum of 1 using sem_getvalue().

That said, I have no idea whether POSIX semaphores are just a neat interface to the Linux semaphores, or what the performance penalties are.
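
For concreteness, a minimal sketch of what I mean (the event_sem_* names are hypothetical, and the getvalue-then-post check is only race-free when a single thread sets a given event, as in the ping-pong test above):

#include <errno.h>
#include <semaphore.h>
#include <stdlib.h>

struct event_sem
{
    sem_t sem;
};

struct event_sem* event_sem_create()
{
    struct event_sem* ev;
    ev = (struct event_sem*) malloc(sizeof(struct event_sem));

    sem_init(&ev->sem, 0, 0);   /* process-private, initial count 0 */

    return ev;
}

void event_sem_wait(struct event_sem* ev)
{
    /* retry if interrupted by a signal */
    while (sem_wait(&ev->sem) == -1 && errno == EINTR)
        ;
}

void event_sem_set(struct event_sem* ev)
{
    int val;

    /* cap the count at 1 so repeated sets coalesce; the
       getvalue-then-post sequence is not atomic, which is harmless
       here because only one thread ever sets a given event */
    sem_getvalue(&ev->sem, &val);
    if (val == 0)
        sem_post(&ev->sem);
}

These functions can replace the eventflag_* calls in the test loops unchanged.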

everclear
  • That is why a condition variable is always associated with... a condition. You use a mutex to ensure coherent access to that condition, and you check it in a while loop (to account for spurious wakeups, and for concurrent mutex-acquiring threads that may have changed the condition after the notify). – v.oddou Mar 22 '13 at 09:25
  • The problem I'm going for is an event that occurs once. Think of a thread that dispatches other threads. For example, you want to continue execution of the dispatcher only if the dispatched thread has actually run for some time. For this I use two semaphores: the dispatcher creates the thread and posts a 'dispatch' semaphore, which the dispatched thread waits for. After setting the dispatch semaphore, the dispatcher waits for a worker semaphore to be posted by the worker. This mechanism fails using the events above because in most cases the worker will miss the dispatcher's signal and block forever. – everclear Mar 24 '13 at 13:07
  • Well yes, of course. That is why condition variables do not focus on the event but on the condition. Your condition is a variable, for example a bool that flags 'go for it!', which your worker will read. If it is false, the worker waits on the condition variable until a signal, wakes up, and checks the 'go for it' flag. If the flag is true, it means the signal was sent before the worker arrived at that point, so it doesn't even wait. The mutex is here to access this flag with the correct memory barriers. – v.oddou Mar 25 '13 at 09:04