Boost.Interprocess notify() performance

Question

I have two processes, A and B, that need to exchange data quickly through shared memory using Boost.Interprocess on Windows 10. My problem: the delay between notify_all() and the return from wait() is often quite long (regularly about 15 milliseconds). I wrote a simple application to reproduce the problem, using QueryPerformanceCounter on Windows for accurate timing. The performance counter's initialization values are stored in the shared memory so that timestamps taken in A and B are comparable.

I use this common structure that is put in shared memory:

namespace bi = boost::interprocess;
struct SharedStruct
{
    double          time; // Time to share between process A and B

    // Sync objects
    bi::interprocess_mutex      mutex;
    bi::interprocess_condition  cond_data_ready;
    volatile bool               data_ready;
    volatile bool               exited;

    // timing information for debug
    long long       counter_start;
    double          pc_freq;
};
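
GetCounter() is not shown in the question. What follows is only a minimal sketch of what such a helper might look like, assuming it follows the usual QueryPerformanceCounter pattern and that pc_freq holds ticks per millisecond (which would match the millisecond values in the output). TimingInfo, ReadRawCounter, InitTiming and ElapsedMs are hypothetical names introduced here, and the std::chrono branch exists only so that the sketch also compiles off Windows.

```cpp
#ifdef _WIN32
#include <windows.h>
#else
#include <chrono>
#endif

// Hypothetical stand-in for the two timing fields of SharedStruct.
struct TimingInfo
{
    long long counter_start; // raw tick count captured once at start-up
    double    pc_freq;       // assumption: ticks per millisecond
};

// Reads the raw tick count and the tick rate of the underlying clock.
inline void ReadRawCounter(long long& ticks, long long& ticks_per_second)
{
#ifdef _WIN32
    LARGE_INTEGER now, freq;
    QueryPerformanceCounter(&now);
    QueryPerformanceFrequency(&freq);
    ticks = now.QuadPart;
    ticks_per_second = freq.QuadPart;
#else
    // Fallback for non-Windows builds only; not what the question uses.
    using clock = std::chrono::steady_clock;
    ticks = clock::now().time_since_epoch().count();
    ticks_per_second = clock::period::den / clock::period::num;
#endif
}

// Called once by the process that creates the shared memory.
inline TimingInfo InitTiming()
{
    long long ticks = 0, per_second = 0;
    ReadRawCounter(ticks, per_second);
    return TimingInfo{ticks, static_cast<double>(per_second) / 1000.0};
}

// Milliseconds elapsed since counter_start, usable from either process.
inline double ElapsedMs(const TimingInfo& t)
{
    long long ticks = 0, per_second = 0;
    ReadRawCounter(ticks, per_second);
    return static_cast<double>(ticks - t.counter_start) / t.pc_freq;
}
```

Because both processes subtract the same counter_start and divide by the same pc_freq, their readings share one origin and one unit, which is what makes the difference computed in process B meaningful.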

The first process creates the shared memory, initializes the timing information, and then runs this function:

// Process A (producer)
void Run()
{
    for (int i = 0; i != 10; ++i)
    {
        Sleep(1000); // Sleep 1 sec to simulate acquiring data
        // In my real code, this is not a Sleep but a WaitForMultipleObjects() call
        {
            // Update the shared data while holding the lock
            bi::scoped_lock<mutex_type> lock(m_main_struct->mutex);
            double counter = GetCounter();
            m_main_struct->time = counter;
            m_main_struct->data_ready = true;
        }
        m_main_struct->cond_data_ready.notify_all();
    }
    bi::scoped_lock<mutex_type> lock(m_main_struct->mutex);
    m_main_struct->exited = true;
    m_main_struct->cond_data_ready.notify_all();
}

Process B opens the already existing shared memory and reads the data.

// Process B : read and print timestamps diff
void Run()
{
    bi::scoped_lock<mutex_type> lock(m_main_struct->mutex);
    while (!m_main_struct->exited)
    {
        m_main_struct->cond_data_ready.wait(lock);
        if(m_main_struct->data_ready)
        {
            m_main_struct->data_ready = false;
            double counter = GetCounter();
            double diff = counter - m_main_struct->time;
            std::cout << boost::lexical_cast<std::string>(counter) << " and the diff is : " << diff << std::endl;
        }
    }
    std::cout << "Exiting..." << std::endl;
}

The output looks something like this:

3011.8175661852633 and the diff is : 0.0160456
4044.836078038385 and the diff is : 15.6379
5061.9643649653617 and the diff is : 15.5186
6079.200594853236 and the diff is : 15.6448
7075.2902152258657 and the diff is : 0.0218803
8119.8910797177004 and the diff is : 26.4905
9099.8308285825678 and the diff is : 0.0379259
10122.977664923899 and the diff is : 0.0393846
11140.647854688354 and the diff is : 0.0145869
12158.33992478272 and the diff is : 0.0237037
Exiting...

Sometimes the time between the notify_all() and the return from wait() is less than 1 millisecond (which is fine), but sometimes it is about 15 milliseconds, which I think is far too much. If I use Windows events for notifying/waiting instead, the problem disappears (approx. 0.05 ms).

Is it normal? Any idea?

Does not sound excessive. What is the expected latency on scheduling a thread on your OS? — Richard Critten, Nov 13 '17 at 17:27
15.625 msec is a magic number, that is the default clock interrupt rate. It affects that Sleep() call, you'll call notify_all() a handful of nanoseconds after the thread scheduler did its job. How eager it might be to do the job again so quickly after it just did it and get the wait() to complete is questionable, there has to be some protection against it. Consider doing this test without Sleep(), burning core for a random while like a thread would normally do. — Hans Passant, Nov 13 '17 at 17:48
@HansPassant I did not know that. But that seems really to be my case. In my real life program, I have no `Sleep()` at all, but a `WaitForMultipleObjects()` call instead. That said, according to the documentation it suffers from the very same problem. Today I've tried to use windows events for notifying/waiting and surprise I don't have this problem anymore. But this approach is not compatible with boost.interprocess as the wait must unlock the mutex atomically. So I guess there must be a solution to this problem... — poukill, Nov 14 '17 at 16:43
@HansPassant By the way, when you say "burning core for a random while like a thread would normally do", well this is not what I am doing in my application. The example from my post is almost my use case! I can have large data but not so often, like one image or vector per second. I work on a event-based system. I don't need heavy CPU calculation but I need to be fast to give that information. Let's say for example I work on real-time security detection, then 15.6 msec is too much. :) — poukill, Nov 15 '17 at 15:14
Unrelated to your inquiry, but... the [example for `notify_all()` given by the Boost docs](https://theboostcpplibraries.com/boost.interprocess-synchronization#ex.interprocess_14) had me stumped, so I googled for "boost interprocess notify_all", came here... and was enlightened. :-) Thanks for this bit of code! — DevSolar, Aug 05 '19 at 09:10
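
The 15.625 ms figure from the comments can be checked against the measured output with a little arithmetic: 15.625 ms is exactly 1/64 of a second. The sketch below is only an illustration of that check, not a fix; kTicksPerSecond, kTickMs and LooksLikeOneTick are hypothetical names, and the 0.5 ms tolerance is an arbitrary assumption.

```cpp
#include <cmath>

// 15.625 ms per tick corresponds to 64 timer interrupts per second.
constexpr double kTicksPerSecond = 64.0;
constexpr double kTickMs = 1000.0 / kTicksPerSecond;
static_assert(kTickMs == 15.625, "one tick at 64 Hz is 15.625 ms");

// True when a measured latency (in ms) is within tolerance_ms of one tick.
// The default tolerance is an arbitrary choice for this illustration.
inline bool LooksLikeOneTick(double diff_ms, double tolerance_ms = 0.5)
{
    return std::fabs(diff_ms - kTickMs) <= tolerance_ms;
}
```

Applied to the output in the question, the readings 15.6379, 15.5186 and 15.6448 fall within the tolerance of one tick, while the sub-millisecond readings and the 26.4905 outlier do not.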
