I need an ultra-fast message queue (MQ) mechanism on Windows, where both the sender and the receiver are written in C++.

My current implementation using RCF-C++ for IPC is clocking at around 20,000 msg/sec over Windows Named Pipes.

I am testing the performance of the boost::interprocess message queue, following the demo app, and I measure around 48,000 messages/sec. That is surprisingly slow: when I cooked up a simple memory-mapped file communication on the same machine (in C#, using code from this blog post), I got around 150,000 messages/sec.

Any idea about why I'm getting such slow performance out of boost message_queue, and what I can try to improve it?

Omer Raviv

2 Answers

Daniel's answer is part of it, but there is a bigger issue here: boost::interprocess basically maintains the queue as an array in shared memory. When you send a message, boost::interprocess::message_queue does a binary search on the new message's priority to find where the message should be placed in the array, and then uses std::copy_backward to shift the messages that follow and make room for it. If you always use the same priority, your message is placed at the beginning (since it's the newest), so whatever messages are in the buffer at that time are all shifted to make room for it, which takes time. (See the implementation of the queue_free_msg method.)
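To make the cost concrete, here is a minimal single-process sketch of the insertion scheme described above. It is a toy model, not Boost's actual code: the `Msg` struct, the `SortedArrayQueue` class, and the choice to keep the array in ascending priority order and receive from the back are all assumptions made for illustration. `send` returns how many existing messages had to be shifted.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical message record: only the priority matters for this sketch.
struct Msg {
    unsigned priority;
    int id;
};

// Toy model of a priority-sorted array queue (assumed layout, not Boost's code).
// Messages are sorted by ascending priority and received from the back, so a
// new message is placed in front of older messages with the same priority.
class SortedArrayQueue {
public:
    // Inserts a message and returns how many existing messages were shifted.
    std::size_t send(unsigned priority, int id) {
        // Binary search for the first slot whose priority is >= the new one.
        auto pos = std::lower_bound(
            msgs_.begin(), msgs_.end(), priority,
            [](const Msg& m, unsigned p) { return m.priority < p; });
        const std::ptrdiff_t index = pos - msgs_.begin();
        const std::size_t shifted = msgs_.size() - static_cast<std::size_t>(index);

        msgs_.push_back(Msg{0, 0});  // grow by one slot (invalidates pos)
        // Shift [index, old end) one slot to the right to open a gap.
        std::copy_backward(msgs_.begin() + index, msgs_.end() - 1, msgs_.end());
        msgs_[static_cast<std::size_t>(index)] = Msg{priority, id};
        return shifted;
    }

    // Removes and returns the message at the back: highest priority, and the
    // oldest among messages that share that priority.
    Msg receive() {
        assert(!msgs_.empty());
        Msg m = msgs_.back();
        msgs_.pop_back();
        return m;
    }

    std::size_t size() const { return msgs_.size(); }

private:
    std::vector<Msg> msgs_;
};
```

With a single priority, the n-th send in this model shifts all n-1 messages already in the queue, so the cost of each send grows with the queue length.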

If you don't need messages to have priorities and just want a regular FIFO queue, then this approach is a lot slower than a circular buffer: each send costs time proportional to the number of messages already queued, so the performance of insertions (sends) deteriorates rapidly as the size of the queue grows.

UPDATE: I wrote a version of the message_queue that uses a circular buffer internally, with help from the notes on Wikipedia, and this was a big success.
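For comparison, a circular buffer handles FIFO sends and receives without moving any stored messages. The sketch below is not the implementation mentioned in the update; it is a minimal single-threaded illustration, and the `RingQueue` name and its interface are assumptions. A real interprocess queue would also need shared memory placement and synchronization, which are left out here.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Fixed-capacity FIFO ring buffer (single-threaded sketch, no IPC or locking).
template <typename T, std::size_t Capacity>
class RingQueue {
    static_assert(Capacity > 0, "Capacity must be positive");

public:
    // Appends a value at the tail; returns false if the queue is full.
    bool push(const T& value) {
        if (count_ == Capacity) return false;
        slots_[(head_ + count_) % Capacity] = value;
        ++count_;
        return true;
    }

    // Moves the oldest value into `out`; returns false if the queue is empty.
    bool pop(T& out) {
        if (count_ == 0) return false;
        assert(head_ < Capacity);        // invariant: head always in range
        out = slots_[head_];
        head_ = (head_ + 1) % Capacity;  // wrap around instead of shifting
        --count_;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    std::array<T, Capacity> slots_{};
    std::size_t head_ = 0;   // index of the oldest element
    std::size_t count_ = 0;  // number of stored elements
};
```

Both `push` and `pop` touch one slot and update two counters, so their cost does not depend on how many messages are queued.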

Omer Raviv
  • 2
    Omer, you could send your version of the message_queue that uses a circular buffer to Boost itself. They may accept it! – Pietro Aug 10 '11 at 10:43
  • 1
    @Pietro Too lazy to do that, but you can see the code at https://gist.github.com/3171076 This code needs to be cleaned up - all mention of message priority should be removed, the new implementation completely ignores it. – Omer Raviv Jul 24 '12 at 16:38
  • 9
    I know this is pretty old, but I'm checking to use message_queue in boost, and I found out that it's in boost now (since 1.52: https://www.boost.org/users/history/version_1_52_0.html, see BOOST_INTERPROCESS_MSG_QUEUE_CIRCULAR_INDEX ) – MGamsby May 09 '18 at 18:54

As the Boost documentation states, boost::interprocess::shared_memory_object is implemented on Win32 using a memory-mapped file, and Boost's message queue uses that simulated shared memory object as well. (For native Win32 shared memory, Boost provides a separate windows_shared_memory class.)

Therefore, for better message queue performance, you have to implement your own version of the message queue on top of the native Win32 shared memory object. In my experiments, performance increased noticeably after I made that replacement.

Note that if you switch to native Win32 shared memory, you must take care of the 'deletion' of the shared memory yourself. POSIX shared memory and Win32 shared memory have different deletion policies.

Daniel K.
  • Thanks! Doh. I now notice that indeed during my tests a file was created in the filesystem and used to 'simulate' shared memory. Any chance you could share your implementation, or know a different C++ MQ/RPC framework based directly on Windows native shared memory? I find it hard to believe there isn't an out-of-the-box solution somewhere? – Omer Raviv Jun 02 '11 at 10:30
  • 1
    I tried to change boost message_queue to use windows_shared_memory as you suggested, with the following edits to message_queue.hpp that I found on google: (1) replace detail::managed_open_or_create_impl< windows_shared_memory, false> m_shmem; with detail::managed_open_or_create_impl m_shmem; (2) change the header to include "windows_shared_memory.hpp" instead of "shared_memory_object.hpp" (3) comment out the message_queue::remove to just return true; Now I don't see it creating a file on the file system, but the performance is exactly the same. Any idea? – Omer Raviv Jun 02 '11 at 11:35
  • I think your implementation is almost the same as mine. But the performance can depend on many factors such as the length of messages. Among other things, in my test, opening a message queue was way faster when using Win32 shared memory. – Daniel K. Jun 07 '11 at 01:26
  • Daniel's comment is spot on. Both end up using memory-mapped files, but the windows_shared_memory uses files stored by the system paging file, vs. temporary regular files used by shared_memory_object. This lets windows_shared_memory skip the actual creation of the file on disk (I imagine unless the OS pages out your memory), which can be a huge deal if you are sending very large messages; otherwise, just allocation of the file on disk can take seconds/minutes, even though the subsequent reads/writes are just as fast as windows_shared_memory – aggieNick02 Apr 25 '19 at 20:32
  • 1
    There's even an open ticket in boost from 7 years ago to add support for windows_shared_memory to message queue, but it never seemed to get traction: https://svn.boost.org/trac10/ticket/7027 – aggieNick02 Apr 25 '19 at 20:34