
I am currently trying to serialize data as a binary archive into a shared memory segment using the Boost library. I successfully implemented this with a `text_oarchive`, as seen below. Now I want to use a `binary_oarchive` instead of the `text_oarchive`.

#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <boost/interprocess/streams/bufferstream.hpp>
#include <boost/archive/text_oarchive.hpp>

using namespace boost::interprocess;

shared_memory_object::remove("shm");
shared_memory_object shm(create_only, "shm", read_write);

shm.truncate(sizeof(UnSerData)); // note: sizeof gives the in-memory struct size, not the archive size
mapped_region region(shm, read_write);

// output-only stream writing directly into the mapped region
bufferstream bs(std::ios::out);
bs.buffer(reinterpret_cast<char*>(region.get_address()), region.get_size());

boost::archive::text_oarchive oa(bs);

oa << UnSerData;

When I switch to `binary_oarchive`, compilation fails with:

error: call of overloaded ‘binary_oarchive(boost::interprocess::bufferstream&)’ is ambiguous
    boost::archive::binary_oarchive oa(bs);

shared_memory_object::remove("shm");
shared_memory_object shm(create_only, "shm", read_write);

shm.truncate(sizeof(UnSerData));
mapped_region region(shm, read_write);

bufferstream bs(std::ios::out);
bs.buffer(reinterpret_cast<char*>(region.get_address()), region.get_size());

boost::archive::binary_oarchive oa(bs); // error: ambiguous

oa << UnSerData;

I'm just not sure which kind of stream or buffer I should be using for the `binary_oarchive`. I already tried a plain `std::ostream` but couldn't get it to work. Thanks in advance.

EDIT: The JSON data looks like this:

{
  "name": "UMGR",
  "description": "UpdateManager",
  "dlt_id": "1234",
  "log_mode": ["kConsole"],
  "log_level": "kVerbose",
  "log_dir_path": "",
  "ipc_port": 33,
  "reconnection_retry_offset": 0,
  "msg_buf_size": 1000
}

This is a very simple example and the data will get more complex. I use RapidJSON to parse the data into a RapidJSON document object. Then the data is transferred into a struct that looks like this:

#include <string>
#include <boost/serialization/string.hpp> // needed to serialize std::string members

struct UMGR_s {
    std::string name;
    std::string description;
    std::string dlt_id;
    std::string log_mode;
    std::string log_level;
    std::string log_dir_path;
    unsigned int ipc_port;
    unsigned int reconnection_retry_offset;
    unsigned int msg_buf_size;
    int checksum;

    // function for serializing the struct with Boost.Serialization
    template <typename Archive>
    void serialize(Archive& ar, const unsigned int version)
    {
        ar & name;
        ar & description;
        ar & dlt_id;
        ar & log_mode;
        ar & log_level;
        ar & log_dir_path;
        ar & ipc_port;
        ar & reconnection_retry_offset;
        ar & msg_buf_size;
        ar & checksum;
    }
};
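
For illustration, a minimal sketch of how the struct could be filled from the JSON above with RapidJSON; the helper name `from_json` is hypothetical, and keeping only the first `log_mode` array entry is an assumption to match the single-string member:

#include <rapidjson/document.h>

// hypothetical helper filling the struct from an already parsed document
UMGR_s from_json(rapidjson::Document const& d)
{
    UMGR_s u{};
    u.name         = d["name"].GetString();
    u.description  = d["description"].GetString();
    u.dlt_id       = d["dlt_id"].GetString();
    // assumption: keep only the first entry of the "log_mode" array
    u.log_mode     = d["log_mode"][rapidjson::SizeType(0)].GetString();
    u.log_level    = d["log_level"].GetString();
    u.log_dir_path = d["log_dir_path"].GetString();
    u.ipc_port     = d["ipc_port"].GetUint();
    u.reconnection_retry_offset = d["reconnection_retry_offset"].GetUint();
    u.msg_buf_size = d["msg_buf_size"].GetUint();
    return u;
}

Error handling (`HasMember`, type checks) is omitted for brevity.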

This is probably not the most efficient way of parsing JSON data, but my goal is not to speed up the parsing itself; it is to optimize the system as a whole. Since I am comparing this approach against the current implementation, which also uses this JSON parser, the results should remain meaningful.

I also thought about using memory mapping instead of a shared memory implementation, because the daemon has to open the file (with the serialized data) anyway and pass it to the process. So maybe it would be more efficient to just let the receiving process gather the data via a memory-mapped file with the Boost library.
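
A minimal sketch of that reading side, assuming the serialized archive was previously written to a file named `config.bin` (the file name is a placeholder):

#include <boost/interprocess/file_mapping.hpp>
#include <boost/interprocess/mapped_region.hpp>
#include <boost/interprocess/streams/bufferstream.hpp>
#include <boost/archive/text_iarchive.hpp>

using namespace boost::interprocess;

// map the file containing the serialized data
file_mapping file("config.bin", read_only);
mapped_region region(file, read_only);

// read-only stream over the mapped bytes
bufferstream bs(std::ios::in);
bs.buffer(static_cast<char*>(region.get_address()), region.get_size());

boost::archive::text_iarchive ia(bs);
UMGR_s cfg;
ia >> cfg; // reading stops at the end of the archive; trailing bytes are ignored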

mxmlntr
2 Answers


I cannot reproduce the error you describe:

Compiling On Coliru

Using a file mapping allows us to even run it on Coliru:

Live On Coliru

Prints

00000000: 3232 2073 6572 6961 6c69 7a61 7469 6f6e  22 serialization
00000010: 3a3a 6172 6368 6976 6520 3137 2030 2030  ::archive 17 0 0
00000020: 0a00 0000 0000 0000 0000 0000 0000 0000  ................
00000030: 0000 0000 0000 0000 0000 0000 0000 0000  ................
*
000027f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

Thoughts

Back In The Box

Since you're using shared memory, probably for a reason, don't you just want to skip the whole step of serializing?

Depending on your data this could be very simple, or require some work.

It would be Very Simple (TM) if your data type is POD. In that case you can expect to store a copy of exactly `sizeof(UnSerData)` bytes in the mapped region (and only then).
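
A sketch of the POD case, reusing `region` from the code above (the `Config` type is a hypothetical trivially copyable stand-in for the real data):

#include <cstring>
#include <type_traits>

struct Config {            // hypothetical POD stand-in
    char     name[16];
    unsigned ipc_port;
};
static_assert(std::is_trivially_copyable<Config>::value, "raw copy requires a trivially copyable type");

// writer process: copy the raw bytes straight into the mapped region
Config cfg{"UMGR", 33};
std::memcpy(region.get_address(), &cfg, sizeof cfg);

// reader process: view the same bytes through its own mapping
auto const* shared = static_cast<Config const*>(region.get_address());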

If your type uses internal pointers or allocation, I suggest `managed_shared_memory` instead. The BIP allocator uses `offset_ptr`, which is safe to use inside the shared memory area, so you need no serialization (just synchronization) to access the data from other processes.

I have plenty of examples of using managed_shared_memory and allocator/scoped_allocator_adaptor on this site, with varying degrees of complexity in case you want to have a look.
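
As a minimal sketch of that direction (the `ShmConfig` type and its reduced field set are placeholders, not the full struct from the question):

#include <boost/interprocess/managed_shared_memory.hpp>
#include <boost/interprocess/shared_memory_object.hpp>
#include <boost/interprocess/containers/string.hpp>
#include <boost/interprocess/allocators/allocator.hpp>

namespace bip = boost::interprocess;

// a string type whose storage lives inside the shared memory segment
using char_alloc = bip::allocator<char, bip::managed_shared_memory::segment_manager>;
using shm_string = bip::basic_string<char, std::char_traits<char>, char_alloc>;

struct ShmConfig {         // placeholder for a shared-memory-aware config
    shm_string name;
    unsigned   ipc_port = 0;

    explicit ShmConfig(char_alloc alloc) : name(alloc) {}
};

int main() {
    bip::shared_memory_object::remove("shm");
    bip::managed_shared_memory segment(bip::create_only, "shm", 64 * 1024);

    // construct the object in place; no serialization step involved
    auto* cfg = segment.construct<ShmConfig>("config")(char_alloc(segment.get_segment_manager()));
    cfg->name     = "UMGR";
    cfg->ipc_port = 33;
}

Another process opens the segment with `bip::open_only` and retrieves the object via `segment.find<ShmConfig>("config").first`.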

sehe
  • Wow, thank you so much for this detailed answer. The goal is to reduce the size of the data (UnSerData). The data is interpreted from a JSON file at the initial start of the system. After interpreting and serializing the JSON file, the data shall be distributed to the processes via SHM. When shutting down the system, the serialized data is stored in a file. When the system is restarted, that serialized data is read from the file and distributed via SHM to the processes. So the interpreter effort is gone, and the file is read faster since it is smaller than the initial JSON file. – mxmlntr Jul 22 '20 at 08:56
  • Yeah. Sounds like you should follow one goal more clearly. When you (de)serialize, data is being copied **and** interpreted all the same. If you can show me what the data would actually look like, I can help. But also read this [similar question](https://stackoverflow.com/questions/62809697/mocking-a-boost-shared-memory-derived-class-with-gtest/62817366#62817366). – sehe Jul 22 '20 at 10:57
  • Edited my initial question. – mxmlntr Jul 22 '20 at 12:30
  • To the edit: [shared memory approach](https://godbolt.org/z/MfGK17). Using a mapped file for [online execution](http://coliru.stacked-crooked.com/a/aaba323ff3b3bf9c). – sehe Jul 25 '20 at 00:03
  • Expanded with advanced stuff like nested containers (log_mode is now `set`, there's a list of `UMGR_s` now instead of just one): http://coliru.stacked-crooked.com/a/a386c351bdd2d7d9 – sehe Jul 25 '20 at 00:50
  • Wow, thank you so much for your effort. The community should be proud to have someone like you >:) – mxmlntr Jul 27 '20 at 06:51

@sehe, you cannot reproduce @mxmlntr's issue because, as he said, the problem only occurs when using `boost::archive::binary_oarchive`, while your code uses `boost::archive::text_oarchive`, which the original post already shows is NOT a problem.
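
If the ambiguity does occur on the asker's Boost version, a plausible explanation (an assumption, not verified against that version) is that `bufferstream` is usable both as a `std::ostream` and as a `std::streambuf`, and `binary_oarchive`, unlike `text_oarchive`, has a constructor overload for each. An explicit cast would then select one overload:

#include <boost/archive/binary_oarchive.hpp>
#include <ostream>

// force the std::ostream& constructor overload of binary_oarchive
boost::archive::binary_oarchive oa(static_cast<std::ostream&>(bs));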