My task is to call C API functions with complex data that arrives in text form over the network; think of it as a form of RPC. The functions are called very frequently, so the performance requirements are strict. Currently I parse the input stream directly into the corresponding POD structure with a hand-written parser. The problem is that this parser is huge, and yes, I sometimes find bugs in it. I would like to switch to Boost to reduce the complexity of my code. Hopefully I could also improve performance by using memory pools; adding them to the current solution looks daunting, given that all of this is multithreaded.

The simplified input data looks like functionName({x,x,x,{x,x,x},<null>,x}), where each x is either a primitive value of some type or a text string (written as "blah", or as <null> when absent). A missing nested structure is likewise represented by <null>.

The output data is a POD structure. When a field is a string or a nested structure, the outer structure stores a pointer to the separately allocated data; that pointer is null if the value is missing.
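To make the target layout concrete, here is a minimal sketch in plain C++ (no Boost) of parsing straight into such a POD with a bump allocator. Everything in it is an assumption for illustration: the `Outer`/`Inner` layout, the `Arena` and `Parser` names, the fixed 4096-byte buffer, integer-only primitives, strings without escape sequences, and NUL-terminated input. It is not the real parser from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical POD layout; the real structures are not shown in the question.
struct Inner { int a; int b; };
struct Outer { int id; char* name; Inner* inner; };

// Bump allocator: intended as one instance per thread, reset after each call,
// so strings and nested structures need no individual malloc/free.
class Arena {
    alignas(std::max_align_t) unsigned char buf_[4096];
    std::size_t used_ = 0;
public:
    void* alloc(std::size_t n, std::size_t align) {
        assert((align & (align - 1)) == 0);            // power of two expected
        std::size_t at = (used_ + align - 1) & ~(align - 1);
        if (at + n > sizeof buf_) return nullptr;       // out of space
        used_ = at + n;
        return buf_ + at;
    }
    void reset() { used_ = 0; }
};

// Recursive-descent parser over NUL-terminated text; `ok` latches any error.
struct Parser {
    const char* p;
    Arena& arena;
    bool ok = true;

    void expect(char c) { if (ok && *p == c) ++p; else ok = false; }
    bool null_token() {
        if (std::strncmp(p, "<null>", 6) != 0) return false;
        p += 6;
        return true;
    }
    int integer() {
        char* end = nullptr;
        long v = std::strtol(p, &end, 10);
        if (end == p) ok = false;                       // no digits consumed
        p = end;
        return static_cast<int>(v);
    }
    char* string() {
        if (null_token()) return nullptr;               // missing string
        expect('"');
        if (!ok) return nullptr;
        const char* start = p;
        while (*p && *p != '"') ++p;                    // no escapes in this sketch
        std::size_t n = static_cast<std::size_t>(p - start);
        expect('"');
        char* s = static_cast<char*>(arena.alloc(n + 1, 1));
        if (!ok || !s) { ok = false; return nullptr; }
        std::memcpy(s, start, n);
        s[n] = '\0';
        return s;
    }
    Inner* inner() {
        if (null_token()) return nullptr;               // missing structure
        auto* r = static_cast<Inner*>(arena.alloc(sizeof(Inner), alignof(Inner)));
        if (!r) { ok = false; return nullptr; }
        expect('{'); r->a = integer();
        expect(','); r->b = integer();
        expect('}');
        return r;
    }
    bool outer(Outer& o) {
        expect('{'); o.id = integer();
        expect(','); o.name = string();
        expect(','); o.inner = inner();
        expect('}');
        return ok;
    }
};
```

With this sketch, `Parser{"{42,\"blah\",{1,2}}", arena}.outer(o)` fills `o` directly, and the pointers in `o` stay valid until `arena.reset()` is called, which would happen after the C API call returns.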

Digging through the SO answers and the Boost documentation, I could not find how to accomplish this efficiently, i.e. without first parsing into some internal "Boost-friendly" form and then rebuilding the POD structure from it.

So, to restate the subject: how do I parse nested POD structures that contain pointers?

Any help is appreciated.

avp
  • I once implemented similar functionality for data serialization (I wanted to save C++ objects to a file and then load them back). That, plus support for patching (code versioning), resulted in a module that wasn't huge. But I used pure C++ for the purpose... – Piotr Trochim Mar 14 '17 at 16:51
  • Post some code that you think ought to do what you want. You will need to keep a "global" context object which records whether objects have been seen before. – Richard Hodges Mar 14 '17 at 16:56
  • You might want to take a glance at https://capnproto.org/ , written by Kenton Varda who previously worked on Google Protobuf. There is a lot to be said for using an existing solution which someone else is maintaining for you :) – rici Mar 14 '17 at 17:50

1 Answer

With the objectives stated, I'd consider either Boost Serialization or Boost Managed Buffers (optionally in shared memory) from Boost Interprocess.

Boost Serialization

Use it with EOS portable archives and the various archive flags to avoid some overhead.

Boost Interprocess

This would shine if the object graph is "complicated" (e.g. it might contain a multi_index_container, or interrelated non-PODs that might, for example, share object representations for compression). This is somewhat more involved, but it makes anything you do in the Managed Buffer (using standard allocators) bitwise serializable.

There's no portable format here: the library is portable, but data serialized on one platform/version cannot be read on a different platform/version.

See e.g.

Many others, depending on what you want to achieve.

sehe
  • Thank you for the answer. Does it mean that boost::spirit cannot do the task? – avp Mar 15 '17 at 13:01
  • Of course not; Spirit can do it. I just happen to concur with most people that this is not worth doing the plumbing for. I've used Spirit to parse JSON serializations (generically), using complicated machinery to do the "reflection" and OData annotations, but I really don't think it was worth the trouble. Use a library: Protobuf, Cap'n Proto, Cereal, Boost Serialization, etc. – sehe Mar 15 '17 at 13:54
  • (PS: in case you think I'm not a [tag:boost-spirit] fan, just look at the [top users for that tag](http://stackoverflow.com/tags/boost-spirit/topusers) :)) – sehe Mar 15 '17 at 13:55