9

I have code that currently passes around a lot of (sometimes nested) C (or C++ Plain Old Data) structs and arrays.

I would like to convert these to/from Google protocol buffers. I could manually write code that converts between the two formats, but it would be less error-prone to auto-generate such code. What is the best way to do this? (This would be easy in a language with enough introspection to iterate over the names of member variables, but this is C++ code we're talking about.)

One thing I'm considering is writing Python code that parses the C structs and then spits out a .proto file, along with C code that copies from member to member (in either direction) for all of the types. But maybe there is a better way, or maybe there is another IDL that can already generate:

  1. .h file containing all of the nested types
  2. .proto file containing the equivalents
  3. .c file with functions that copy in either direction between the C++ classes generated from the .proto file and the structs defined in the .h file
Andrew Wagner
  • I am a bit confused, are you passing around *data* or *code*? If all you are passing around is data, then what's wrong with any serialisation library? If your data needs to be read by different languages, I'd consider something like `json` or similar. If you are interested in sharing code, then this is a different problem altogether. ROS uses Python and some library to generate C++ classes from messages, and I'm sure there are many other frameworks out there supporting code generation. Why are you using protobuf? – Ælex Sep 05 '18 at 09:35

4 Answers

1

Protocol buffers can be built by parsing an ASCII representation using TextFormat. So one option would be to add a method dumpAsciiProtoBuf to each of your structs. The method would dump any simple fields (strings, bools, etc.) and call dumpAsciiProtoBuf recursively on nested struct fields. You would then have to make sure that the concatenated result is a valid ASCII protocol buffer that TextFormat can parse.

Note though that this might have some performance implications (since parsing the ASCII representation could be expensive). However, this would save you the trouble of writing a converter in a different language, so it seems to be a convenient solution.
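To illustrate the shape of the text such a dump method has to produce, here is a minimal sketch in Python rather than C++. It uses `ctypes.Structure` classes as stand-ins for the C structs, because their `_fields_` list can be iterated; in C++ each `dumpAsciiProtoBuf` would still have to name its fields by hand. The struct names `Point` and `Segment` and the function `dump_text_proto` are hypothetical.

```python
import ctypes


class Point(ctypes.Structure):
    # Stand-in for: struct Point { int x; int y; };
    _fields_ = [("x", ctypes.c_int), ("y", ctypes.c_int)]


class Segment(ctypes.Structure):
    # Stand-in for: struct Segment { Point start; Point stop; bool closed; };
    _fields_ = [("start", Point), ("stop", Point), ("closed", ctypes.c_bool)]


def dump_text_proto(struct, indent=0):
    """Return a protobuf text-format rendering of a ctypes struct."""
    pad = "  " * indent
    lines = []
    for name, _ctype in struct._fields_:
        value = getattr(struct, name)
        if isinstance(value, ctypes.Structure):
            # Nested struct: emit a nested message block and recurse.
            lines.append(f"{pad}{name} {{")
            lines.append(dump_text_proto(value, indent + 1))
            lines.append(f"{pad}}}")
        elif isinstance(value, bool):
            # Text format spells booleans as true/false.
            lines.append(f"{pad}{name}: {'true' if value else 'false'}")
        else:
            lines.append(f"{pad}{name}: {value}")
    return "\n".join(lines)


seg = Segment(Point(1, 2), Point(3, 4), True)
text = dump_text_proto(seg)
print(text)
```

The resulting string is meant to be handed to a text-format parser, for example `TextFormat::ParseFromString` in C++ or `google.protobuf.text_format.Parse` in Python, together with a message type whose field names match. The sketch only covers integers, booleans and nested structs; strings would need quoting and escaping, and arrays would need one line per element.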

P.W
Dino
  • Thanks for the reply. I don't think a C++ class can iterate through its own member variables though, can it? I'm trying to avoid having to maintain multiple pieces of code that iterate through hard-coded structure fields. – Andrew Wagner Oct 26 '12 at 12:16
1

I could not find a ready-made solution for this problem. If there is one, please let me know!

If you decide to roll your own in Python, the Python bindings for gdb might be useful. You could then read the symbol table, find all structs defined in a specified file, and iterate over all struct members. Then use <gdbtype>.strip_typedefs() to get the primitive type of each member and translate it to the appropriate protobuf type.

This is probably safer than a text parser, as it will handle types that depend on the architecture, compiler flags, preprocessor macros, etc.

I guess the code to convert to and from protobuf could also be generated from the struct-member-to-message-field relation, but that does not sound easy.
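For flat structs with scalar members, generating the converters from that relation can be sketched roughly as below. This is only an illustration under several assumptions: the relation is given as hypothetical (struct member, message field) pairs, which in practice might be collected inside gdb; the names `Sample`, `SampleMsg`, `ToProto` and `FromProto` are made up; and the emitted C++ relies on the `set_<field>()` and `<field>()` accessors that protoc generates for scalar fields. Nested structs, arrays and strings are not handled.

```python
# Hypothetical relation between struct members and message fields.
FIELDS = [("id", "id"), ("score", "score"), ("isActive", "is_active")]


def emit_copy_functions(struct_name, message_name, fields):
    """Return C++ source for copy functions in both directions (scalars only)."""
    to_pb = [f"void ToProto(const {struct_name}& in, {message_name}* out) {{"]
    from_pb = [f"void FromProto(const {message_name}& in, {struct_name}* out) {{"]
    for member, field in fields:
        # struct -> message uses the generated setter.
        to_pb.append(f"  out->set_{field}(in.{member});")
        # message -> struct uses the generated getter.
        from_pb.append(f"  out->{member} = in.{field}();")
    to_pb.append("}")
    from_pb.append("}")
    return "\n".join(to_pb + [""] + from_pb)


code = emit_copy_functions("Sample", "SampleMsg", FIELDS)
print(code)
```

The hard part the answer alludes to is everything this sketch leaves out: repeated fields for arrays, recursion into nested messages, and members that have no direct protobuf equivalent.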

sfrank
1

The question brings up the age-old challenge with C (and C++) code: there is no easy (or standard) way to reflect on C structs (or classes). Just search Stack Overflow for C reflection, and you will see a lot of unsuccessful attempts. My first advice would be NOT to try to build another solution (in Python, etc.).

One simple approach: consider using gdb ptype to get structured output for your structures, which you can use to create the .proto file. The advantage is that there is no need to handle the full syntax of the C language (#define, line breaks, ...). See "How do I show what fields a struct has in GDB?"

From the gdb ptype output, it's a short trip to a protobuf .proto file.
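As a rough sketch of that short trip, the following Python turns ptype-style text for one flat struct into a message definition. The sample text, the struct name `sample`, the helper `ptype_to_proto`, and the partial type table are all assumptions made for illustration; real ptype output with pointers, arrays, nested structs or bit-fields would need more than this regular expression.

```python
import re

# Assumed sample of what `ptype struct sample` prints for a flat struct.
PTYPE_OUTPUT = """type = struct sample {
    int id;
    double score;
    unsigned int flags;
}"""

# Partial, assumed mapping from C scalar types to protobuf scalar types.
C_TO_PROTO = {
    "int": "int32",
    "unsigned int": "uint32",
    "long long": "int64",
    "float": "float",
    "double": "double",
    "bool": "bool",
}


def ptype_to_proto(ptype_text):
    """Convert ptype output for a flat struct into a protobuf message block."""
    header = re.search(r"struct\s+(\w+)\s*\{", ptype_text)
    lines = [f"message {header.group(1).capitalize()} {{"]
    # Each member line looks like "<type words> <name>;".
    members = re.findall(r"^\s*([\w ]+?)\s+(\w+);\s*$", ptype_text, re.MULTILINE)
    for number, (ctype, name) in enumerate(members, start=1):
        # Field numbers are simply assigned in declaration order.
        lines.append(f"  {C_TO_PROTO[ctype]} {name} = {number};")
    lines.append("}")
    return "\n".join(lines)


proto = ptype_to_proto(PTYPE_OUTPUT)
print(proto)
```

Only the message block is produced; a complete .proto file would also need a syntax line and, usually, a package declaration. Assigning field numbers by declaration order is convenient but means that reordering struct members changes the wire format.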

You can get a similar result from libclang (and I believe there is a comparable gcc plugin, but I cannot locate it). However, you will have to write some non-trivial C code.

Another approach is to use SWIG (https://www.swig.org) and process its XML output (or use the -xmlout option) to dump the parse tree into XML. While this approach requires a little digging to locate the structures that are needed, the information in the XML is complete and easy to parse with whatever XML parser you want (Python, Perl). If you are brave enough, you can use XSLT to generate the output.

dash-o
0

I would not parse the C source code myself. Instead, I would use libclang to parse the C files into an AST and write my own AST walker to generate the Protobuf definitions and the transcoders as necessary. Googling for "libclang walk AST" should give you something to start with, such as ast-walker.cc and ast-dumper.cc from this GitHub repository.

bobah