How to dynamically build a new protobuf from a set of already defined descriptors?

Question

At my server, we receive Self Described Messages (as defined here... which btw wasn't all that easy as there aren't any 'good' examples of this in c++).

At this point I am having no issue creating messages from these self-described ones. I can take the FileDescriptorSet, go through each FileDescriptorProto, adding each to a DescriptorPool (using BuildFile, which also gives me every defined FileDescriptor).

From here I can create any message defined in the FileDescriptorSet by constructing a DynamicMessageFactory with the DescriptorPool (DP) and calling GetPrototype. This is easy because our SelfDescribedMessage requires the message's full_name(), so we can call the DP's FindMessageTypeByName method to get the Descriptor and, from it, the correct message prototype.

The question is: how can I take each already defined Descriptor or message and dynamically BUILD a 'master' message that contains all of the defined messages as nested messages? This would primarily be used for saving the current state of the messages. Currently we handle this by instantiating one of each message type in the server (to keep a central state across different programs). But when we want to 'save off' the current state, we're forced to stream them to disk as defined here. They're streamed one message at a time (with a size prefix). We'd like to have ONE message (one to rule them all) instead of a steady stream of separate messages. Once it is worked out, this could be used for other things as well (network-based shared state with optimized and easy serialization).

Since we already have the cross-linked and defined Descriptors, one would think there would be an easy way to build 'new' messages from those already defined ones. So far the solution has eluded us. We've tried creating our own DescriptorProto and adding new fields whose types come from our already defined Descriptors, but got lost (we haven't dug deeply into this one yet). We've also looked at possibly adding them as extensions (we don't yet know how to do so). Do we need to create our own DescriptorDatabase (we also don't yet know how to do so)?

Any insights?

Linked example source on BitBucket.

Hopefully this explanation will help.

I am attempting to dynamically build a Message from a set of already defined Messages. The set of already defined messages is created using the "self-described" method explained (briefly) in the official C++ protobuf tutorial (i.e. these messages are not available in compiled form). The newly defined message will need to be created at runtime.

I have tried using the Descriptors of each message directly and attempted to build a FileDescriptorProto. I have also tried looking at the DescriptorDatabase methods. Both with no luck. I am currently attempting to add these defined messages as an extension to another message (even though at compile time those defined messages, and their 'descriptor-set', were not declared as extending anything), which is where the example code starts.

wow... not even a comment... Here's where I'm at so far. This is the only one I have public source ... it obviously doesn't compile right now (all good up until the very end where the ExtensionSet is first created)... Trying to go the extensions route at the moment as the other two have failed me as of yet. http://goo.gl/VJhnk — g19fanatic, Aug 22 '12 at 03:26
The issue that I am having at the moment is in the initialization of the Extension Identifier. I need a class to point for the MessageTypeTraits to that of one that describes the message type (might have to do my own templateing magic?) but have been unsuccessful as of yet... — g19fanatic, Aug 22 '12 at 03:29
Honestly, I read your question 3 times, and still fail to understand what are you describing. I think this happens to most readers, that's why you didn't get a reply. You do need to simplify stuff. Also, it really feels like you're building something overcomplicated, where a much easier solution is possible. — Codeguard, Aug 22 '12 at 12:38
The description is basically as simple as it can get. The now included source is also a very stripped down version of what I am describing. I am certainly interested in suggestions towards an easier solution. — g19fanatic, Aug 22 '12 at 13:03
once you have deserialized a message using the prototype, what can you usefully do with it? A big if-else if-else if construction to figure out how to cast it back to the original type? — Managu, Aug 22 '12 at 15:12
we use it to keep state across different programs (shared IPC of sorts). This allows us to support any number of client programs without having to know the memory structure before hand. Each client 'registers' messages that it would like to receive/send and we update the current state from the client programs. The issue is when we are serializing the messages for storage or parsing them for state retrieval between sessions. We are able to do it serially like we currently are, just need to improve it to one defined message. — g19fanatic, Aug 22 '12 at 16:00
I would think your first approach (*"creating our own DescriptorProto and adding new fields of the type from our already defined Descriptors"*) would be the easiest. Can the `DescriptorPool` or the `DynamicMessageFactory` give you a list of messages? — Beta, Aug 22 '12 at 17:20
@Beta, this is the method that I was trying to do at first but think I was missing a small key piece... Any code suggestions? — g19fanatic, Aug 23 '12 at 20:21
I asked a [question on google groups](https://groups.google.com/forum/?fromgroups=#!topic/protobuf/4HHLcMy9EAQ) regarding this problem. If anyone can put together some example code within the next 22 hours of the suggested solution, I will award the bounty to that person. The answer that is below is not helpful. — g19fanatic, Aug 26 '12 at 01:53
@g19fanatic: Can you point me to a tutorial how-to use self-describing messages or explain to me how-to use a `FileDescriptorSet` to retrieve the comments for a message from a .proto file? I don't get it after reading the official document. I think it is the correct approach to solve my problem described here: http://stackoverflow.com/questions/32742601/reading-comments-from-proto-files-using-a-protocol-buffers-descriptor-object — Florian Wolters, Sep 24 '15 at 11:41

score 6 · Answer 1 · edited Apr 17 '14 at 01:38

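You need a protobuf::DynamicMessageFactory:

{
  using namespace google;

  // some_desc is the const protobuf::Descriptor* of the message type to build.
  // The factory must outlive every message created from its prototypes.
  protobuf::DynamicMessageFactory dmf;
  protobuf::Message* actual_msg = dmf.GetPrototype(some_desc)->New();

  // Fields of a dynamic message are read and written through reflection.
  const protobuf::Reflection* refl = actual_msg->GetReflection();

  const protobuf::FieldDescriptor* fd = some_desc->FindFieldByName("someField");
  refl->SetString(actual_msg, fd, "whee");

  ... 

  std::cout << actual_msg->DebugString() << std::endl;

  delete actual_msg;  // New() transfers ownership to the caller
}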

score 5 · Accepted Answer · answered Aug 22 '12 at 18:09

I was able to solve this problem by dynamically creating a .proto file and loading it with an Importer.

The only requirement is for each client to send across its proto file (only needed at init, not during full execution). The server then saves each proto file to a temp directory. An alternative, if possible, is to just point the server to a central location that holds all of the needed proto files.

This was done by first using a DiskSourceTree to map actual path locations to in-program virtual ones. The server then builds a master .proto file that imports every proto file that was sent across AND defines an optional field for each message in a 'master message'.

After the master.proto has been saved to disk, I import it with the Importer. Now, using the Importer's DescriptorPool and a DynamicMessageFactory, I'm able to reliably generate the whole set of messages under one message. I will be putting an example of what I am describing up later on tonight or tomorrow.
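As a rough illustration of the generation step only (not the author's actual code), the text of such a master .proto file could be assembled like this. The `SubMessage` struct, the `BuildMasterProto` function, and the field naming are hypothetical, and the `syntax = "proto2";` line is an assumption. Writing the result to disk and loading it through the DiskSourceTree and Importer is not shown.

```cpp
#include <algorithm>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical description of one message to embed in the master message.
struct SubMessage {
    std::string proto_file;  // virtual path of the defining file, e.g. "client_a.proto"
    std::string full_name;   // fully qualified message name, e.g. "client_a.Status"
    std::string field_name;  // field name to use inside the master message
};

// Builds the text of a proto2 file that imports every listed .proto file
// once and declares one optional field per sub-message in a single message.
std::string BuildMasterProto(const std::string& master_name,
                             const std::vector<SubMessage>& subs) {
    std::ostringstream out;
    out << "syntax = \"proto2\";\n\n";

    // Emit each import only once, in first-seen order.
    std::vector<std::string> seen;
    for (const SubMessage& s : subs) {
        if (std::find(seen.begin(), seen.end(), s.proto_file) == seen.end()) {
            seen.push_back(s.proto_file);
            out << "import \"" << s.proto_file << "\";\n";
        }
    }

    out << "\nmessage " << master_name << " {\n";
    int tag = 1;
    for (const SubMessage& s : subs) {
        // Field numbers 19000-19999 are reserved by protobuf, so skip them.
        if (tag == 19000) tag = 20000;
        // The leading dot makes the type reference fully qualified.
        out << "  optional ." << s.full_name << " " << s.field_name
            << " = " << tag++ << ";\n";
    }
    out << "}\n";
    return out.str();
}
```

Field numbers are assigned in list order here, so the order of the sub-messages would need to stay stable between sessions for previously saved state to remain readable.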

If anyone has any suggestions on how to make this process better or how to do it different, please say so.

I will be leaving this question unanswered up until the bounty is about to expire just in case someone else has a better solution.

Do you have an example of how you implemented this? – Dave Nov 19 '13 at 12:58

score 1 · Answer 3 · answered Aug 22 '12 at 14:16

What about serializing all the messages into strings, and making the master message a sequence of (byte) strings, a la

import "google/protobuf/descriptor.proto";

message MessageSet
{
  required google.protobuf.FileDescriptorSet proto_files = 1;
  repeated bytes serialized_sub_message = 2;
}


Managu


@g19fanatic: If this isn't along the lines you're looking for, could you clarify what you're trying to achieve that this doesn't do? – Managu Aug 22 '12 at 14:19
This is exactly the way that we are currently doing it, but the problem lies in parsing that set of messages back out quickly. Because there are no inherent delimiters between messages, we store the size of each message in an uint32 prefix. We use this prefix to individually parse out each message. What we're trying to do is have just one message that we then serialize. When its time to parse it out, its just one call instead of the repetitive getnextsize, parse next message, repeat. – g19fanatic Aug 22 '12 at 14:28
Right, so hand the encoding part off to protocol buffers, with `serialized_sub_message` being `repeated`. So it's still a loop (`dispatch_serialized_message(message_set.serialized_sub_message(i))`), but you don't have to deal with the details of encoding to the wire. – Managu Aug 22 '12 at 14:46
I haven't thought about this for our current method and this will handle the problem of needing the details, thank you. But I will still have to parse each message instead of just parsing one. – g19fanatic Aug 22 '12 at 14:52
This is a slow solution; you have to deserialize twice. See how protobuf polymorphism is achieved: http://www.indelible.org/ink/protobuf-polymorphism/ – omikron Apr 03 '17 at 11:39
