6

I have a lot of protobuf messages, and I currently use a hand-written lookup function to create a message from its name. As the number of messages keeps growing with the project, I'm getting tired of maintaining this lookup code by hand.

So, is there a way to automate this process? Maybe with a protoc plugin that adds some code to the protobuf code so that it may register itself?

Marko

3 Answers

16

The C++ Protobuf library already maintains a pool of "descriptors" for all types compiled into the binary.

https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor#DescriptorPool.generated_pool.details

So, you could do:

const google::protobuf::Descriptor* desc =
    google::protobuf::DescriptorPool::generated_pool()
        ->FindMessageTypeByName("mypkg.MyType");
assert(desc != NULL);

The library also maintains an object that can be used to construct instances of any compiled-in type, given its descriptor:

https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.message#MessageFactory.generated_factory.details

So you'd do:

google::protobuf::Message* message =
    google::protobuf::MessageFactory::generated_factory()
        ->GetPrototype(desc)->New();
pestophagous
Kenton Varda
  • As mentioned below, Protobuf reflection only works once the descriptors have been loaded, which typically happens on first message creation. So for `FindMessageTypeByName` to work, one has to instantiate at least one message per namespace (protobuf package), like this: `my_package::messages::SuccessMessage::default_instance()`. Lookup then works for the `my_package::messages` namespace. – ph4r05 Jul 29 '18 at 18:31
  • 2
    @ph4r05 That's not true. All descriptors are indexed at startup time, in global constructors. I wrote the C++ protobuf implementation. – Kenton Varda Jul 31 '18 at 17:21
  • Interesting... I have code which clearly does not register any descriptors, so `FindMessageTypeByName` fails for all messages from a given protobuf `package` / namespace. I try to resolve the message at startup, so no message has been explicitly instantiated before. When exactly during the program lifecycle are descriptors registered? I call `FindMessageTypeByName` from a different library than the one containing the protobuf messages, so maybe the constructors are not called in that case? Or are they not visible from the calling library? I had a similar problem before... – ph4r05 Jul 31 '18 at 18:46
  • Could it be related to: https://stackoverflow.com/a/1271692/1378053 ? I.e., initialization can be deferred until first use of any function / object from the translation unit? – ph4r05 Jul 31 '18 at 18:52
  • @ph4r05 The standard allows global constructors to be deferred to the first entry of the translation unit, but in practice I believe on all common platforms they are invoked at executable load time. However, maybe it's possible that the executable itself -- if it is a shared library -- is lazily-loaded? I haven't experienced such behavior on Linux, but I could certainly imagine it differs from OS to OS or even based on linker flags. – Kenton Varda Aug 02 '18 at 08:49
  • @ph4r05 Oh, also, when you say you resolve "on startup" -- if this means during your own global constructors (before main), then certainly at that point the proto's global constructors may not have executed yet, but you can force them to execute by calling `default_instance()` or similar. Another common problem is when people manage to link two different copies of libprotobuf into their binary -- the two copies will have independent descriptor pools. – Kenton Varda Aug 02 '18 at 08:52
  • I am super, super glad I found this, because I was trying to find a way to avoid a giant if/elseif block in my code in order to handle my many message types, without reinventing the wheel. I thought surely the protobuf stuff already has something I can use, and here it is. FWIW, I also have to instantiate one of my classes derived from Message to get it to work (I am on CentOS 7). Also, in your code example above, shouldn't desc be const? – GriffithLea Jul 13 '21 at 22:29
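
The initialization-order point made in the comments above can be illustrated without protobuf. The following is a minimal sketch, assuming a hypothetical `Registry()` in place of the generated descriptor pool and a hypothetical `RegisterMyType()` in place of the generated registration code; none of these names come from the protobuf library. It only shows why a lookup made from another global constructor can run too early, and why forcing registration first (the role `default_instance()` plays in the comment) avoids that.

```cpp
#include <set>
#include <string>

// Hypothetical registry standing in for the generated descriptor pool.
// A function-local static is constructed on first use, so it is safe to
// call from any global constructor.
std::set<std::string>& Registry() {
  static std::set<std::string> names;
  return names;
}

// Hypothetical stand-in for the generated registration step. It is
// idempotent, so it may be called explicitly as well as from a global
// constructor.
void RegisterMyType() { Registry().insert("mypkg.MyType"); }

bool IsRegistered(const std::string& name) {
  return Registry().count(name) != 0;
}

// Forces registration before looking up, analogous to calling
// default_instance() before FindMessageTypeByName.
bool LookupAfterForcing(const std::string& name) {
  RegisterMyType();
  return IsRegistered(name);
}

// Within one translation unit, these globals are initialized in the order
// they are defined. This lookup runs before anything has been registered.
const bool found_too_early = IsRegistered("mypkg.MyType");

// This lookup also runs before the registrar below, but forces
// registration first, so it succeeds.
const bool found_after_forcing = LookupAfterForcing("mypkg.MyType");

// Mirrors the idea of a static initializer object: its constructor runs
// during dynamic initialization, after the two lookups above.
struct StaticRegistrar {
  StaticRegistrar() { RegisterMyType(); }
} static_registrar;
```

Across translation units the relative order is not specified, which is why a lookup from one's own global constructor cannot rely on the registration having happened already.
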
2

Unfortunately, this approach cannot be used as a generic way to create an instance of any message. A message type's descriptor appears in generated_pool() only after a message of that type has been instantiated at least once (e.g. by MyMessageType* msg = new MyMessageType()), so FindMessageTypeByName never finds the type of a message that has not been instantiated yet.

kk.
  • 2
    This isn't true. All compiled-in descriptors are indexed at startup time from global constructors. I wrote this code. Have you disabled global constructors somehow, or made them lazy? – Kenton Varda Jul 31 '18 at 17:23
0

I wanted to add a comment to one of the answers above, but I don't have enough reputation, so I am posting this as an answer. Apologies for that.

I am using Protocol Buffers 3.6.1, and I noticed some code in the generated .pb.cc files that is perhaps what Kenton was pointing to.

namespace protobuf_foo_5fcp_5fplayer_5fcommon_5fevent_5ftypes_2eproto {
void InitDefaults() {
}
//...
//...

// Force AddDescriptors() to be called at dynamic initialization time.
struct StaticDescriptorInitializer {
  StaticDescriptorInitializer() {
    AddDescriptors();
  }
} static_descriptor_initializer;
}  // namespace protobuf_foo_5fcp_5fplayer_5fcommon_5fevent_5ftypes_2eproto

It seems that the constructor of the global variable static_descriptor_initializer never runs. I found this out by modifying the code as follows and verifying that the message written to cout was never printed.

//...
//...

#include <iostream>

// Force AddDescriptors() to be called at dynamic initialization time.
struct StaticDescriptorInitializer {
  StaticDescriptorInitializer() {
    AddDescriptors();
    std::cout << "##################> DESCRIPTORS ADDED\n";
  }
} static_descriptor_initializer;

Now I guess I have to find out whether g++ (which I am using) has an option that causes static_descriptor_initializer to be constructed during the application's startup sequence.
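
For reference, the self-registration mechanism that a static initializer object provides can be sketched in isolation. This is a minimal illustration, not the protobuf implementation: `Message`, `Factories()`, `Registrar`, `MyType`, and `CreateByName` are all hypothetical names chosen for this example. A namespace-scope object registers a factory under a type name in its constructor, and a lookup function then creates instances by name.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical base class standing in for a generic message interface.
struct Message {
  virtual ~Message() = default;
  virtual std::string TypeName() const = 0;
};

using Factory = std::function<std::unique_ptr<Message>()>;

// Function-local static: constructed on first use, so registrars in any
// translation unit can insert into it safely.
std::map<std::string, Factory>& Factories() {
  static std::map<std::string, Factory> factories;
  return factories;
}

// Constructing a Registrar at namespace scope registers T during dynamic
// initialization, provided the object file defining it is linked in.
template <typename T>
struct Registrar {
  explicit Registrar(const std::string& name) {
    Factories()[name] = [] { return std::unique_ptr<Message>(new T()); };
  }
};

// Hypothetical message type used only for this illustration.
struct MyType : Message {
  std::string TypeName() const override { return "mypkg.MyType"; }
};
static Registrar<MyType> my_type_registrar("mypkg.MyType");

// Returns a new instance, or nullptr if no type was registered under name.
std::unique_ptr<Message> CreateByName(const std::string& name) {
  auto it = Factories().find(name);
  if (it == Factories().end()) return nullptr;
  return it->second();
}
```

If a registrar like this lives in an object file inside a static library and nothing else references that file, the linker may leave the file out, in which case the constructor never runs. Whether that is the cause in the case described here is an assumption, not something confirmed above; linking with GNU ld's `--whole-archive` option is one way to test it.
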

ksridhar