
The current design

I am refactoring some existing API code that returns a feed of events for a user. The API is a normal RESTful API, and the current implementation simply queries a DB and returns a feed.

The code is long and cumbersome, so I've decided to move the feed generation to a microservice that will be called from the API server.

The new design

For the sake of decoupling, I thought the data could move back and forth between the API server and the microservice as Protobuf objects. This way, I can change the programming language on either end and still enjoy the type safety and slim size of Protobuf.


The problem

The feed contains multiple types (e.g. likes, images and voice messages). In the future, new types can be added. They all share a few properties (timestamp and title, for instance), but other than that they might be completely different.

In classic OOP, the solution is simple - a base FeedItem class from which all feed items inherit, and a Feed class which contains a sequence of FeedItem classes.

How do I express the notion of Polymorphism in Protocol Buffers 3, or at least enable different types of messages in a list?

What have I checked

  • Oneof: "A oneof cannot be repeated".
  • Any: Too broad (like Java's List<Object>).
Adam Matan
  • You can repeat a oneof by putting it inside a repeated submessage. – jpa Nov 14 '16 at 12:47
  • There was a thread about this on the protobuf mailing list recently: https://groups.google.com/d/msg/protobuf/ojpYHqx2l04/bfyAhqBxAQAJ I think this is a common question and the usual solution is that you should take the common data and put that into a message that the different types can all just include as a submessage. – Adam Cozzette Nov 14 '16 at 18:09
  • @AdamCozzette Great, that's what I was looking for. It seems that we can't do any better than that. Care to re-write the gist of the thread as an answer (which I'd love to accept), or do you want me to do it? – Adam Matan Nov 15 '16 at 11:20
  • I'm a little busy today so if you could do it that would be great! – Adam Cozzette Nov 17 '16 at 16:36
  • Especially the handling is interesting to me. How to avoid switch-cases without inheritance and without being able to "peek" into the message upfront? –  Feb 02 '17 at 18:58

1 Answer


The answer for serialization protocols is to use discriminator-based polymorphism. Traditional object-oriented inheritance is a form of that, with some very bad characteristics. In newer protocols like OpenAPI the concept is a bit cleaner.

Let me explain how this works with proto3.

First you need to declare your polymorphic types. Suppose we go for the classic animal species problem, where different species have different properties. We first need to define a root type for all animals that identifies the species. Then we declare Cat and Dog messages that "extend" the base type. Note that the discriminator field species is repeated, with the same field number, in all three:

 message BaseAnimal {
   string species = 1;
 }

 message Cat {
   string species = 1;
   string coloring = 10;
 }

 message Dog {
   string species = 1;
   int64 weight = 10;
 }

Here is a simple Java test to demonstrate how things work in practice:

    ByteArrayOutputStream os = new ByteArrayOutputStream(1024);

    // Create a cat we want to persist or send over the wire
    Cat cat = Cat.newBuilder().setSpecies("CAT").setColoring("spotted")
            .build();

    // Since our transport or database works for animals we need to "cast"
    // or rather convert the cat to BaseAnimal
    cat.writeTo(os);
    byte[] catSerialized = os.toByteArray();
    BaseAnimal forWire = BaseAnimal.parseFrom(catSerialized);
    // Let's assert before we serialize that the species of the cat is
    // preserved
    assertEquals("CAT", forWire.getSpecies());

    // Here is the BaseAnimal serialization code we can share for all
    // animals
    os = new ByteArrayOutputStream(1024);
    forWire.writeTo(os);
    byte[] wireData = os.toByteArray();

    // Here we read back the animal from the wire data
    BaseAnimal fromWire = BaseAnimal.parseFrom(wireData);
    // If the animal is a cat then we need to read it again as a cat and
    // process the cat going forward
    assertEquals("CAT", fromWire.getSpecies());
    Cat deserializedCat = Cat.parseFrom(wireData);

    // Check that our cat has come intact out of the serialization
    // infrastructure
    assertEquals("CAT", deserializedCat.getSpecies());
    assertEquals("spotted", deserializedCat.getColoring());

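The whole trick is that proto3 bindings preserve fields they do not understand and write them back out when the message is serialized again. In this way one can implement a proto3 "cast" (conversion) that changes the type of an object without losing data. Note that this depends on unknown-field preservation, which proto3 only provides from release 3.5 onward; earlier proto3 releases dropped unknown fields on parse.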
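To make the mechanism concrete, here is a minimal sketch of why unknown fields can survive a round trip. It is hand-rolled and does not use the protobuf library: the class UnknownFieldSketch and its nested BaseAnimalView are hypothetical stand-ins for the generated Cat and BaseAnimal classes, and it assumes every tag and length fits in a single byte (values below 128).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hand-rolled sketch of the protobuf wire format; NOT the protobuf library.
// Assumption: only length-delimited fields, with tag and length below 128.
class UnknownFieldSketch {

    // Writes one string field as: tag byte, length byte, UTF-8 payload.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String value) {
        byte[] payload = value.getBytes(StandardCharsets.UTF_8);
        out.write((fieldNumber << 3) | 2); // wire type 2 = length-delimited
        out.write(payload.length);
        out.write(payload, 0, payload.length);
    }

    // Stand-in for Cat serialization: species = 1, coloring = 10.
    static byte[] encodeCat(String species, String coloring) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, species);
        writeString(out, 10, coloring);
        return out.toByteArray();
    }

    // Stand-in for BaseAnimal: understands field 1, keeps the rest as raw bytes.
    static final class BaseAnimalView {
        String species = "";
        final ByteArrayOutputStream unknown = new ByteArrayOutputStream();

        static BaseAnimalView parse(byte[] data) {
            BaseAnimalView view = new BaseAnimalView();
            int pos = 0;
            while (pos < data.length) {
                int tag = data[pos];
                int length = data[pos + 1];
                int end = pos + 2 + length;
                if ((tag >> 3) == 1) {
                    view.species = new String(data, pos + 2, length, StandardCharsets.UTF_8);
                } else {
                    // Unknown field: copy tag, length and payload untouched.
                    view.unknown.write(data, pos, end - pos);
                }
                pos = end;
            }
            return view;
        }

        // Known field first, then the preserved unknown bytes.
        byte[] serialize() {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            if (!species.isEmpty()) {
                writeString(out, 1, species);
            }
            byte[] rest = unknown.toByteArray();
            out.write(rest, 0, rest.length);
            return out.toByteArray();
        }
    }

    public static void main(String[] args) {
        byte[] cat = encodeCat("CAT", "spotted");
        BaseAnimalView view = BaseAnimalView.parse(cat);
        System.out.println("species seen by base type: " + view.species);
        System.out.println("round trip identical: " + Arrays.equals(cat, view.serialize()));
    }
}
```

The base-type reader never interprets field 10; it only carries the bytes along, which is why the coloring is still there when the data is later parsed as a Cat.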

Note that the "proto3 cast" is a very unsafe operation and should only be applied after the discriminator has been checked properly. In my example you can cast a cat to a dog without any error: Dog.parseFrom does not throw, so the test below fails.

    try {
        Dog d = Dog.parseFrom(wireData);
        fail();
    } catch(Exception e) {
        // All is fine cat cannot be cast to dog
    }
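One way to enforce the discriminator check is to route every "cast" through a registry keyed by species, so that wire data is only parsed as a type that has been registered for it. The sketch below is an assumption about how such a check could look, not part of the test above: SpeciesDispatchSketch and its methods are hypothetical names, and the parser in main is a stand-in for a wrapper around a generated parseFrom method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry that maps a discriminator value to a parser.
class SpeciesDispatchSketch {
    private final Map<String, Function<byte[], Object>> parsers = new HashMap<>();

    // Registers the parser to use for one species value.
    void register(String species, Function<byte[], Object> parser) {
        parsers.put(species, parser);
    }

    // Parses wireData with the parser registered for the given species.
    // Rejects species nobody registered instead of guessing a type.
    Object parse(String species, byte[] wireData) {
        Function<byte[], Object> parser = parsers.get(species);
        if (parser == null) {
            throw new IllegalArgumentException("Unknown species: " + species);
        }
        return parser.apply(wireData);
    }

    public static void main(String[] args) {
        SpeciesDispatchSketch dispatch = new SpeciesDispatchSketch();
        // Stand-in parser; real code would wrap a generated parseFrom call here.
        dispatch.register("CAT", bytes -> "cat parsed from " + bytes.length + " bytes");

        System.out.println(dispatch.parse("CAT", new byte[14]));
        try {
            dispatch.parse("DOG", new byte[14]);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In real code the species string would come from BaseAnimal.parseFrom(wireData).getSpecies(). The generated parseFrom(byte[]) methods throw the checked InvalidProtocolBufferException, so they need a small wrapper before they fit a Function.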

When the property types at the same index match, semantic errors are possible, because the data parses cleanly into the wrong field. In my example, index 10 is an int64 in Dog and a string in Cat, so proto3 treats them as different fields because their wire types differ. In cases where the type is a string in one message and a structure (nested message) in the other, proto3 may throw exceptions or produce complete garbage.
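The wire-type difference can be checked by computing the field tags by hand. In the protobuf encoding every field starts with a tag of (field number << 3) | wire type, where wire type 0 is a varint (int64) and wire type 2 is length-delimited (string, bytes and nested messages). The class TagSketch below is a hypothetical helper for illustration only, not a protobuf API.

```java
// Computes protobuf field tags by hand; illustration only.
class TagSketch {
    static final int WIRE_VARINT = 0;           // int32, int64, bool, enum, ...
    static final int WIRE_LENGTH_DELIMITED = 2; // string, bytes, nested messages

    // Tag as it appears on the wire for small field numbers.
    static int tag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    public static void main(String[] args) {
        // Cat.coloring: field 10, string
        System.out.printf("Cat.coloring tag: 0x%02X%n", tag(10, WIRE_LENGTH_DELIMITED)); // 0x52
        // Dog.weight: field 10, int64
        System.out.printf("Dog.weight tag:   0x%02X%n", tag(10, WIRE_VARINT)); // 0x50
    }
}
```

Because the two tags differ, a Dog parser sees the cat's coloring as an unknown field rather than as a weight. A string and a nested message at the same index would both get wire type 2 and hence the same tag, which is the case where the parser can no longer tell them apart.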

Kiril
  • Traditional inheritance is bad for many reasons, and this is why proto3 and newer serialization protocols after XML do not support it. Think of a simple versioning problem where the serializing side is upgraded to use a new, more concrete exception type, e.g. `FileNotFound`, which extends `IOError`. The reader of the message may know how to handle `IOError`, but finding only `FileNotFound` on the wire does not tell the reader that this is a descendant of `IOError`. Hence the reader will process it as a generic unknown exception, possibly causing a bug in the system. There are a few similar issues. – Kiril Mar 04 '19 at 10:51