220

I need to specify a message with an optional field in protobuf (proto3 syntax). In terms of proto 2 syntax, the message I want to express is something like:

message Foo {
    required int32 bar = 1;
    optional int32 baz = 2;
}

From my understanding, the "optional" concept has been removed from proto3 syntax (along with the "required" concept). The alternative is not clear, though: using the default value to signal that the sender did not specify a field leaves an ambiguity if the default value belongs to the domain of valid values (consider, for example, a boolean type).

So, how am I supposed to encode the message above? Thank you.

MaxP
  • Is the approach below a sound solution? message NoBaz { } message Foo { int32 bar = 1; oneof baz { NoBaz undefined = 2; int32 defined = 3; }; } – MaxP Mar 06 '17 at 09:57
  • 2
    There's [a Proto 2 version of this question](https://stackoverflow.com/questions/9184215/whats-the-preferred-way-to-encode-a-nullable-field-in-protobuf-2), if others find this but are using Proto 2. – chwarr Feb 14 '19 at 01:13
  • 3
    proto3 basically makes all fields optional. However, for scalars, they made it impossible to distinguish between "field not set" and "field set but to default value." If you wrap your scalar in a singleton oneof e.g. - message blah { oneof v1 { int32 foo = 1; } }, then you can check again whether or not foo was actually set or not. For Python at least, you can operate directly on foo as if it wasn't inside a oneof and you can ask HasField("foo"). – jschultz410 Feb 21 '20 at 15:38
  • 2
    @MaxP Maybe you could change the accepted answer to https://stackoverflow.com/a/62566052/66465 since a newer version of protobuf 3 now has `optional` – SebastianK Oct 03 '20 at 10:17

9 Answers

240

Since protobuf release 3.15, proto3 supports using the optional keyword (just as in proto2) to give a scalar field presence information.

syntax = "proto3";

message Foo {
    int32 bar = 1;
    optional int32 baz = 2;
}

A has_baz()/hasBaz() method is generated for the optional field above, just as it was in proto2.

Under the hood, protoc effectively treats an optional field as if it were declared using a oneof wrapper, as CyberSnoopy’s answer suggested:

message Foo {
    int32 bar = 1;
    oneof optional_baz {
        int32 baz = 2;
    }
}

If you’ve already used that approach, you can now simplify your message declarations (switch from oneof to optional) and code, since the wire format is the same.

The nitty-gritty details about field presence and optional in proto3 can be found in the Application note: Field presence doc.

Historical note: Experimental support for optional in proto3 was first announced on Apr 23, 2020 in this comment. Using it required passing protoc the --experimental_allow_proto3_optional flag in releases 3.12-3.14.

jaredjacobs
182

In proto3, all fields are "optional" (in that it is not an error if the sender fails to set them). But, fields are no longer "nullable", in that there's no way to tell the difference between a field being explicitly set to its default value vs. not having been set at all.
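
To make the ambiguity concrete, here is a minimal pure-Python sketch of what ends up on the wire for the `baz` field (field number 2). It does not use the protobuf library; the helper names (`encode_varint`, `serialize_implicit`, `serialize_explicit`) are hypothetical, and it only handles non-negative values. It assumes the standard proto3 behavior that a plain scalar equal to its default is not written, while a field with presence tracking is written whenever it is set.

```python
from typing import Optional


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_int32_field(field_number: int, value: int) -> bytes:
    """Encode tag + value for a non-negative int32 (wire type 0 = varint)."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)


def serialize_implicit(baz: int) -> bytes:
    """Plain proto3 scalar: a value of 0 is never written."""
    return b"" if baz == 0 else encode_int32_field(2, baz)


def serialize_explicit(baz: Optional[int]) -> bytes:
    """Field with presence tracking: written whenever it is set, even to 0."""
    return b"" if baz is None else encode_int32_field(2, baz)
```

With the plain scalar, "set to 0" and "never set" both produce zero bytes, so the receiver cannot tell them apart. With presence tracking, a set zero still produces bytes, and only an unset field produces none.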

If you need a "null" state (and there is no out-of-range value that you can use for this) then you will instead need to encode this as a separate field. For instance, you could do:

message Foo {
  bool has_baz = 1;  // always set this to "true" when using baz
  int32 baz = 2;
}

Alternatively, you could use oneof:

message Foo {
  oneof baz {
    bool baz_null = 1;  // always set this to "true" when null
    int32 baz_value = 2;
  }
}

The oneof version is more explicit and more efficient on the wire but requires understanding how oneof values work.
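
As a rough illustration of the wire cost, the following pure-Python sketch hand-encodes both layouts for small values. It is not the protobuf library: the function names are hypothetical, and the encoder only handles field numbers up to 15 and values up to 127 so that each tag and value fits in one byte. It assumes that proto3 skips plain scalars equal to their default and always writes a oneof member that is set.

```python
from typing import Optional


def varint_field(field_number: int, value: int) -> bytes:
    """Encode a varint field whose tag and value each fit in one byte."""
    assert 1 <= field_number <= 15 and 0 <= value <= 127
    return bytes([(field_number << 3) | 0, value])  # wire type 0 = varint


def encode_flag_style(baz: Optional[int]) -> bytes:
    """Layout: message Foo { bool has_baz = 1; int32 baz = 2; }"""
    if baz is None:
        return b""  # has_baz=false and baz=0 are defaults, so nothing is written
    out = varint_field(1, 1)  # has_baz = true
    if baz != 0:
        out += varint_field(2, baz)  # a zero value is the default and is skipped
    return out


def encode_oneof_style(baz: Optional[int]) -> bytes:
    """Layout: message Foo { oneof baz { bool baz_null = 1; int32 baz_value = 2; } }"""
    if baz is None:
        return varint_field(1, 1)  # baz_null = true
    return varint_field(2, baz)  # a set oneof member is written even when zero
```

For a non-zero value such as 5, the flag layout takes four bytes and the oneof layout takes two. The explicit null in the oneof layout also takes two bytes.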

Finally, another perfectly reasonable option is to stick with proto2. Proto2 is not deprecated, and in fact many projects (including inside Google) very much depend on proto2 features which are removed in proto3, hence they will likely never switch. So, it's safe to keep using it for the foreseeable future.

Kenton Varda
  • Similar to your solution, in my comment, I proposed to use the oneof with the real value and a null type (an empty message). This way you don't bother with the boolean value (which should be not relevant, because if there is the boolean, then there is no baz_value) Correct? – MaxP Mar 07 '17 at 12:37
  • 3
    @MaxP Your solution works but I would recommend a boolean over an empty message. Either will take two bytes on the wire but the empty message will take considerably more CPU, RAM, and generated code bloat to handle. – Kenton Varda Mar 07 '17 at 19:30
  • 18
    I find out message Foo { oneof baz { int32 baz_value = 1; } } works pretty well. – Terry Shi Aug 04 '17 at 15:08
  • @CyberSnoopy Can you post it as an answer? Your solution works perfect and elegant. – Cheng Chen Oct 15 '17 at 10:35
  • @CyberSnoopy Have you by chance ran into any issues when sending response message that is structured something like: message FooList { repeated Foo foos = 1; } ? Your solution is great but I'm having trouble sending FooList as server response now. – CaffeinateOften Mar 04 '18 at 00:19
  • This answer gives the impression that it is not possible to differentiate missing/default __message__ fields. This is only true with [scalar](https://developers.google.com/protocol-buffers/docs/proto3#scalar) fields. – George Leung Jul 16 '19 at 09:19
  • The bool has_baz approach is pretty terrible. It adds an extra field that has to be serialized and manually, rather than automatically, maintained, which is highly error prone. The oneof approach is better, but then there is no need for baz_null. Just have a singleton oneof containing the scalar you are after. Most generated code (e.g. - Python) will allow you to directly operate on the scalar as if it wasn't inside a oneof, but it will also allow you to explicitly check through HasField for presence / absence if you want. – jschultz410 Feb 21 '20 at 15:21
  • After further consideration, the oneof with an extra explicit field to indicate null can make good sense if you don't want to interpret field absence as null, but rather want the ability to explicitly set something to null or ignore it if it isn't specified. In this case, you probably want to use google.protobuf.NullValue rather than a bool. That type is an enum that can only have one value (NULL_VALUE == 0), so you don't have the confusion of does true/false mean anything. Additionally, that type is a well known type and different languages may make it friendlier to use. – jschultz410 Feb 28 '20 at 16:55
  • Just use `optional int32 something = 1;`. Most answers here are obsolete. It just works in protofile3. – lukyer Jan 07 '22 at 09:49
  • See https://stackoverflow.com/a/70619416/1977799 – lukyer Jan 07 '22 at 13:32
150

One way is to use optional, as described in the accepted answer: https://stackoverflow.com/a/62566052/1803821

Another one is to use wrapper objects. You don't need to write them yourself as google already provides them:

At the top of your .proto file add this import:

import "google/protobuf/wrappers.proto";

Now you can use special wrappers for every simple type:

DoubleValue
FloatValue
Int64Value
UInt64Value
Int32Value
UInt32Value
BoolValue
StringValue
BytesValue

So to answer the original question a usage of such a wrapper could be like this:

message Foo {
    int32 bar = 1;
    google.protobuf.Int32Value baz = 2;
}

Now for example in Java I can do stuff like:

if(foo.hasBaz()) { ... }

VM4
  • 4
    How does this work? When `baz=null` and when `baz` is not passed, both cases `hasBaz()` says `false`! – mayankcpdixit Dec 03 '18 at 12:43
  • 1
    The idea is simple: you use wrapper objects or in other words user defined types. These wrapper objects are allowed to be missing. The Java example I provided worked well for me when working with gRPC. – VM4 Jan 22 '19 at 16:10
  • Yeah! I understand the general idea, But I wanted to see it in action. What I don't understand is: (even in wrapper object) "*How to identify missing and null wrapper values?*" – mayankcpdixit Jan 24 '19 at 06:23
  • Because Int32Value will set the value to default instance if your value in proto message is null. You can try: this will be TRUE if your value is null: - if(request.getSomeInt32Value() == Int32Value.getDefaultInstance()) – Mário Kapusta May 15 '19 at 23:29
  • 6
    This is the way to go. With C#, the generated code produces Nullable properties. – Aaron Hudon Jun 18 '19 at 16:25
  • 6
    Better than the original answer! – Dev Aggarwal Nov 06 '19 at 15:48
  • The oneof in the accepted answer isn't great. There's no need for an additional baz_null field in the oneof. The WKT wrappers are ugly because now you can't directly operate on the scalars. Instead, you have to jump through a submessage. It also requires more serialization. Also, for some bizarre reason, there aren't WKT wrapper types for all the basic scalar types!!! The singleton oneof approach gets around most of these problems for most languages. For Python, you can operate directly on the scalar as if it wasn't in a oneof and you can ask HasField(). The .proto's are uglier though. – jschultz410 Feb 21 '20 at 15:26
  • In the current javadoc for `DoubleValue.Builder`, I don't see a setter that can take a Java `Double` and handle it being null or having a value. Is this an oversight in Google's design in your opinion? It would seem a strong advantage of wrappers would be consuming a target language's primitive wrappers for setting values. I'd rather not have to handle figuring out if something is null before setting a field. – Ungeheuer Sep 09 '20 at 04:38
  • 1
    Why does it add a `has` function for the wrapper and not for the intrinsic type? – madhat1 Jan 14 '21 at 08:01
  • @madhat1 Because `has` functions are only provided for message fields. Using this technique, scalars where you want to be able to tell the difference between "doesn't have one" and "has one and it's the default value" are wrapped in messages. That way, you can first call `hasX` to find out whether the field is set. Sometimes that has meaning to your app. And only after that function returns `true`, you can call `getX` to get the message field and then `getValue` on that wrapper to get the scalar's value. – Matt Welke Jun 28 '21 at 03:35
  • using wrappers looks weird for this purpose. very unobvious – anatol Mar 29 '23 at 01:11
44

Based on Kenton's answer, a simpler yet working solution looks like:

message Foo {
    oneof optional_baz { // the "optional_" prefix is only a naming hint, not the proto2 "optional" keyword
        int32 baz = 1;
    }
}
Terry Shi
  • how does this embody the optional-character? – JFFIGK Aug 10 '18 at 12:27
  • 27
    Basically, oneof is poorly named. It means "at most one of". There's always a possible null value. – ecl3ctic Sep 28 '18 at 06:05
  • If left unset the value case will be `None` (in C#) - see the enum-type for the language of your choice. – nitzel Jan 11 '19 at 16:21
  • 1
    Yes, this is probably the best way to skin this cat in proto3 -- even if it does make the .proto a bit ugly. – jschultz410 Feb 21 '20 at 15:33
  • However, it does somewhat imply that you may interpret the absence of a field as explicitly setting it to the null value. In other words, there is some ambiguity between 'optional field not specified' and 'field wasn't specified intentionally to mean it is null'. If you care about that level of precision, then you can add an additional google.protobuf.NullValue field to the oneof that allows you to distinguish between 'field not specified', 'field specified as value X' and 'field specified as null'. It's kind of fugly, but that's because proto3 doesn't support null directly like JSON does. – jschultz410 Feb 28 '20 at 17:02
8

To expand on @cybersnoopy's suggestion here:

If you have a .proto file with a message like this:

message Request {
    oneof option {
        int64 option_value = 1;
    }
}

The generated Java code provides a case enum for the oneof, so you can write:

Request.OptionCase optionCase = request.getOptionCase();

if (optionCase == Request.OptionCase.OPTION_NOT_SET) {
    // value not set
} else {
    // value set
}
Benjamin Slabbert
  • In Python it's even simpler. You can just do request.HasField("option_value"). Also, if you have a bunch of singleton oneof's like this inside your message then you can access their contained scalars directly just like a normal scalar. – jschultz410 Feb 20 '20 at 21:31
7

Just use:

syntax = "proto3";

message Hello {
    int64 required_id = 1;
    optional int64 optional_id = 2;
}

In Go, this generates a struct like:

type Hello struct {
   ...
   RequiredId int64 ...
   OptionalId *int64 ...
   ...
}

You can easily check for nil and distinguish between default value (zero) and unset value (nil).

Most answers here are obsolete and unnecessarily complicated.

lukyer
  • Thanks. Happily, the top answer also got updated in February: https://stackoverflow.com/a/62566052/388236 – Hugo Sep 08 '22 at 10:40
1

Another way to encode the message you intend is to add another field to track "set" fields:

syntax="proto3";

package qtprotobuf.examples;

message SparseMessage {
    repeated uint32 fieldsUsed = 1;
    bool   attendedParty = 2;
    uint32 numberOfKids  = 3;
    string nickName      = 4;
}

message ExplicitMessage {
    enum PARTY_STATUS {ATTENDED=0; DIDNT_ATTEND=1; DIDNT_ASK=2;};
    PARTY_STATUS attendedParty = 1;
    bool   indicatedKids = 2;
    uint32 numberOfKids  = 3;
    enum NO_NICK_STATUS {HAS_NO_NICKNAME=0; WOULD_NOT_ADMIT_TO_HAVING_HAD_NICKNAME=1;};
    NO_NICK_STATUS noNickStatus = 4;
    string nickName      = 5;
}

This is especially appropriate if there is a large number of fields and only a small number of them have been assigned.

In Python, usage would look like this:

import field_enum_example_pb2

m = field_enum_example_pb2.SparseMessage()
m.attendedParty = True
# Record the field number so the receiver knows attendedParty was set explicitly.
m.fieldsUsed.append(field_enum_example_pb2.SparseMessage.ATTENDEDPARTY_FIELD_NUMBER)
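
Because the list has to be maintained by hand, it may help to route every write through one helper. The sketch below shows that idea in plain Python. The `SparseMessage` dataclass is a hypothetical stand-in for the generated class, and `set_field` / `was_set` are illustrative names, not part of any protobuf API. The field numbers are taken from the message definition above.

```python
from dataclasses import dataclass, field
from typing import Any, List

# Field numbers from the SparseMessage definition.
ATTENDED_PARTY = 2
NUMBER_OF_KIDS = 3
NICK_NAME = 4


@dataclass
class SparseMessage:
    """Hypothetical stand-in for the generated message class."""
    fieldsUsed: List[int] = field(default_factory=list)
    attendedParty: bool = False
    numberOfKids: int = 0
    nickName: str = ""


def set_field(msg: SparseMessage, name: str, number: int, value: Any) -> None:
    """Assign a value and record its field number in one step."""
    setattr(msg, name, value)
    if number not in msg.fieldsUsed:
        msg.fieldsUsed.append(number)


def was_set(msg: SparseMessage, number: int) -> bool:
    """Return True if the sender explicitly assigned this field."""
    return number in msg.fieldsUsed
```

With this, `set_field(m, "numberOfKids", NUMBER_OF_KIDS, 0)` records an explicit zero, which the receiver can distinguish from a field that was never touched.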
user8819
-2

Another way is to keep a bitmask with one bit per optional field: set a field's bit when its value is set, and clear the bit when it is not.

enum bitsV {
    bits_unset = 0;   // proto3 requires the first enum value to be zero
    baz_present = 1;  // 0x01
    baz1_present = 2; // 0x02
}
message Foo {
    int32 bar = 1;
    int32 baz = 2;
    int32 baz1 = 3;
    uint32 bitMask = 4; // bitwise OR of the bitsV values whose fields are set
}

When parsing, check the value of bitMask:

if (bitMask & baz_present) {
    // baz is present
}

if (bitMask & baz1_present) {
    // baz1 is present
}
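
The bookkeeping on both sides can be sketched in plain Python as follows. This is only an illustration of the bit operations: `Bits`, `mark_present`, `mark_absent`, and `is_present` are hypothetical names, and the mask is handled as an ordinary integer rather than through generated protobuf code.

```python
from enum import IntFlag


class Bits(IntFlag):
    """One bit per optional field, mirroring the bitsV enum."""
    BAZ_PRESENT = 0x01
    BAZ1_PRESENT = 0x02


def mark_present(bit_mask: int, flag: Bits) -> int:
    """Sender side: set the bit when the field is assigned."""
    return bit_mask | int(flag)


def mark_absent(bit_mask: int, flag: Bits) -> int:
    """Sender side: clear the bit when the field is left unset."""
    return bit_mask & ~int(flag)


def is_present(bit_mask: int, flag: Bits) -> bool:
    """Receiver side: test the bit before trusting the field's value."""
    return bool(bit_mask & int(flag))
```

The receiver should read `baz` only when `is_present(bit_mask, Bits.BAZ_PRESENT)` is true. Otherwise a zero in `baz` carries no information.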
ChauhanTs
-3

You can tell whether a message-typed field has been initialized by comparing its reference with the default instance:

GRPCContainer container = myGrpcResponseBean.getContainer();
if (container.getDefaultInstanceForType() != container) {
...
}
eduyayo
  • 1
    This is not a good general approach because very often the default value is a perfectly acceptable value for the field and in that situation you can't distinguish between "field absent" and "field present but set to default." – jschultz410 Feb 21 '20 at 15:30