104

I have an object that I deserialize using protobuf in Python. When I print the object it looks like a Python object, but when I try to convert it to JSON I run into all sorts of problems.

For example, if I use json.dumps() I get an error saying that the object (the generated code from protoc) does not have a __dict__ attribute.

If I use jsonpickle I get UnicodeDecodeError: 'utf8' codec can't decode byte 0x9d in position 97: invalid start byte.

Test code below is using jsonpickle with the error shown above.

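import sys

import jsonpickle

if len(sys.argv) < 2:
    print("Error: missing ser file")
    sys.exit()
else:
    fileLocation = sys.argv[1]

# BuildOrgObject is my own helper class (not shown here)
org = BuildOrgObject(fileLocation)
org = org.Deserialize()

#print(org)
jsonObj = jsonpickle.encode(org)  # raises the UnicodeDecodeError shown above
print(jsonObj)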
Seanny123
exHash
  • 2
    This would be way easier to figure out if you showed us the relevant parts of your .proto file and the implementation of BuildOrgObject(). If we can reproduce the behavior you're seeing, it's much easier for us to figure out what's wrong. – Sam Mussmann Nov 01 '13 at 20:59

5 Answers

228

I'd recommend using the protobuf↔JSON converters from Google's protobuf library:

from google.protobuf.json_format import MessageToJson

json_obj = MessageToJson(org)

You can also serialise the protobuf to a dictionary:

from google.protobuf.json_format import MessageToDict
dict_obj = MessageToDict(org)

Refer to the protobuf package API documentation: https://developers.google.com/protocol-buffers/docs/reference/python/ (see module google.protobuf.json_format).
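As a rough, self-contained sketch of the calls above: since the question's `org` type isn't shown, this uses `FieldDescriptorProto`, a message type that ships with the protobuf package, purely as a stand-in (that substitution and the `.example.Org` value are assumptions for illustration). It also shows the `preserving_proto_field_name` option and a round trip back through `Parse`.

```python
from google.protobuf import descriptor_pb2
from google.protobuf.json_format import MessageToDict, MessageToJson, Parse

# Stand-in for `org`: a message bundled with protobuf that has a snake_case field.
msg = descriptor_pb2.FieldDescriptorProto(name="id", type_name=".example.Org")

# Default behaviour: field names become lowerCamelCase keys.
as_dict = MessageToDict(msg)

# Keep the field names exactly as written in the .proto file.
as_dict_snake = MessageToDict(msg, preserving_proto_field_name=True)

# MessageToJson returns a str containing JSON, not a Python object.
json_str = MessageToJson(msg, preserving_proto_field_name=True)

# Parse fills an empty message of the same type from the JSON string.
restored = Parse(json_str, descriptor_pb2.FieldDescriptorProto())
```

With your own generated classes, replace `descriptor_pb2.FieldDescriptorProto` with the message class produced by protoc.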

Boris Verkhovskiy
denis-sumin
  • 1
    This doesn't seem to be available in version `3.2`? – Seanny123 Feb 09 '17 at 07:23
  • 1
    I've done a quick test (clean venv, install protobuf, try to import MessageToJson) and it seems to be available. Python 3.6. – denis-sumin Feb 09 '17 at 21:31
  • google.protobuf.json_format comes to be supported since proto2.7. – Zheng Qsin Jun 17 '17 at 10:23
  • 2.7? I only know of 2.6.1 to be the latest release https://pypi.python.org/pypi/protobuf – binaryguy Jul 03 '17 at 08:00
  • 1
    The bytes field converts to base64 string with this, anyway to go around that? – swateek Mar 19 '18 at 07:05
  • 27
    `MessageToJson(org, preserving_proto_field_name=True)` if you don't want your `field_name` converted into `fieldName` – redacted Apr 09 '18 at 12:35
  • 17
    It doesn't work for me. I get the error: AttributeError: 'Schema' object has no attribute 'DESCRIPTOR' – Zioalex May 16 '19 at 14:11
  • same for me @Alex have u found the solution ? – calvin sugianto Jul 10 '19 at 16:32
  • @Alex did you find a solution? – REVOLUTION Nov 26 '19 at 17:56
  • Nope:-( unfortunately I am not anymore on it. – Zioalex Nov 30 '19 at 08:27
  • 1
    @RobinNemeth, `field_name` *should* be converted to `fieldName` when converting to JSON with Protobuf 3. The spec defines a strict mapping between Protobuf 3 messages and JSON. It's part of the spec! – edam Jan 13 '20 at 11:27
  • 3
    If you are running this on a repeated sub field (instead of on a regular message) then it will fail with missing DESCRIPTOR. Use the main message, or convert to dict each of the elements and combine – Assaf Mendelson May 17 '20 at 12:58
  • I have the same error AttributeError: 'DESCRIPTOR', someone fixed the error? – Arsenio Aguirre Feb 25 '21 at 16:03
  • Have you tried `BuildOrgObject.to_json(org)`? See https://github.com/googleapis/python-vision/issues/64#issuecomment-712617747 – bryant1410 May 30 '21 at 05:40
  • Does anyone know why MessageToDict converts fixed64 fields in protobuf message to strings in the Dict ? Is there a way to override this behavior ? For me this conversion is posing problems in using this API – user2896235 Sep 30 '21 at 16:09
  • Works for me. Make sure you encode the data before sending: `json_obj = MessageToJson(al)`, then `data = json_obj` and `data = data.encode("utf-8")`. – xis10z Nov 12 '21 at 04:21
  • `json_obj` is of type `str` so i think name as `json_str` is less misleading. – Lei Yang Apr 19 '22 at 10:46
  • I got it working with the following approaches: 1) `json_string = type(response).to_json(response)` 2) `import proto` followed by `json_string = proto.Message.to_json(response)` 3) `MessageToJson(response._pb)`. Reference: https://stackoverflow.com/questions/64403737/attribute-error-descriptor-while-trying-to-convert-google-vision-response-to-dic – Andre Wisplinghoff Jun 28 '23 at 11:15
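On the bytes and fixed64 questions in the comments: the proto3 JSON mapping represents bytes fields as base64 strings and 64-bit integers (int64, uint64, fixed64 and friends) as decimal strings, which is why `MessageToDict` returns strings for them. One possible workaround is to post-process the dictionary. The sketch below is only an illustration: the helper `restore_wire_types`, the field names `payload` and `timestamp`, and the sample dictionary are all hypothetical, and it only handles top-level keys.

```python
import base64


def restore_wire_types(as_dict, bytes_fields=(), int64_fields=()):
    """Return a copy of a MessageToDict-style dict with selected fields decoded.

    bytes_fields: keys whose values are base64 strings to turn back into bytes.
    int64_fields: keys whose values are decimal strings to turn back into ints.
    """
    restored = dict(as_dict)
    for name in bytes_fields:
        if name in restored:
            restored[name] = base64.b64decode(restored[name])
    for name in int64_fields:
        if name in restored:
            restored[name] = int(restored[name])
    return restored


# Hypothetical dictionary shaped like MessageToDict output for a message
# with a bytes field `payload` and a fixed64 field `timestamp`.
example = {
    "payload": base64.b64encode(b"\x9d\x00\x01").decode("ascii"),
    "timestamp": "1383339540000",
}
restored = restore_wire_types(
    example, bytes_fields=["payload"], int64_fields=["timestamp"]
)
```

The caller has to say which keys are bytes or 64-bit fields, because the dictionary alone no longer carries that type information.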
17

If you need to go straight to JSON, take a look at the protobuf-to-json library, but you'll have to install that manually.

But I would recommend that you use the protobuf-to-dict library instead for a few reasons:

  1. It is available on PyPI, so you can simply pip install protobuf-to-dict or include it in a requirements.txt
  2. A dict can be converted to JSON and might be more useful than a JSON string
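To illustrate the second point, here is a minimal sketch of turning such a dictionary into JSON with the standard library. The dictionary contents are made up, and the sketch assumes the dictionary may still hold raw bytes values, which `json.dumps` cannot serialize on its own; the hypothetical `_encode_bytes` hook base64-encodes them.

```python
import base64
import json


def _encode_bytes(value):
    """Fallback for json.dumps: base64-encode bytes, reject anything else."""
    if isinstance(value, (bytes, bytearray)):
        return base64.b64encode(value).decode("ascii")
    raise TypeError(
        f"Object of type {type(value).__name__} is not JSON serializable"
    )


# Hypothetical dictionary, standing in for what a protobuf-to-dict
# conversion of `org` might look like.
org_dict = {"name": "Example Org", "member_ids": [1, 2, 3], "logo": b"\x9d\x00\x01"}

json_str = json.dumps(org_dict, default=_encode_bytes, indent=2, sort_keys=True)
```

Keeping the intermediate dict also lets you filter or rename keys before serializing.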
Seanny123
Kevin Hill
  • `protobuf-to-dict` feels more pythonic than `google.protobuf.json_format` – nmurthy Sep 14 '17 at 14:56
  • 15
    There is now a method included with protobuf for converting to a dict instead of JSON. It is called: `google.protobuf.json_format.MessageToDict` – kgreenek Feb 23 '18 at 01:28
  • 2
    Requires implementing MyMessage to get things going. Kind of incomplete since thats a big part of parsing a proto? – Paul Kenjora Mar 15 '18 at 05:14
  • 1
    That's a limitation of protobuf itself. Unlike something like json, a protobuf message cannot describe itself, and a schema must be created in order to either encode or decode a message. Without a schema, it is just a byte blob without enough information to parse. – Kevin Hill May 03 '18 at 15:29
  • I tried using protobuf_to_dict on python 3 and got this error: `File ...protobuf_to_dict.py", line 15, in FieldDescriptor.TYPE_INT64: long,` I assume this means it only works for python 2.7? – Stian May 07 '18 at 08:24
  • 1
    @Stian - try the updated version: pip install protobuf3-to-dict – lawson Feb 12 '19 at 20:14
3

Here's my function to convert a proto3 object to a JSON object (i.e. Python dictionary):

def protobuf_to_dict(proto_obj):
    key_list = proto_obj.DESCRIPTOR.fields_by_name.keys()
    d = {}
    for key in key_list:
        d[key] = getattr(proto_obj, key)
    return d

Since the converters from Google's protobuf library don't seem to work in some cases with the 3.19 version, this function leverages the Descriptor class present on each Protobuf object.

Here, getattr(obj, string_attribute) returns the value given by obj.attribute
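One caveat with the function above is that nested messages and repeated fields come back as protobuf objects rather than plain dicts and lists, so the result may still not be JSON-serializable. A possible recursive variant is sketched below; the names `protobuf_to_dict_recursive` and `_plain` are made up for this example, it relies on duck typing (messages expose `DESCRIPTOR`, map fields expose `items()`), and it uses a message type bundled with protobuf as a stand-in for your own.

```python
from google.protobuf import descriptor_pb2


def _plain(value):
    """Convert a field value into plain Python data."""
    if hasattr(value, "DESCRIPTOR"):  # a nested message
        return protobuf_to_dict_recursive(value)
    if isinstance(value, (str, bytes, bool, int, float)):  # scalars and enums
        return value
    if hasattr(value, "items"):  # a map field
        return {k: _plain(v) for k, v in value.items()}
    return [_plain(v) for v in value]  # a repeated field


def protobuf_to_dict_recursive(proto_obj):
    d = {}
    for field in proto_obj.DESCRIPTOR.fields:
        value = getattr(proto_obj, field.name)
        if hasattr(value, "DESCRIPTOR"):
            # Singular nested message: skip it when unset, which also
            # avoids endless recursion on self-referential message types.
            if proto_obj.HasField(field.name):
                d[field.name] = protobuf_to_dict_recursive(value)
        else:
            d[field.name] = _plain(value)
    return d


# Stand-in message with a repeated nested message field.
msg = descriptor_pb2.DescriptorProto(name="Org")
msg.field.add(name="id", number=1)
result = protobuf_to_dict_recursive(msg)
```

Bytes values are left as bytes here, so they still need encoding (for example base64) before calling `json.dumps`.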

Nirav
2

You can also use SerializeToString.

org.SerializeToString()
Simas Joneliunas
  • 1
    This does not return the JSON representation of the Protobuf message. It returns the Protobuf-encoded serialized messages as bytes (see [the documentation](https://developers.google.com/protocol-buffers/docs/pythontutorial#parsing-and-serialization)). – whatyouhide Jan 12 '22 at 09:37
0

If you are using an older version that doesn't have the preserving_proto_field_name parameter:

import json
import re

from google.protobuf.json_format import MessageToJson


def proto_to_json(proto_obj):
    json_obj = MessageToJson(proto_obj, including_default_value_fields=True)
    # Change lowerCamelCase of Google's JSON conversion back to the snake_case used in the original protobuf
    dict_obj = dict((re.sub(r'(?<!^)(?=[A-Z])', '_', k).lower(), v) for k, v in json.loads(json_obj).items())
    if hasattr(proto_obj, 'uuid'):
        # Python 3 equivalent of the Python 2-only .encode("hex"); assumes uuid is a bytes field
        dict_obj["uuid"] = proto_obj.uuid.hex()
    return json.dumps(dict_obj, indent=4, sort_keys=True)
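Note that the key conversion in the function above only touches top-level keys. A sketch of applying the same regular expression to nested dicts and lists follows; the helper names `camel_to_snake` and `snake_case_keys` and the sample data are made up for this example.

```python
import re

# Same pattern as above: an empty match before every capital letter
# except one at the very start of the string.
_CAMEL_BOUNDARY = re.compile(r'(?<!^)(?=[A-Z])')


def camel_to_snake(name):
    return _CAMEL_BOUNDARY.sub('_', name).lower()


def snake_case_keys(value):
    """Recursively convert dict keys from lowerCamelCase to snake_case."""
    if isinstance(value, dict):
        return {camel_to_snake(k): snake_case_keys(v) for k, v in value.items()}
    if isinstance(value, list):
        return [snake_case_keys(v) for v in value]
    return value


# Hypothetical data shaped like json.loads(MessageToJson(...)).
data = {"orgName": "Example", "memberList": [{"firstName": "Ada"}]}
converted = snake_case_keys(data)
```

Be aware that runs of capitals are split letter by letter (`userID` becomes `user_i_d`), and that the keys of map fields would be rewritten too, so this is only safe when the original field names are plain snake_case and the message has no maps with camelCase keys.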
vr286