3

I am working with protobuf and need to decode messages whose fields and types are unknown. I know protoc --decode_raw does a good job at that (not precise, but good enough).

I was thinking about running protoc --decode_raw in a shell and having Python read its output and parse it into a dictionary, but I consider that a last resort.
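For reference, a minimal sketch of that shell-based approach, assuming protoc is installed and on the PATH. The function name decode_raw_with_protoc is made up for illustration, and it returns protoc's text output as-is rather than a dictionary:

```python
import subprocess


def decode_raw_with_protoc(data, protoc="protoc"):
    """Pipe serialized bytes through `protoc --decode_raw`; return its text output.

    `protoc` is the executable name or path (assumed to be on PATH by default).
    """
    completed = subprocess.run(
        [protoc, "--decode_raw"],
        input=data,            # protoc reads the message from stdin
        capture_output=True,
        check=True,            # raise CalledProcessError if protoc fails to parse
    )
    return completed.stdout.decode("utf-8", errors="replace")
```

Turning that text into a dictionary would still need a separate parsing step, which is the part that makes this feel like a last resort.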

Is there a Pythonic way to implement the same functionality?

toothpick

3 Answers

1

Use blackboxprotobuf

pip install bbpb

It can parse protobuf buffers without .proto file definitions:

import blackboxprotobuf
message, typedef = blackboxprotobuf.protobuf_to_json(data)  # data: raw message bytes

shluvme
0

I had the same need, but as far as I can tell there is no official API for it. There are some internal mechanisms, such as reading msg._unknown_fields on an empty message or using internal.decoder, but they are not part of the official API and vary between versions.

If the shell approach is too hacky, your best bet is to implement the decoding yourself in Python, based on the protobuf wire-format documentation.
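As a rough illustration of what such custom code could look like, here is a minimal sketch of a schema-less decoder for the documented wire types (varint, 64-bit, length-delimited, 32-bit). The names read_varint and decode_raw are hypothetical, not part of any library, and the sketch does not try to guess whether a length-delimited value is a string, raw bytes, or a nested message:

```python
import struct


def read_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not byte & 0x80:                # high bit clear: last byte
            return result, pos
        shift += 7


def decode_raw(buf):
    """Decode one message level into a list of (field_number, wire_type, value)."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                 # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:               # fixed 64-bit, little-endian
            if pos + 8 > len(buf):
                raise ValueError("truncated 64-bit field")
            value = struct.unpack_from("<Q", buf, pos)[0]
            pos += 8
        elif wire_type == 2:               # length-delimited
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = bytes(buf[pos:pos + length])
            pos += length
        elif wire_type == 5:               # fixed 32-bit, little-endian
            if pos + 4 > len(buf):
                raise ValueError("truncated 32-bit field")
            value = struct.unpack_from("<I", buf, pos)[0]
            pos += 4
        else:                              # groups (3, 4) are not handled here
            raise ValueError(f"unsupported wire type {wire_type}")
        fields.append((field_number, wire_type, value))
    return fields
```

To approximate what protoc --decode_raw prints, one option is to call decode_raw recursively on each length-delimited value and fall back to treating it as bytes or text when that fails. Without a schema this remains a guess, and signed or floating-point interpretations of the numeric fields are likewise ambiguous.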

jpa
0

From https://pypi.org/project/bbpb/

pip install bbpb

Demo code:

import base64
import json

import blackboxprotobuf

data = base64.b64decode('KglNb2RpZnkgTWU=')
type(data)
# bytes

message, typedef = blackboxprotobuf.protobuf_to_json(data)
print(message)
# {
#   "5": "Modify Me"
# }

Parse the body of a web response served with content-type: application/x-protobuf and content-encoding: br:

import json

import blackboxprotobuf
import requests

response = requests.get('http://www.example.com/some_protobuf_type_content')

# Optional: save the raw body to disk and read it back
with open('somefile.bin', 'wb') as f:
    f.write(response.content)
with open('somefile.bin', 'rb') as fr:
    data = fr.read()

# or just use `data = response.content`
message, typedef = blackboxprotobuf.protobuf_to_json(data)
type(message)  # str
msg_data = json.loads(message)
type(msg_data)  # dict
Ferris