
Say I have a message defined in test.proto as:

syntax = "proto3";

message TestMessage {
    int64 id = 1;
    string title = 2;
    string subtitle = 3;
    string description = 4;
}

And I use protoc to convert it to Python like so:

protoc --python_out=. test.proto

timeit for PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python:

from test_pb2 import TestMessage

%%timeit
tm = TestMessage()
tm.id = 1
tm.title = 'test title'
tm.subtitle = 'test subtitle'
tm.description = 'this is a test description'

6.75 µs ± 152 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

timeit for PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=cpp:

1.6 µs ± 115 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Compare that to just a dict:

%%timeit
tm = dict(
    id=1,
    title='test title',
    subtitle='test subtitle',
    description='this is a test description'
)

308 ns ± 2.47 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
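Most of the gap seems to be in the assignment path: every protobuf field set goes through a generated field descriptor that type-checks the value, while a dict just stores it. A plain `__slots__` class (a hypothetical stand-in for `TestMessage`, not part of my project) shows how cheap raw attribute assignment is by comparison:

```python
import timeit

class PlainMessage:
    # Stand-in for TestMessage: attribute sets go straight to instance
    # storage, with no protobuf field descriptors or type checking.
    __slots__ = ("id", "title", "subtitle", "description")

def build():
    tm = PlainMessage()
    tm.id = 1
    tm.title = 'test title'
    tm.subtitle = 'test subtitle'
    tm.description = 'this is a test description'
    return tm

# Time it the same way as the protobuf versions above.
per_loop = min(timeit.repeat(build, number=100_000, repeat=3)) / 100_000
print(f"{per_loop * 1e9:.0f} ns per loop")
```

The absolute number will vary by machine; the point is only the relative cost of the assignments themselves.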

This is also only for one message; the protobuf cpp implementation takes about 10.6 µs for my full project.

Is there a way to make this faster? Perhaps by compiling the generated module (test_pb2)?

Brendan Martin
  • Protocol buffers are widely-used, and pretty well-optimized already, so I doubt it. Also, you don't really "compile" a Python source file; you could use a different interpreter if you needed to (pypy, etc.). But in any case, do you have reason to believe that serialization specifically is a bottleneck in your application? – bnaecker May 13 '20 at 23:32
  • @bnaecker I was thinking there might be a way to output c++ and call those messages from python by building with setup.py somehow. It's a bottleneck for me because I'm parsing millions of rows of data into proto messages and it's taking 15+ hours – Brendan Martin May 14 '20 at 00:47
  • Do you mean write a C++ executable to do the serialization, and then call that from Python? If so, that would be more expensive than what you have (you need to get the data from Python to C++, which is...serialization, plus process overhead). Have you tried the standard tools for parallelizing CPU-bound work, like [`ProcessPoolExecutor`](https://docs.python.org/3/library/concurrent.futures.html?highlight=processpoolexecutor#concurrent.futures.ProcessPoolExecutor), [`joblib`](https://joblib.readthedocs.io/en/latest/) or similar? – bnaecker May 14 '20 at 01:26
  • @bnaecker I found this example which might be what I'm looking for https://yz.mit.edu/wp/fast-native-c-protocol-buffers-from-python/ – Brendan Martin May 14 '20 at 01:48
  • What protobuf and python versions are you using? – Ilan.K Aug 30 '20 at 20:17
  • @Ilan.K I'm using Python 3.8 and Protobuf version 3.9.2 – Brendan Martin Aug 30 '20 at 20:27
  • Hey, @BrendanMartin. Did you solve this issue? – peppered Feb 07 '22 at 16:50
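The parallelization bnaecker suggests in the comments could be sketched like this. `serialize_chunk` here uses `pickle` as a hypothetical stand-in for building and serializing one `TestMessage` per row; the real worker would do `TestMessage(**row).SerializeToString()` instead (and the chunk size and worker count below are arbitrary placeholders to tune):

```python
import pickle
from concurrent.futures import ProcessPoolExecutor

def serialize_chunk(rows):
    # Stand-in: pickle each row dict. Real code would build a
    # TestMessage from the row and call SerializeToString() here.
    return [pickle.dumps(row) for row in rows]

def parallel_serialize(rows, n_workers=4, chunk_size=10_000):
    # Split the rows into chunks and serialize each chunk in a
    # separate process, keeping results in input order via map().
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    out = []
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        for result in pool.map(serialize_chunk, chunks):
            out.extend(result)
    return out

if __name__ == "__main__":
    rows = [{"id": i, "title": f"title {i}"} for i in range(100)]
    blobs = parallel_serialize(rows, n_workers=2, chunk_size=25)
    print(len(blobs))
```

Since the work is CPU-bound, processes (not threads) are needed to get around the GIL; the chunking keeps the inter-process transfer overhead per task small relative to the serialization work.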

0 Answers