My proto file is as follows:

syntax = "proto3";
option csharp_namespace = "Proto";

message FileListRequest {
    repeated File Files = 1;
}

message File {
    string Path = 1;
}

message ImageFile {
    File File = 1;
    Size Size = 2;
    bytes Content = 3;
}

message Size {
    int32 Width = 1;
    int32 Height = 2;
}

message SendNextFile {
    
}

I compile it with the following command:

protoc --proto_path=. -I . --python_out=..\..\python\Modules\PreloadingIteratorWrapper\ .\filelist.proto

This creates the following file:

# -*- coding: utf-8 -*-
# Generated by the protocol buffer compiler.  DO NOT EDIT!
# source: filelist.proto
"""Generated protocol buffer code."""
from google.protobuf.internal import builder as _builder
from google.protobuf import descriptor as _descriptor
from google.protobuf import descriptor_pool as _descriptor_pool
from google.protobuf import symbol_database as _symbol_database
# @@protoc_insertion_point(imports)

_sym_db = _symbol_database.Default()




DESCRIPTOR = _descriptor_pool.Default().AddSerializedFile(b'\n\x0e\x66ilelist.proto\"\'\n\x0f\x46ileListRequest\x12\x14\n\x05\x46iles\x18\x01 \x03(\x0b\x32\x05.File\"\x14\n\x04\x46ile\x12\x0c\n\x04Path\x18\x01 \x01(\t\"F\n\tImageFile\x12\x13\n\x04\x46ile\x18\x01 \x01(\x0b\x32\x05.File\x12\x13\n\x04Size\x18\x02 \x01(\x0b\x32\x05.Size\x12\x0f\n\x07\x43ontent\x18\x03 \x01(\x0c\"%\n\x04Size\x12\r\n\x05Width\x18\x01 \x01(\x05\x12\x0e\n\x06Height\x18\x02 \x01(\x05\"\x0e\n\x0cSendNextFileB\x08\xaa\x02\x05Protob\x06proto3')

_builder.BuildMessageAndEnumDescriptors(DESCRIPTOR, globals())
_builder.BuildTopDescriptorsAndMessages(DESCRIPTOR, 'filelist_pb2', globals())
if _descriptor._USE_C_DESCRIPTORS == False:

  DESCRIPTOR._options = None
  DESCRIPTOR._serialized_options = b'\252\002\005Proto'
  _FILELISTREQUEST._serialized_start=18
  _FILELISTREQUEST._serialized_end=57
  _FILE._serialized_start=59
  _FILE._serialized_end=79
  _IMAGEFILE._serialized_start=81
  _IMAGEFILE._serialized_end=151
  _SIZE._serialized_start=153
  _SIZE._serialized_end=190
  _SENDNEXTFILE._serialized_start=192
  _SENDNEXTFILE._serialized_end=206
# @@protoc_insertion_point(module_scope)

According to the documentation this file should contain a class for each message type, but it doesn't. Why?

Ian Newson
  • what happens if you import this file and use dir() on it? – Kurt Apr 21 '22 at 20:19
  • @Kurt `['DESCRIPTOR', 'File', 'FileListRequest', 'ImageFile', 'SendNextFile', 'Size', '_FILE', '_FILELISTREQUEST', '_IMAGEFILE', '_SENDNEXTFILE', '_SIZE', '__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', '_builder', '_descriptor', '_descriptor_pool', '_sym_db', '_symbol_database']`. It's looking like the tutorial is incorrect and the classes are generated at runtime: https://github.com/protocolbuffers/protobuf/issues/2150 – Ian Newson Apr 21 '22 at 20:21
  • ya that was my guess, you can see bits and pieces of your classes in that serialized string passed to DESCRIPTOR – Kurt Apr 21 '22 at 20:43
  • See [Python Generated Code](https://developers.google.com/protocol-buffers/docs/reference/python-generated) specifically "The Python Protocol Buffers implementation is a little different from C++ and Java. In Python, the compiler only outputs code to build descriptors for the generated classes, and a Python metaclass does the real work." – DazWilkin Apr 22 '22 at 02:18
  • By the way, the [style guide](https://developers.google.com/protocol-buffers/docs/style) recommends that field names be `snake_case` so, for example, `FileListRequest` (is correct) but its field definition would be `repeated File files = 1;` (lowercase `f` in `files`) – DazWilkin Apr 22 '22 at 03:10
  • @DazWilkin I understand that, but the documentation also specifically states: "and some mysteriously empty classes, one for each message type". This is absent in my generated code. – Ian Newson Apr 22 '22 at 13:27
  • The documentation is incorrect or outdated. Your code should work with that generated code. Interestingly, I used [`grpcio-tools`](https://pypi.org/project/grpcio-tools/) to compile a repro of your question because it includes (a version of) `protoc`. That generated code does not include "mysteriously empty classes" either, but it is different again. – DazWilkin Apr 22 '22 at 15:40

2 Answers

I had the same issue. While experimenting with grpc-tools (an answer in another thread suggested this tool), I finally found a solution.

Just add the --pyi_out argument to the protoc command, like this:

protoc --proto_path=. -I . --python_out=..\..\python\Modules\PreloadingIteratorWrapper\ --pyi_out=..\..\python\Modules\PreloadingIteratorWrapper\ .\filelist.proto

This generates the _pb2.py file as well as a corresponding _pb2.pyi stub file. After that, your IDE will see the class names, IntelliSense will work, and so on.
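To illustrate why the stub file helps, here is a minimal standard-library sketch of the difference between what a tool sees by reading a file and what exists after the file has run. The `GENERATED_SOURCE` string and its `_build_messages` helper are hypothetical stand-ins for a generated module, not real protoc output; they only mimic the pattern of injecting names into `globals()`.

```python
import ast

# Hypothetical stand-in for a generated *_pb2.py (not real protoc output):
# the class names are injected into the module namespace when the module
# body runs, instead of being written out as `class` statements.
GENERATED_SOURCE = '''
def _build_messages(names, namespace):
    for name in names:
        namespace[name] = type(name, (), {})

_build_messages(["File", "Size", "ImageFile"], globals())
'''

# Static view: what a tool finds by reading the file without running it.
tree = ast.parse(GENERATED_SOURCE)
static_classes = [n.name for n in ast.walk(tree) if isinstance(n, ast.ClassDef)]

# Runtime view: what exists in the namespace after the module body has executed.
namespace = {}
exec(GENERATED_SOURCE, namespace)
runtime_classes = sorted(k for k, v in namespace.items() if isinstance(v, type))

print(static_classes)   # []
print(runtime_classes)  # ['File', 'ImageFile', 'Size']
```

A .pyi stub gives the IDE a static description of the classes, so it no longer has to execute anything to learn the names.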

Krzysiek
  • Thanks, this helped me. A small correction: `protoc --proto_path=. -I . --python_out=..\..\python\Modules\PreloadingIteratorWrapper\ --pyi_out=..\..\python\Modules\PreloadingIteratorWrapper\ .\filelist.proto` – Poonam Anthony Mar 02 '23 at 06:36

The documentation says:

Unlike when you generate Java and C++ protocol buffer code, the Python protocol buffer compiler doesn't generate your data access code for you directly.

This means that your .proto files are not converted into familiar accessor code: no classes, methods, or properties are written out in the generated file.

The docs then continue:

Instead (as you'll see if you look at addressbook_pb2.py) it generates special descriptors for all your messages, enums, and fields, and some mysteriously empty classes, one for each message type

Which means:

  • You'll see descriptors, parsed from the serialized .proto file
  • You'll see variables, named exactly like the messages in the .proto file

But these "variables" are not ordinary hand-written classes. They are classes created by GeneratedProtocolMessageType, which is a metaclass for protocol messages.

Next, look at the docs for GeneratedProtocolMessageType in the google.protobuf.internal.python_message package, where you should see this line:

Metaclass for protocol message classes created at runtime from Descriptors.

This line means that you won't see any of the expected properties or methods while you write code, because these variables only become message classes at runtime. At the lines where it is instantiated, the metaclass acts as a factory for your protocol message classes.

Moreover, this behavior is mentioned in the middle of the docs for that class:

The protocol compiler currently uses this metaclass to create protocol message classes at runtime.

It works like this:

We add implementations for all methods described in the Message class. We also create properties to allow getting/setting all fields in the protocol message. Finally, we create slots to prevent users from accidentally "setting" nonexistent fields in the protocol message, which then wouldn't get serialized / deserialized properly.

This metaclass generates the classes for the protocol messages described in your .proto files, using the descriptors from the generated files, and it does so only at runtime (not while you are writing code).
Only at runtime can you use those classes to construct messages (the classes you expected to see in the code) and as types for method parameters.
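The idea can be sketched with a toy metaclass. This is an illustration of the technique only, not protobuf's actual implementation: `ToyMessageType` and the dict-shaped `DESCRIPTOR` are hypothetical, and the field names are borrowed from the `Size` message in the question.

```python
class ToyMessageType(type):
    """Toy metaclass (hypothetical): builds a message class from a DESCRIPTOR dict."""

    def __new__(mcs, name, bases, namespace):
        namespace = dict(namespace)
        # Assumption for this sketch: DESCRIPTOR maps field names to defaults.
        fields = namespace.get("DESCRIPTOR", {})
        # Slots stop users from "setting" fields the descriptor does not declare.
        namespace["__slots__"] = tuple("_" + field for field in fields)
        # One property per field, for getting and setting its value.
        for field, default in fields.items():
            namespace[field] = mcs._make_property(field, default)
        return super().__new__(mcs, name, bases, namespace)

    @staticmethod
    def _make_property(field, default):
        def getter(self):
            # An unset slot falls back to the field's default value.
            return getattr(self, "_" + field, default)

        def setter(self, value):
            if not isinstance(value, type(default)):
                raise TypeError(f"{field} expects {type(default).__name__}")
            setattr(self, "_" + field, value)

        return property(getter, setter)


# The class is created by a call at runtime; no `class Size:` statement
# appears anywhere in the source.
Size = ToyMessageType("Size", (), {"DESCRIPTOR": {"Width": 0, "Height": 0}})

size = Size()
size.Width = 640
print(size.Width, size.Height)  # 640 0

try:
    size.Depth = 3  # not declared in the descriptor
except AttributeError:
    rejected = True
else:
    rejected = False
print(rejected)  # True
```

Because `Size` only exists after the `ToyMessageType(...)` call has run, an editor that reads the source without executing it has no `Width` or `Height` attributes to offer, which mirrors the situation with the generated _pb2.py file.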

So there is nothing wrong with the Google Protobuf documentation; it is not outdated.

dimankiev
  • But, if you're looking for the ways to actually generate the classes you might be looking for, you can look into something called [python-betterproto](https://github.com/danielgtaylor/python-betterproto), but be aware of [possible performance degradation](https://github.com/jdvor/pbwhy) – dimankiev Aug 25 '22 at 01:02
  • thanks. could you please provide a code that we could use in the way you answered? – Soroosh May 20 '23 at 07:49
  • well the documentation actually says `it generates special descriptors for all your messages, enums, and fields, and some mysteriously empty classes, one for each message type:`, so I, too, expected to find the classes in there. It is not outdated, but probably not 100% correct either. – mic Jun 11 '23 at 01:31