I need to create LMDBs dynamically that can be read by Caffe's data layer, and the constraint is that only C is available for doing so. No Python.
Another person examined the byte-level contents of a Caffe-ready LMDB file here: Caffe: Understanding expected lmdb datastructure for blobs
This is a good illustrative example but obviously not comprehensive. Drilling down led me to the Datum message type, defined by caffe.proto, and the ensuing caffe.pb.h file created by protoc from caffe.proto, but this is where I hit a dead end.
The Datum class in the .h file defines a method that appears to be a promising lead:
void SerializeWithCachedSizes(::google::protobuf::io::CodedOutputStream* output) const
I'm guessing this is where the byte-level magic happens for encoding messages before they're sent.
Question: can anyone point me to documentation (or anything) that describes how the encoding works, so I can replicate an abridged version of it? In the illustrative example, the LMDB file contains MNIST data and metadata, and 0x08 seems to signify that the next value is "Number of Channels". And 0x10 and 0x18 designate heights and widths, respectively. 0x28 appears to designate an integer label being next. And so on, and so forth.
I'd like to gain a comprehensive understanding of all possible bytes and their meanings.