I'm integrating my library with a deep learning framework and have run into memory issues. I suspect protobuf is the cause, but I've already spent too much time on this and would appreciate opinions and help. In short, the framework operates on deep learning models in ONNX format. It reads them into memory as onnx::ModelProto objects. Those objects are then passed to my library, where they are transformed (and optimized) into my custom representation and returned to the framework. onnx::ModelProto is a C++ class generated with protoc from https://github.com/onnx/onnx/blob/master/onnx/onnx.proto, i.e. a regular protobuf message.
The problem occurs when the ModelProto reaches my library. The main member of the ModelProto is the graph, which is stored as a pointer: onnx::GraphProto* onnx::ModelProto::graph_. When the object is passed to my library, the graph pointer holds a different address, one that does not point to a valid GraphProto object:
framework:
model_proto: 0x2ccb450
graph address: 0x2cc1d20
---
mylib:
model_proto: 0x2ccb450
graph address: 0x7fb6529c2560
The annoying part is that this only happens in Release builds. When I compile both the framework and my library in Debug, everything works correctly.
Also, before this error appeared, I was passing the ModelProto object to my library through a std::stringstream: I serialized the model to a string in the framework, created a stream from it, and deserialized it in my library. The graph was getting corrupted there too, right after deserialization finished, badly enough that I was getting segfaults further down in my code.
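To make the stream-based hand-off concrete, here is a minimal sketch of that kind of round trip. It assumes protobuf's SerializeToOstream and ParseFromIstream are used, which may differ from the exact calls in the framework. round_trip and FakeModel are hypothetical names: FakeModel is not a protobuf type and only mimics those two method signatures so the sketch compiles without the generated onnx.pb.h header.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>

// Serializes `in` into an in-memory stream and parses it back into `out`.
// Works with any type that exposes SerializeToOstream(std::ostream*) and
// ParseFromIstream(std::istream*), as protobuf messages such as
// onnx::ModelProto do.
template <typename Message>
bool round_trip(const Message& in, Message& out) {
    std::stringstream stream;
    if (!in.SerializeToOstream(&stream)) {
        return false;
    }
    return out.ParseFromIstream(&stream);
}

// Hypothetical stand-in, NOT a protobuf type: it only mimics the two method
// signatures so that this sketch is self-contained.
struct FakeModel {
    std::string payload;

    bool SerializeToOstream(std::ostream* output) const {
        output->write(payload.data(),
                      static_cast<std::streamsize>(payload.size()));
        return output->good();
    }

    bool ParseFromIstream(std::istream* input) {
        // Read every remaining byte, including embedded zero bytes.
        payload.assign(std::istreambuf_iterator<char>(*input),
                       std::istreambuf_iterator<char>());
        return !input->bad();
    }
};

// Small self-check of the helper using the stand-in type.
inline void round_trip_self_check() {
    FakeModel original{std::string("graph\0bytes", 11)};
    FakeModel copy;
    assert(round_trip(original, copy));
    assert(copy.payload == original.payload);
}
```

With the real generated class, the call would look like round_trip(model_from_framework, model_in_mylib) with both arguments of type onnx::ModelProto. In my setup the serializing side lives in the framework and the parsing side in my library, each using its own statically linked protobuf.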
Could this be related to the fact that both the framework and my library link statically against their own copies of protobuf? Protobuf is added as a dependency and compiled into both the framework and my library. I made sure both use the same protobuf version (currently 3.11) and the same ONNX version (1.6).
Here's how the dependencies and the workflow look: