I've been given a C++ application (a built executable and source code that doesn't build right now) that uses generated proto classes to send protobuf messages. I took the same .proto files it used to generate its classes, and I generated associated classes in a C# app. The intent is to be able to receive and send messages between these apps, using protobuf-net on the C# side. Note that both are using the proto2 format.

Messages with only simple type (e.g. int) members can be serialized and deserialized successfully. However, there seems to be an issue deserializing messages with nested message types into my C# application, e.g.

message Outer {
    optional Inner inner = 1;
}

message Inner {
    optional float f = 1;
}
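
For reference, a message of this shape is small enough to encode by hand, which gives a known-good byte sequence to compare against a capture. The sketch below follows the protobuf wire format and assumes the nested message is field 1 of Outer and the float is field 1 of Inner, as in the definitions above. The class and method names are hypothetical, and it deliberately does not use protobuf-net.

```csharp
using System;

public static class OuterWireExample
{
    // Encodes Outer { <field 1> = Inner { f = value } } by hand.
    public static byte[] EncodeOuter(float value)
    {
        // Reinterpret the float as its raw IEEE 754 bits so the bytes can be
        // emitted little-endian regardless of platform endianness.
        int bits = BitConverter.SingleToInt32Bits(value);
        return new byte[]
        {
            0x0A, // Outer field 1, wire type 2 (length-delimited): (1 << 3) | 2
            0x05, // length of the embedded Inner: 1 tag byte + 4 payload bytes
            0x0D, // Inner field 1, wire type 5 (32-bit): (1 << 3) | 5
            (byte)bits, (byte)(bits >> 8), (byte)(bits >> 16), (byte)(bits >> 24)
        };
    }
}
```

With an illustrative value of 1.5f, BitConverter.ToString(OuterWireExample.EncodeOuter(1.5f)) gives "0A-05-0D-00-00-C0-3F". A correctly framed incoming Outer payload should start with 0x0A in the same way, so anything else at the front of the buffer is worth a second look.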

A received message of type "Outer" will fail to deserialize in C# via:

Serializer.Deserialize<T>(new MemoryStream(msg)); // msg is a byte[]

giving an "Invalid Wire Type Exception." I followed the link here, but having looked at those answers, I didn't find anything immediately obvious relating to my situation. I'm 95% sure the source and destination generated classes are the same, the data isn't corrupt, and I'm deserializing to the correct type.
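
One way to sanity-check a buffer before handing it to the deserializer is to decode its first tag by hand. Every protobuf field starts with a varint whose low three bits are the wire type; only values 0 to 5 are defined, so a 6 or 7 there (or a field number that isn't in the .proto) suggests the buffer doesn't start at a message boundary. The helper below is a sketch with hypothetical names, not part of protobuf-net.

```csharp
using System;

public static class TagInspector
{
    // Decodes the varint tag starting at 'offset' and splits it into
    // the field number (upper bits) and the wire type (low three bits).
    public static (int FieldNumber, int WireType) ReadTag(byte[] data, int offset)
    {
        ulong tag = 0;
        for (int shift = 0; shift < 64; shift += 7)
        {
            if (offset >= data.Length)
                throw new ArgumentException("Truncated varint.");
            byte b = data[offset++];
            tag |= (ulong)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) // high bit clear: last byte of the varint
                return ((int)(tag >> 3), (int)(tag & 7));
        }
        throw new ArgumentException("Varint is longer than 10 bytes.");
    }
}
```

For the Outer message above, TagInspector.ReadTag(msg, 0) would be expected to report field 1 with wire type 2 (length-delimited). A different result at offset 0 is a hint that something precedes the payload.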

Can I correctly deserialize such nested types? Is there a compatibility issue with the way the classes were generated (and how it serializes) in the C++ app vs the C# app using protobuf-net?

Here is an example project (made in VS 2019 for .NET Core 3.1) which will reproduce the issue.

Touchdown
  • I'd generate byte arrays from both languages of the "same" instance and compare them. Given that the protocol is pretty straightforward it should then be possible to see where the difference lies. – Voo May 19 '20 at 09:07
  • @Voo Having compared a simple type message, the bytes are identical. I only have a built version of the C++ app and source code which doesn't build right now due to missing dependencies. I have grabbed the packets with wireshark and am converting that to a byte[] for comparison. However, because I don't know what values the incoming data for nested types should have, I don't know whether it'll be a useful comparison. I will dig around and see whether I can find out what the data is. – Touchdown May 19 '20 at 09:19
  • FWIW, this should just work, IMHO. It seems to be at the core of what protobuf should be able to do. Can you post a complete, self-contained test case? – 500 - Internal Server Error May 19 '20 at 12:12
  • @500-InternalServerError I'm not sure what you mean; do you want to be able to take this and run it yourself? Or do you want more detail on what's currently in the question? – Touchdown May 19 '20 at 12:44
  • Yes, it would be good if we could all try exactly what you're trying. – 500 - Internal Server Error May 19 '20 at 12:46
  • @500-InternalServerError Ok, it will be a bit of a verbose case since I'm copy-pasting hex wireshark output and converting it to a byte[]. Maybe I'll make a separate project and post a link to it instead of making the question massive with all the source code. – Touchdown May 19 '20 at 13:03
  • @500-InternalServerError Added link at the bottom of the question. – Touchdown May 19 '20 at 13:48
  • @Voo, 500-InternalServerError I've been playing around with it and I noticed that my serialized message is about 50 bytes shorter than the one I'm receiving. I get the feeling there's some extra "stuff" in the message that shouldn't(?) be there. I will try with some of the other failing types and see what they produce. Update: I just chopped off the first 50 bytes from the incoming message, and it now deserializes to something that looks sensible. I guess this extra "stuff" is the problem; I'm not sure where it came from because I know the message format - it has a fixed-length header. – Touchdown May 19 '20 at 14:40

1 Answer

My incoming data had about 50 extra bytes of "stuff" I wasn't expecting (i.e. not the message header), which didn't adhere to the defined message format, so the data was essentially corrupt. It was hard to tell this from looking at a stream of bytes; what gave it away was the difference in length of a message that I serialized on the C# side compared to the bytes I read from wireshark of a message coming in. I then looked at those bytes and found equivalent bytes to my serialized message a little ways in.
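
The comparison described above can be automated when you can serialize an equivalent message locally: search the captured bytes for the locally serialized bytes and report the offset where they start. This is a minimal sketch with hypothetical names; it assumes both messages carry the same field values, otherwise only a shared prefix (such as the leading tag and length) will match.

```csharp
using System;

public static class ByteSearch
{
    // Returns the first index at which 'needle' occurs in 'haystack', or -1 if it is absent.
    public static int IndexOf(byte[] haystack, byte[] needle)
    {
        if (needle.Length == 0) return 0;
        for (int i = 0; i + needle.Length <= haystack.Length; i++)
        {
            bool match = true;
            for (int j = 0; j < needle.Length; j++)
            {
                if (haystack[i + j] != needle[j]) { match = false; break; }
            }
            if (match) return i;
        }
        return -1;
    }
}
```

A result greater than 0 is the number of unexpected leading bytes, which is the length to skip before deserializing; -1 means the local bytes don't appear in the capture at all.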

Why there is extra "stuff" in the incoming message is another matter, but this is an implementation detail, so I'll consider this question closed.

If it helps anyone in a similar situation, I made a little test loop to keep trying to deserialize byte-by-byte until it works (could be improved but works for now):

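// Requires: using System; using System.IO; using ProtoBuf;
byte[] rawBytes = msg.GetBytes(); // The raw incoming message
OuterMsgType outer = null;

// Try each start offset in turn, beginning at 0, and stop at the end of the
// buffer so the loop cannot spin forever if nothing deserializes.
for (int offset = 0; offset < rawBytes.Length && outer == null; offset++)
{
    try
    {
        // Wrap the tail of the buffer without copying it.
        using (var stream = new MemoryStream(rawBytes, offset, rawBytes.Length - offset))
        {
            var candidate = Serializer.Deserialize<OuterMsgType>(stream);
            if (candidate?.InnerMsg != null)
                outer = candidate;
        }
    }
    catch (Exception)
    {
        // Not a valid message at this offset (e.g. invalid wire type); try the next one.
    }
}
// outer is still null here if no offset produced a message with InnerMsg set.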
Touchdown