
TL;DR

I can send a Google protobuf message over a Boost.Asio socket and receive it on the other side, but I cannot parse the received bytes back into a message that I can meaningfully use.

Code Files

I am currently working on sending from client to server. I will add server to client once the issue highlighted here is resolved.

Basically, I have tried 2 different approaches, and neither of them parses the message properly. I know the data is arriving because the reported buffer size comes out right (the top version is sometimes off, as it is in this example).

Here is the code for the client that sends to the server:

std::cout << "Socket connected." << std::endl;
// Here we create a google protobuf message.
demomessage sendmessage;
// Setting the fields: header, counted, atime
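sendmessage.set_header("Dope header"); // 11 characters: 13 bytes on the wire (tag + length + text)
sendmessage.set_atime(0.1234); // float: 5 bytes on the wire (tag + 4 bytes)
sendmessage.set_counted(234); // int32 varint: 3 bytes on the wire (tag + 2 bytes), 21 bytes total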
// what is the size
std::cout<< "size after serilizing is " << sendmessage.ByteSize() << std::endl;
// Now the buffer needs to be sent to the socket
sendMessages(session->m_sock, sendmessage);// the m_sock is just the connected socket - this works without a hitch.
std::cout << "Closing socket." << std::endl;

Note that demomessage is the object generated by the protoc compiler. Here is the definition for the sendMessages method call:

void sendMessages( boost::asio::ip::tcp::socket &thesock , demomessage &themessage)
{
    // This is all that I should need.
    boost::asio::streambuf b;
    std::ostream os(&b);

    google::protobuf::io::ZeroCopyOutputStream *raw_output = new google::protobuf::io::OstreamOutputStream(&os);
    google::protobuf::io::CodedOutputStream *coded_output = new google::protobuf::io::CodedOutputStream(raw_output);

    coded_output->WriteVarint32(themessage.ByteSize());
    themessage.SerializeToCodedStream(coded_output);

    delete coded_output;
    delete raw_output;

    // boost::asio::write(thesock, themessage);
    thesock.send(b.data());
}

The client sending code appears to work without a hitch. I pieced it together from other Stack Overflow posts (giving credit where due).
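For reference, here is a minimal sketch of the framing that sendMessages appears to put on the wire: a varint length prefix followed by the serialized payload. It uses only the standard library, and the names encodeVarint32 and frameMessage are hypothetical stand-ins for illustration, not protobuf or Boost functions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Encode a 32-bit value as a base-128 varint: 7 payload bits per byte,
// with the high bit set on every byte except the last one.
std::vector<std::uint8_t> encodeVarint32(std::uint32_t value) {
    std::vector<std::uint8_t> out;
    while (value >= 0x80) {
        out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<std::uint8_t>(value));
    return out;
}

// Build one frame: the varint length prefix followed by the payload bytes.
// The payload stands in for an already serialized message.
std::vector<std::uint8_t> frameMessage(const std::string& payload) {
    std::vector<std::uint8_t> frame =
        encodeVarint32(static_cast<std::uint32_t>(payload.size()));
    frame.insert(frame.end(), payload.begin(), payload.end());
    return frame;
}
```

Under this scheme a 21-byte message gets a 1-byte prefix, so 22 bytes travel over the socket. That would be consistent with a client reporting a size of 21 while the server counts 22 received bytes.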

Here is my server code. I have 2 different versions, separated by an #ifdef WHICHVER block so that the layout is easier to follow. They are effectively 2 independent programs that attempt the same task, yet neither of them works. I tried both because I found 2 different ways online of reading socket data into a boost::asio::streambuf object. Here is the full function, as it appears in my .cpp file:

void readBody(boost::asio::ip::tcp::socket &csock)
{
    // Create the object that stores the created message
    demomessage payload;
    // This should be the same for both methods
    const int hdrsize = 4;

    #ifdef WHICHVER

    boost::system::error_code ec;
    boost::asio::streambuf a_stream;
    int bytecount = 0;
    while ((bytecount = boost::asio::read(csock, a_stream, ec)) > 0)
    {
        std::cout << "received: " << &a_stream << std::endl;
        if (ec) {
            std::cout << "bytecount: " << std::__cxx11::to_string(bytecount) << std::endl;
            std::cout << "status: " << ec.message() << "\n";
            break;
        }
    }

    google::protobuf::uint32 siz = bytecount;
    char buffer [siz+hdrsize];
    buffer[0] = '\0';
    std::string str(buffer);
    std::cout << "Did I work: " << (payload.ParseFromString(str) ? "I worked" : "I did not work") << std::endl;

    #else

    // The dumb buffer for reading in the data size
    char d1[hdrsize];
    // boost::asio::read(csock, boost::asio::buffer(d1, 4), boost::asio::transfer_exactly(4));
    boost::asio::read(csock, boost::asio::buffer(d1, 1));
    google::protobuf::uint32 siz = readHdr(d1); // This is a helper method, included outside of this code block
    std::cout << "Message size: " << siz << std::endl;
    char buffer [siz+hdrsize];//size of the payload and hdr
    buffer[0] = '\0';
    for (int i = 0; i < siz; i++)
    {
        try {
            boost::asio::read(csock,  boost::asio::buffer(buffer, 1));
        }
        catch (boost::system::system_error &e)
        {
            std::cout << "Error occurred! Error code = " << e.code() << ". Message: " << e.what() << std::endl;
            onFinish();
        }
        std::cout << "Iter " << std::__cxx11::to_string(i) << ":  " << std::__cxx11::to_string(buffer[0]) << std::endl;
    }

    //Assign ArrayInputStream with enough memory
    google::protobuf::io::ArrayInputStream ais(buffer,siz+4);
    google::protobuf::io::CodedInputStream coded_input(&ais);
    //Read an unsigned integer with Varint encoding, truncating to 32 bits.
    coded_input.ReadVarint32(&siz);
    std::cout << "Again Message size: " << siz << std::endl; // This size is DIFFERENT!!
    // After the message's length is read, PushLimit() is used to prevent the CodedInputStream 
    // from reading beyond that length. Limits are used when parsing length-delimited embedded messages
    google::protobuf::io::CodedInputStream::Limit msgLimit = coded_input.PushLimit(siz);
    std::cout << "Did I work: " << (payload.ParseFromArray(buffer, siz+4) ? "I worked" : "I did not work") << std::endl;
    std::cout << "Did I work: " << (payload.ParseFromCodedStream(&coded_input) ? "I worked" : "I did not work") << std::endl;
    //Once the embedded message has been parsed, PopLimit() is called to undo the limit
    coded_input.PopLimit(msgLimit);

    #endif

    int thecounted = payload.counted();
    std::string counterstring = std::__cxx11::to_string(thecounted);
    float thetime = payload.atime();
    std::string stringtime = std::__cxx11::to_string(thetime);
    // payload.PrintDebugString();                                int32                    float
    std::cout << "received: " << payload.header() << " : " << counterstring << " : " << stringtime << std::endl;
}

Here is the readHdr helper function:

google::protobuf::uint32 readHdr(char *buf)
{
    google::protobuf::uint32 size;
    google::protobuf::io::ArrayInputStream ais(buf,4);
    google::protobuf::io::CodedInputStream coded_input(&ais);
    coded_input.ReadVarint32(&size);//Decode the HDR and get the size
    //cout<<"size of payload is "<<size<<endl;
    return size;
}
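For comparison with the bottom version of readBody, here is a minimal sketch of reading one length-prefixed frame: consume the varint prefix byte by byte, then read exactly that many payload bytes into consecutive positions of the buffer. ByteSource is a hypothetical stand-in for the connected socket so that the example needs neither Boost nor protobuf. readLengthPrefix and readFrame are likewise illustrative names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a connected socket: hands out bytes in order.
struct ByteSource {
    std::vector<std::uint8_t> data;
    std::size_t pos = 0;

    // Copy exactly n bytes into dst, or throw if the source runs dry.
    void readExact(std::uint8_t* dst, std::size_t n) {
        if (data.size() - pos < n) throw std::runtime_error("unexpected end of stream");
        for (std::size_t i = 0; i < n; ++i) dst[i] = data[pos++];
    }
};

// Read the varint length prefix one byte at a time, stopping at the first
// byte whose high bit is clear. A 32-bit varint is at most 5 bytes long.
std::uint32_t readLengthPrefix(ByteSource& src) {
    std::uint32_t value = 0;
    for (int shift = 0; shift < 35; shift += 7) {
        std::uint8_t byte = 0;
        src.readExact(&byte, 1);
        value |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;
    }
    throw std::runtime_error("malformed varint");
}

// Read one frame and return only the payload. The prefix has already been
// consumed, so the result holds the message bytes and nothing else.
std::string readFrame(ByteSource& src) {
    const std::uint32_t size = readLengthPrefix(src);
    std::string payload(size, '\0');
    if (size > 0) src.readExact(reinterpret_cast<std::uint8_t*>(&payload[0]), size);
    return payload;
}
```

With a real socket, the equivalent of readExact would presumably be boost::asio::read with a buffer of the full payload size, and the returned bytes would be handed to payload.ParseFromString or payload.ParseFromArray with a length equal to the payload size only. The key difference from the loop above is that each byte lands at its own offset instead of all of them going to buffer[0].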

And the last thing: here is the google protobuf definition.

syntax = "proto3";

message demomessage {
    // Header data, in case it is necessary.
    string header = 1;
    // Here we store integer count value
    int32 counted = 2;
    // And here we store a float value to represent the time or whatever
    float atime = 3;
}

Code Output

When #define WHICHVER is added at the top of the file, the top version of the code is compiled. Here is the server output for that version.

The server output

received: 

Dope header�$��=
bytecount: 22
status: End of file
Did I work: I worked
received:  : 0 : 0.000000
received: 

Dope header�$��=
bytecount: 22
status: End of file
Did I work: I worked
received:  : 0 : 0.000000
received: 

Dope header�$��=
bytecount: 22
status: End of file
Did I work: I worked
received:  : 0 : 0.000000

I want to point out that the text after the first "received:" is an EOF character (I just found that out today). Also, the weird characters are exactly that: binary data that the terminal cannot render as text. The string portion of the message is still readable because it does not get botched in transmission.
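One thing worth checking in the top version: buffer[0] is set to '\0' and then std::string str(buffer) is constructed, which treats the buffer as a C string. The sketch below, using only the standard library and the hypothetical helper names fromCString and fromBytes, shows how the two std::string constructors differ on binary data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Construct from a C string: copying stops at the first NUL byte.
std::string fromCString(const char* bytes) {
    return std::string(bytes);
}

// Construct from pointer plus length: every byte is kept, NULs included.
std::string fromBytes(const char* bytes, std::size_t length) {
    return std::string(bytes, length);
}
```

If the string handed to ParseFromString is empty, a proto3 message with no fields set is a valid result, which would likely explain "I worked" followed by an empty header, 0, and 0.000000. Note also that nothing in the top version copies the contents of a_stream into buffer, so this is an assumption about the cause rather than a confirmed diagnosis.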

And the client output.

The client output

Socket connected.
size after serilizing is 21
Closing socket.
Request #1 has completed. Response: 
Socket connected.
size after serilizing is 21
Closing socket.
Request #2 has completed. Response: 
Socket connected.
size after serilizing is 21
Closing socket.
Request #3 has completed. Response: 
Program completed.

Now, when I run the bottom version of the code, I get this output on the server. (The client output does not change with the new server code).

Message size: 21
Iter 0:  10
Iter 1:  11
Iter 2:  68
Iter 3:  111
Iter 4:  112
Iter 5:  101
Iter 6:  32
Iter 7:  104
Iter 8:  101
Iter 9:  97
Iter 10:  100
Iter 11:  101
Iter 12:  114
Iter 13:  16
Iter 14:  -22
Iter 15:  1
Iter 16:  29
Iter 17:  36
Iter 18:  -71
Iter 19:  -4
Iter 20:  61
Again Message size: 61
Did I work: I did not work
Did I work: I did not work
received:  : 0 : 0.000000
Message size: 21
Iter 0:  10
Iter 1:  11
Iter 2:  68
Iter 3:  111
Iter 4:  112
Iter 5:  101
Iter 6:  32
Iter 7:  104
Iter 8:  101
Iter 9:  97
Iter 10:  100
Iter 11:  101
Iter 12:  114
Iter 13:  16
Iter 14:  -22
Iter 15:  1
Iter 16:  29
Iter 17:  36
Iter 18:  -71
Iter 19:  -4
Iter 20:  61
Again Message size: 61
Did I work: I did not work
Did I work: I did not work
received:  : 0 : 0.000000
Message size: 21
Iter 0:  10
Iter 1:  11
Iter 2:  68
Iter 3:  111
Iter 4:  112
Iter 5:  101
Iter 6:  32
Iter 7:  104
Iter 8:  101
Iter 9:  97
Iter 10:  100
Iter 11:  101
Iter 12:  114
Iter 13:  16
Iter 14:  -22
Iter 15:  1
Iter 16:  29
Iter 17:  36
Iter 18:  -71
Iter 19:  -4
Iter 20:  61
Again Message size: 61
Did I work: I did not work
Did I work: I did not work
received:  : 0 : 0.000000

What I have been trying to do is parse the demomessage into its 3 respective data fields: payload.counted(), payload.atime(), and payload.header(). The output I always get is "received:  : 0 : 0.000000", and the output I want is Dope header, 0.1234, and 234.

I also noticed some patterns. In the bottom version, the last byte printed (61) equals the second reported message size ("Again Message size: 61"). The byte count also looks plausible to me, assuming each character of the string takes one byte and the float and the int take 4 bytes each.
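To check whether the bytes printed by the "Iter" lines are a valid message, they can be decoded by hand. The sketch below is a tiny standard-library decoder for just the three fields of demomessage. Decoded, readVarint, decodeDemo, and observedBytes are hypothetical names, and this is not how protobuf itself is implemented. Negative values from the log are written as their unsigned equivalents (-22 is 234, -71 is 185, -4 is 252).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Plain struct mirroring the three fields of demomessage (not protoc output).
struct Decoded {
    std::string header;
    std::int32_t counted = 0;
    float atime = 0.0f;
};

// Read a base-128 varint starting at pos and advance pos past it.
std::uint64_t readVarint(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    std::uint64_t value = 0;
    for (int shift = 0; shift < 64; shift += 7) {
        if (pos >= in.size()) throw std::runtime_error("truncated varint");
        const std::uint8_t byte = in[pos++];
        value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) return value;
    }
    throw std::runtime_error("malformed varint");
}

// Each field starts with a tag holding (field_number << 3) | wire_type.
Decoded decodeDemo(const std::vector<std::uint8_t>& in) {
    Decoded out;
    std::size_t pos = 0;
    while (pos < in.size()) {
        const std::uint64_t tag = readVarint(in, pos);
        const std::uint64_t field = tag >> 3;
        const std::uint64_t wire = tag & 0x7;
        if (field == 1 && wire == 2) {         // length-delimited string
            const std::size_t len = static_cast<std::size_t>(readVarint(in, pos));
            if (in.size() - pos < len) throw std::runtime_error("truncated string");
            out.header.assign(in.begin() + pos, in.begin() + pos + len);
            pos += len;
        } else if (field == 2 && wire == 0) {  // varint-encoded int32
            out.counted = static_cast<std::int32_t>(readVarint(in, pos));
        } else if (field == 3 && wire == 5) {  // 4-byte little-endian float
            if (in.size() - pos < 4) throw std::runtime_error("truncated float");
            std::uint32_t bits = 0;
            for (int i = 0; i < 4; ++i)
                bits |= static_cast<std::uint32_t>(in[pos + i]) << (8 * i);
            std::memcpy(&out.atime, &bits, sizeof(bits));
            pos += 4;
        } else {
            throw std::runtime_error("unexpected field");
        }
    }
    return out;
}

// The 21 bytes from the "Iter" lines of the server log.
std::vector<std::uint8_t> observedBytes() {
    return {10, 11, 68, 111, 112, 101, 32, 104, 101, 97, 100, 101, 114,
            16, 234, 1, 29, 36, 185, 252, 61};
}
```

Decoding these 21 bytes yields the header "Dope header", the count 234, and a float very close to 0.1234, so the data seems to arrive intact. The sizes are 13 bytes for the string field, 3 for the int, and 5 for the float, not 4 bytes each. The final byte, 61, is the last byte of the float. Because the read loop stores every byte at buffer[0], that is probably the value ReadVarint32 later picks up as the second "message size".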

So, the question, in summary:

What do I have to change to make either of the 2 versions of the message parsing work properly?

Here are a bunch of links that I have either referred to or that have given me inspiration and ideas:

https://stackoverflow.com/questions/15416270/read-only-desired-amount-of-bytes-using-boost-asio
https://stackoverflow.com/questions/31960010/boost-asio-streambuf
https://stackoverflow.com/questions/28929699/boostasio-read-n-bytes-from-socket-to-streambuf
https://stackoverflow.com/questions/14324060/boost-receive-data-from-the-tcp-socket
https://stackoverflow.com/questions/9496101/protocol-buffer-over-socket-in-c
https://stackoverflow.com/questions/31597861/c-linux-google-protobuf-boostasio-cannot-parse
https://stackoverflow.com/questions/26655733/protobuf-codedinputstream-parsing-partial-messages
https://stackoverflow.com/questions/19839849/googleprotobuf-boostasio-failure
https://stackoverflow.com/questions/37986439/handling-reset-by-peer-scenario-with-boostasio
https://stackoverflow.com/questions/56327248/sending-and-receiving-protobuf-data-over-socket-via-boost-asio
https://stackoverflow.com/questions/5679764/boostasiostreambuf-empty
https://stackoverflow.com/questions/3091152/looking-for-a-memorystream-in-c
https://stackoverflow.com/questions/4810026/sending-protobuf-messages-with-boostasio
https://stackoverflow.com/questions/27672591/boost-asio-send-and-receive-messages#27672995
https://stackoverflow.com/questions/8269452/google-protocol-buffers-parsedelimitedfrom-and-writedelimitedto-for-c
https://stackoverflow.com/questions/31960010/boost-asio-streambuf/31992879
https://stackoverflow.com/questions/37372993/boostasiostreambuf-how-to-reuse-buffer
https://stackoverflow.com/questions/28478278/working-with-boostasiostreambuf

https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream#CodedInputStream.ReadVarint32.details
https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.message_lite#MessageLite.MergeFromCodedStream
http://charette.no-ip.com:81/programming/doxygen/boost/group__read.html
http://pages.cs.wisc.edu/~starr/bots/Undermind-src/html/classgoogle_1_1protobuf_1_1io_1_1OstreamOutputStream.html
https://www.bogotobogo.com/cplusplus/Boost/boost_AsynchIO_asio_tcpip_socket_server_client_timer_bind_handler_multithreading_synchronizing_network_D.php 

I am happy to respond to constructive comments that add to the explanation of the question.

This image shows the borked output from the server when running the top code version. Notice the special characters you don't normally see.

Here is the borked text image.

robotsfoundme
  • You're sending the encoded size of the message as a varint before the message itself, but neither version of your code seems to read a message in that format. – Miles Budnek Nov 15 '19 at 02:59
  • The first thing to do here is get as far as knowing what bytes you received at the server, ideally in hex or base-64 (since protobuf is binary). Then you can use protoc, or a tool like https://protogen.marcgravell.com/decode to see if it was received validly. You mention an EOF character. But... what do you mean by that? There isn't an ASCII symbol of that. Do you mean it is nul-terminated? The most likely problem here is treating binary as text. Note that since text payloads are UTF-8 encoded inside protobuf, it is normal to be able to see them, even if the data has been corrupted. – Marc Gravell Nov 15 '19 at 07:50
  • @MilesBudnek Do you have an example for reading the size and then the message properly? I know I can find the message size properly in the bottom message parser (my function `readHdr()` seems to always give me the correct size, according to the client - I think I forgot to post that function on SO, will add it), but I am not getting the actual data properly out of the message. It looks like I get the correct amount of chars out (they are a byte in size), but, for some reason, it does return true when I look at the bool output of the parsing methods. – robotsfoundme Nov 15 '19 at 15:06
  • @MarcGravell How would you read in hex or base-64? I am not sure how to do that. To read the data: are you saying that protoc can read a saved-to-file streambuf of the data? I have not used protoc for reading buffers before. I see your online tool: How do I get the data in a format to use that tool? Could I just copypaste from the terminal? Or do I have to save it somehow? How do you recommend saving the buffer? Also, I only mention the eof thing because I thought it was relevant: I may be wrong about that. I figure binary data is the problem, but I dont know how to operate on that knowledge. – robotsfoundme Nov 15 '19 at 15:20
  • @thatrobotguy if the data is large, just save it as a file - a file is the easiest way to use `protoc` (IIRC you want the `--decode-raw` option, but check the command-line help), or you can upload the file to that site to verify it; if it is small, hex or base-64 is easiest; no, you can't use the terminal: the terminal is text, protobuf is binary (unless you pipe it direct to stdout). I don't know the easiest way to get hex or base-64 in C++ - I'm mostly a .NET person, where it would be `Convert.ToBase64String` for base-64 or `BitConverter.ToString` for hex – Marc Gravell Nov 15 '19 at 15:51
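Regarding the hex suggestion in the comments above, one possible way to get the received bytes into a copy-pasteable form in C++ is a small hex dump helper. This is a standard-library sketch, and toHex is a hypothetical name, not part of Boost or protobuf.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Render bytes as lowercase hex pairs separated by single spaces.
std::string toHex(const unsigned char* bytes, std::size_t length) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < length; ++i) {
        if (i != 0) out.push_back(' ');
        out.push_back(digits[bytes[i] >> 4]);    // high nibble
        out.push_back(digits[bytes[i] & 0x0F]);  // low nibble
    }
    return out;
}
```

Printing toHex over the receive buffer, rather than streaming the raw bytes to std::cout, keeps the terminal from mangling non-printable values and gives text that can be pasted into a decoder.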

0 Answers