8

I have an issue with storing Protobuf data to disk. My application uses Protocol Buffers to transfer data over a socket (which works fine), but storing the data to disk fails. Saving the data reports no errors, but I cannot load it back properly. Any tips would be greatly appreciated.

void writeToDisk(DataList & dList)
{
    // open streams
    int fd = open("serializedMessage.pb", O_WRONLY | O_CREAT);
    google::protobuf::io::ZeroCopyOutputStream* fileOutput = new google::protobuf::io::FileOutputStream(fd);
    google::protobuf::io::CodedOutputStream* codedOutput = new google::protobuf::io::CodedOutputStream(fileOutput);

    // save data
    codedOutput->WriteLittleEndian32(PROTOBUF_MESSAGE_ID_NUMBER); // store with message id
    codedOutput->WriteLittleEndian32(dList.ByteSize()); // the size of the data I will serialize
    dList.SerializeToCodedStream(codedOutput); // serialize the data

    // close streams
    delete codedOutput;
    delete fileOutput;

    close(fd);
}

I've verified the data inside this function: dList contains the data I expect. The streams report no errors and a reasonable number of bytes written, and the file on disk is of a reasonable size. But when I try to read the data back, it does not work. What is really strange is that if I append more data to this file, I can read the first messages (but not the one at the end).

void readDataFromFile()
{   
    // open streams
    int fd = open("serializedMessage.pb", O_RDONLY);
    google::protobuf::io::ZeroCopyInputStream* fileinput = new google::protobuf::io::FileInputStream(fd);
    google::protobuf::io::CodedInputStream* codedinput = new google::protobuf::io::CodedInputStream(fileinput);

    // read back
    uint32_t sizeToRead = 0, magicNumber = 0;
    string parsedstr = "";

    codedinput->ReadLittleEndian32(&magicNumber); // the message id-number I expect
    codedinput->ReadLittleEndian32(&sizeToRead); // the reported data size, also what I expect
    codedinput->ReadString(&parsedstr, sizeToRead); // the size() of 'parsedstr' is much less than it should be (sizeToRead)

    DataList dl = DataList();

    if (dl.ParseFromString(parsedstr)) // fails
    {
        // work with data if all okay
    }

    // close streams
    delete codedinput;
    delete fileinput;
    close(fd);
}

Obviously I have omitted some of the code here to keep things simple. As a side note, I have also tried serializing the message to a string and saving that string via CodedOutputStream. This does not work either. I have verified the contents of that string, though, so I guess the culprit must be the stream functions.

This is a Windows environment: C++ with Protocol Buffers and Qt.

Thank you for your time!

almagest
  • Why the hell are you using `new` and explicitly calling destructors? That makes no sense whatsoever. – Konrad Rudolph Sep 25 '12 at 10:53
  • I've edited to fix this issue. I have no idea why that seemed like a good idea at the time. Good catch, but not enough to fix my issue. – almagest Sep 25 '12 at 11:32
  • You actually only fixed half the issue: using pointers and `new` here still makes no sense. But yes, that’s unlikely to be related to your problem. – Konrad Rudolph Sep 25 '12 at 11:41
  • 4
    He's probably using `new` and `delete` unnecessarily by looking at the `CodedOutputStream` example https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.io.coded_stream – g19fanatic Dec 06 '12 at 12:56

3 Answers

6

I solved this issue by switching from file descriptors to fstream, and from FileOutputStream to OstreamOutputStream.

Although I've seen examples using the former, it didn't work for me.

I found a nice code example hidden in the Google coded_stream header. link #1

Also, since I needed to serialize multiple messages to the same file using Protocol Buffers, this link was enlightening. link #2

For some reason, the output file is not 'complete' until I actually destruct the stream objects.
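The likely explanation for that last observation is buffering: the stream adaptors hold bytes in memory and hand them to the underlying stream when they are flushed or destroyed. The sketch below is not protobuf code. `BufferedWriter` and `sinkWhileWriterAlive` are hypothetical names that use only the standard library to imitate that behaviour, so treat it as an illustration of the effect rather than of the library's internals.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a buffering stream adaptor: bytes are held in
// memory and only forwarded to the underlying ostream on flush or destruction.
class BufferedWriter {
public:
    explicit BufferedWriter(std::ostream& sink) : sink_(sink) {}
    ~BufferedWriter() { flush(); }
    BufferedWriter(const BufferedWriter&) = delete;
    BufferedWriter& operator=(const BufferedWriter&) = delete;

    void write(const std::string& bytes) { buffer_ += bytes; }

    void flush() {
        sink_.write(buffer_.data(), static_cast<std::streamsize>(buffer_.size()));
        buffer_.clear();
    }

private:
    std::ostream& sink_;
    std::string buffer_;
};

// Returns what the sink holds while the writer is still alive; `after`
// receives what the sink holds once the writer has been destroyed.
std::string sinkWhileWriterAlive(const std::string& bytes, std::string& after) {
    std::ostringstream sink;
    std::string during;
    {
        BufferedWriter writer(sink);
        writer.write(bytes);
        during = sink.str();  // nothing has reached the sink yet
    }                         // writer destroyed here, buffer flushed
    after = sink.str();
    return during;
}
```

The practical consequence is to keep the stream objects in a nested scope (or destroy them explicitly) before closing or reopening the file, so that everything has been written by then.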

almagest
3

The read fails because the file is not opened with O_BINARY. Change the open call to this and it works:

int fd = open("serializedMessage.pb", O_RDONLY | O_BINARY);

The root cause is the same as here: "read() only reads a few bytes from file". You were very likely following an example in the protobuf documentation which opens the file in the same way. On Windows that opens the file in text mode, where reading stops at the first 0x1A (Ctrl-Z) byte because it is treated as end-of-file, and CR LF pairs are translated.
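To make the binary-mode point concrete, here is a minimal sketch that uses only the standard library, with `std::ios::binary` playing the role of O_BINARY. It writes the same magic/size/payload layout as the question and reads it back. `writeFramed`, `readFramed` and the little-endian helpers are hypothetical names, and the payload is just a byte string standing in for a serialized message.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>

// Write a 32-bit value as four little-endian bytes.
void writeLE32(std::ostream& out, std::uint32_t v) {
    for (int i = 0; i < 4; ++i)
        out.put(static_cast<char>((v >> (8 * i)) & 0xFF));
}

// Read four little-endian bytes back into a 32-bit value.
bool readLE32(std::istream& in, std::uint32_t& v) {
    v = 0;
    for (int i = 0; i < 4; ++i) {
        int c = in.get();
        if (c == std::char_traits<char>::eof()) return false;
        v |= static_cast<std::uint32_t>(static_cast<unsigned char>(c)) << (8 * i);
    }
    return true;
}

// Hypothetical helper: store [magic][size][payload], opening in binary mode
// so that no byte is translated or treated as end-of-file.
bool writeFramed(const std::string& path, std::uint32_t magic,
                 const std::string& payload) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) return false;
    writeLE32(out, magic);
    writeLE32(out, static_cast<std::uint32_t>(payload.size()));
    out.write(payload.data(), static_cast<std::streamsize>(payload.size()));
    return static_cast<bool>(out);
}

// Hypothetical helper: read the header, then exactly `size` payload bytes.
bool readFramed(const std::string& path, std::uint32_t& magic,
                std::string& payload) {
    std::ifstream in(path, std::ios::binary);
    std::uint32_t size = 0;
    if (!in || !readLE32(in, magic) || !readLE32(in, size)) return false;
    payload.resize(size);
    in.read(&payload[0], static_cast<std::streamsize>(size));
    return static_cast<std::size_t>(in.gcount()) == size;
}
```

A payload that contains 0x1A or 0x0D 0x0A is the interesting case: without the binary flag on Windows, the read side would be expected to come up short, which matches the symptom in the question.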

Also, in more recent versions of the library, you can use protobuf::util::ParseDelimitedFromCodedStream to simplify reading size+payload pairs.
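As far as I know, the delimited helpers frame each message as a varint-encoded length followed by the message bytes, instead of the fixed 32-bit size used in the question. The sketch below shows that framing with the standard library only. It does not call protobuf, and `appendDelimited` / `readDelimited` are hypothetical names operating on plain byte strings.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Append `value` as a base-128 varint: 7 bits per byte, low bits first,
// with the high bit set on every byte except the last.
void appendVarint32(std::string& out, std::uint32_t value) {
    while (value >= 0x80) {
        out.push_back(static_cast<char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<char>(value));
}

// Decode a varint starting at `pos`; advances `pos` past it on success.
bool readVarint32(const std::string& in, std::size_t& pos, std::uint32_t& value) {
    std::uint32_t result = 0;
    std::size_t p = pos;
    for (int shift = 0; shift < 35; shift += 7) {
        if (p >= in.size()) return false;
        std::uint8_t byte = static_cast<std::uint8_t>(in[p++]);
        result |= static_cast<std::uint32_t>(byte & 0x7F) << shift;
        if ((byte & 0x80) == 0) {
            value = result;
            pos = p;
            return true;
        }
    }
    return false;  // more than 5 bytes: not a valid 32-bit varint
}

// Append one length-delimited record: varint size, then the payload bytes.
void appendDelimited(std::string& out, const std::string& payload) {
    appendVarint32(out, static_cast<std::uint32_t>(payload.size()));
    out += payload;
}

// Read the next record starting at `pos`; returns false at end of input
// or if the buffer is truncated. `pos` only advances on success.
bool readDelimited(const std::string& in, std::size_t& pos, std::string& payload) {
    std::uint32_t size = 0;
    std::size_t p = pos;
    if (!readVarint32(in, p, size) || in.size() - p < size) return false;
    payload.assign(in, p, size);
    pos = p + size;
    return true;
}
```

Calling `readDelimited` in a loop until it returns false walks through every record in the buffer, which is the pattern needed for several messages in one file.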

... the question may be ancient, but the issue still exists and this answer is almost certainly the fix to the original problem.

-1

Try to use

codedinput->ReadRaw instead of ReadString

and

dl.ParseFromArray instead of ParseFromString

I'm not very familiar with Protocol Buffers, but ReadString might only read a field of type string.

CinCout
Marius