
I'm looking to serialize a class that contains std::string to file and easily load this data in Python:

class A {
  public:
    int a;
    char b;
    bool c;
    std::string s1;
    std::string s2;
};

I have a very busy thread that deals with many instances of A. It takes interesting ones and adds them to a class for a less busy thread to write later.

class Blob {
public:
   char data[1024];
   size_t length;
};

void createBlob(void *data, int length) {
  Blob saved_a;
  saved_a.length = length;
  memcpy(saved_a.data, data, length);  // byte-wise copy of the caller's object
}

Then the low-priority thread asynchronously writes the blobs to a file:

file.write(reinterpret_cast<const char*>(&saved_a.length), sizeof(saved_a.length));
file.write(saved_a.data, saved_a.length);

These files are then read in Python, using the struct library to unpack the data and handle endianness.

I don't have a great way to store the std::string (partially because I don't understand what guarantees there are on the lifetime of a std::string). Would the logging thread be able to cast saved_a.data to a type A and then read the strings? Or does the memcpy only save pointers to strings that may no longer be valid?

Copying the A structure isn't really possible because createBlob can take many different data structures (it only requires a void * and a size). I'm willing to sacrifice platform independence and to work out and test the struct packing by hand so that the Python parser works, but I really need to minimize the load put on the function that creates blobs, and I need to ensure that it can create blobs of many different data types.

If the std::string would remain valid when the low-priority logger gets to it, the logger could cast the data back and do a full copy. Otherwise, is there a lightweight solution to serialize the structure before passing it to the createBlob function (comparable in performance to doing the memcpy)?

user2411693
  • And std::string's are pointers? From the other questions, it looks like they behave as reference counted pointers with copy constructor implemented by default, in which case, I'd guess memcpy invalidates them. I'll strike that line of thought from the question. – user2411693 Aug 19 '15 at 23:31
  • `memcpy`ing anything that isn't [trivially copyable](http://en.cppreference.com/w/cpp/concept/TriviallyCopyable) is unspecified at best, and will probably result in [undefined behavior](http://stackoverflow.com/questions/29777492/why-would-the-behavior-of-stdmemcpy-be-undefined-for-objects-that-are-not-triv) at some point. `std::string`s are not trivially copyable, so no class that has them as members will be either. – user657267 Aug 20 '15 at 00:14

2 Answers


memcpy does not follow pointers: it copies the pointer value itself, not the data it points to. So it will not capture anything that your structure only refers to through a pointer, which includes the character buffer of a std::string. There is no simple way to do that automatically. For strings, though, you can write their bytes directly into the buffer, using a zero byte as the end-of-string marker. Something like this:

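#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

class A {
  public:
    int a;
    char b;
    bool c;
    std::string s1;
    std::string s2;

    // Serialized size: the fixed fields plus each string and its zero terminator.
    std::size_t length() const
    {
        return sizeof(a) + sizeof(b) + sizeof(c) + s1.length() + 1 + s2.length() + 1;
    }

    // Packs the fields back to back (no padding) into a byte buffer.
    std::vector<char> toByteArray() const
    {
        std::vector<char> res(length());
        std::size_t pos = 0;
        pos += writeBytes(res.data() + pos, &a, sizeof(a));
        pos += writeBytes(res.data() + pos, &b, sizeof(b));
        pos += writeBytes(res.data() + pos, &c, sizeof(c));
        pos += writeBytes(res.data() + pos, s1.c_str(), s1.length() + 1); // +1 copies the zero terminator
        pos += writeBytes(res.data() + pos, s2.c_str(), s2.length() + 1);
        return res;
    }

  private:
    // Copies n bytes from src to dst and returns n so the caller can advance.
    static std::size_t writeBytes(char* dst, const void* src, std::size_t n)
    {
        std::memcpy(dst, src, n);
        return n;
    }
};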

Also, never use memcpy to copy a class that is not trivially copyable. For a class with virtual functions, for example, it also copies the virtual table pointer, not just the member variables.
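One way to catch this mistake at compile time is to wrap the memcpy in a template that checks `std::is_trivially_copyable`. The sketch below is only an illustration: `makeBlob`, `Plain` and `WithString` are hypothetical names, and `Blob` mirrors the layout from the question.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

struct Blob {
    char data[1024];
    std::size_t length;
};

// Hypothetical typed wrapper: refuses, at compile time, any type whose raw
// bytes are not a faithful copy of its value, or that does not fit in a Blob.
template <typename T>
Blob makeBlob(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be stored with memcpy");
    static_assert(sizeof(T) <= sizeof(Blob::data), "T does not fit in a Blob");
    Blob blob{};  // zero-initialized so unused bytes are deterministic
    blob.length = sizeof(T);
    std::memcpy(blob.data, &value, sizeof(T));
    return blob;
}

struct Plain {        // only scalar members: trivially copyable
    int a;
    char b;
    bool c;
};

struct WithString {   // std::string member: not trivially copyable
    int a;
    std::string s1;
};

static_assert(std::is_trivially_copyable<Plain>::value, "Plain is safe to memcpy");
static_assert(!std::is_trivially_copyable<WithString>::value, "WithString is not");
```

With this in place, `makeBlob(Plain{...})` compiles, while `makeBlob(WithString{...})` is rejected with the first assertion message instead of silently copying a dangling pointer.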

maxpovver

No, of course not. You cannot shoehorn strings into blobs using memcpy(). What is worse, it might actually appear to work on some data because of the small string optimization used by some implementations, and then it will mysteriously break on another set of data. If you want your data to be binary-serializable (I personally find binary serialization quite outdated), replace the strings in your class with some sort of CharArray implementation that uses a fixed-size array as storage. I personally prefer proper serialization.
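A minimal sketch of the CharArray idea might look like the following. The `CharArray` template, the `PackedA` struct and the capacity of 32 are all assumptions made for illustration; strings longer than the capacity are silently truncated here.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical fixed-capacity string: the characters live inside the object,
// so copying the object's bytes copies the text too.
template <std::size_t N>
struct CharArray {
    char data[N];  // zero-terminated and zero-padded after assign()

    void assign(const std::string& s) {
        const std::size_t n = std::min(s.size(), N - 1);  // truncates long input
        std::memcpy(data, s.data(), n);
        std::memset(data + n, 0, N - n);
    }

    // Only meaningful after assign() has been called at least once.
    std::string str() const { return std::string(data); }
};

struct PackedA {
    int a;
    char b;
    bool c;
    CharArray<32> s1;  // capacities are arbitrary choices for this sketch
    CharArray<32> s2;
};

static_assert(std::is_trivially_copyable<PackedA>::value,
              "PackedA can be copied with memcpy");
```

The trade-off is a fixed upper bound on string length and some wasted space per record, in exchange for a layout that memcpy can copy and that a Python reader can describe with a fixed struct format.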

SergeyA