I have a software framework that compiles and runs successfully on both Mac and Linux. I am now trying to port it to Windows (using MinGW). So far, I have the software compiling and running under Windows, but it's inevitably buggy. In particular, reading data that was serialized on macOS (or Linux) into the Windows version of the program fails with segfaults.

The serialization process serializes values of primitive variables (longs, ints, doubles etc.) to disk.

This is the code I am using:

#include <iostream>
#include <fstream>

// Write the raw bytes of var to outFile.
template <class T>
void serializeVariable(const T &var, std::ofstream &outFile)
{
    outFile.write(reinterpret_cast<const char *>(&var), sizeof(var));
}

// Read sizeof(var) raw bytes from inFile into var.
template <class T>
void readSerializedVariable(T &var, std::ifstream &inFile)
{
    inFile.read(reinterpret_cast<char *>(&var), sizeof(var));
}

So to save the state of a bunch of variables, I call serializeVariable for each variable in turn. Then to read the data back in, calls are made to readSerializedVariable in the same order in which they were saved. For example to save:

::serializeVariable<float>(spreadx,outFile);
::serializeVariable<int>(objectDensity,outFile);
::serializeVariable<int>(popSize,outFile);

And to read:

::readSerializedVariable<float>(spreadx,inFile);
::readSerializedVariable<int>(objectDensity,inFile);
::readSerializedVariable<int>(popSize,inFile);

But on Windows, reading the serialized data fails. I am guessing that Windows serializes data a little differently. Is there a way to modify the above code so that data saved on any platform can be read on any other platform? Any ideas?

Cheers,

Ben.

Ben J
  • What are the T types? I can imagine sizeof(var) differing between compilers, for example as a result of structure element alignment. Consider another serialization mechanism, such as XML. – Alex F Nov 16 '11 at 15:55
  • The types are primitive variables (see edited post for example). I had no idea the compiler could produce a different value for sizeof(long), for example. Are you sure? I thought a long was always 4 bytes and a char always 1 byte, independent of the compiler etc. – Ben J Nov 16 '11 at 16:01
  • Yes, absolutely. We deal with several platforms, and on some a long is 4 bytes and on some it is 8 bytes. I believe the standard only states a minimum and not an exact number of bytes for the data types. – Dan Nov 16 '11 at 16:41
  • That being the case, bugger! Thanks! Will look into XML as Alex suggested. Just confirmed that this is the problem: on my Mac a long is 8 bytes; on my Windows platform it's 4. – Ben J Nov 16 '11 at 16:48
  • You might also want to look into Boost's serialization library or JSON (Jansson, I think), as I've found them to be lighter weight. – Dan Nov 16 '11 at 16:53
  • Thanks, will do. Additionally, I've just run tests for all the other primitive variable types and they're all the same size on both my Mac and Windows platforms. I've also realised I never serialize long primitives anyhow, so there should never be any differences in byte size. It seems I'm back to square one; I'm still not 100% sure what the issue is. I guess an alternative library is still an option though. – Ben J Nov 16 '11 at 17:03

4 Answers

This is just a wild guess; sorry I can't help more. My idea is that the byte order is different: big endian vs. little endian. If so, anything larger than one byte will be scrambled when loaded on a machine with the opposite byte order.

For example, I found this piece of code on MSDN:

int isLittleEndian() {
    long int testInt = 0x12345678;
    char *pMem;

    /* Point at the first (lowest-addressed) byte of testInt. */
    pMem = (char *) &testInt;
    if (pMem[0] == 0x78)
        return(1);
    else
        return(0);
}

I guess you will get different results on Linux vs. Windows. The best case would be a compiler flag that selects one format or the other, which you could then set the same way on all machines.

Hope this helps, Alex

Fritz
  • Thanks Alex, am going to run some tests to establish the byte ordering of my different platforms.. – Ben J Nov 16 '11 at 16:07
  • Well, it seems that the byte ordering is exactly the same on both my mac machine and my windows platform so I doubt the endianness has anything to do with it. Thanks though.. – Ben J Nov 16 '11 at 16:17
  • Dan might also be right that that's the problem. In particular, int doesn't have to be 4 bytes, even though it usually is. Unfortunately I don't have any first-hand experience. By the way, what exactly do you mean by "reading fails"? For any data type? Do you get a wrong value or an error? – Fritz Nov 16 '11 at 16:49
  • Thanks @Fritz -- I'm still debugging the code, wondering what the heck is going on...will update when I've figured it out.. – Ben J Nov 16 '11 at 20:50

Binary serialization like this should work fine across those platforms. You do have to honor endianness, but that is trivial. I don't think these three platforms have any conflicts in this respect.
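
As a rough illustration of what honoring endianness can look like, the sketch below writes a 32-bit unsigned value one byte at a time, least significant byte first, so the file layout does not depend on the host's byte order. The helper names writeU32LE and readU32LE are made up for this example. It assumes a C++11 compiler (for <cstdint>) and streams opened in binary mode.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: write v as 4 bytes, least significant byte first.
inline void writeU32LE(std::ostream &out, std::uint32_t v)
{
    char bytes[4];
    for (int i = 0; i < 4; ++i)
        bytes[i] = static_cast<char>((v >> (8 * i)) & 0xFFu);
    out.write(bytes, 4);
}

// Hypothetical helper: read 4 bytes written by writeU32LE.
// Returns false if the stream could not supply 4 bytes.
inline bool readU32LE(std::istream &in, std::uint32_t &v)
{
    unsigned char bytes[4];
    if (!in.read(reinterpret_cast<char *>(bytes), 4))
        return false;
    v = 0;
    for (int i = 0; i < 4; ++i)
        v |= static_cast<std::uint32_t>(bytes[i]) << (8 * i);
    return true;
}
```

Because the shifts operate on the value rather than on its in-memory representation, the same code produces the same file bytes on big-endian and little-endian hosts.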

You really can't use such loosely specified types when you do this, though. The sizes of int, float, and size_t can all change across platforms.

For integer types, use the fixed-width types found in the cstdint header: uint32_t, int32_t, etc. Windows doesn't have the header available, IIRC, but you can use boost/cstdint.hpp instead.
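
A minimal sketch of that suggestion, assuming a compiler that ships <cstdint> (C++11): every integer field goes through a fixed-width type, so each record occupies the same number of bytes on every platform. The helper names writeInt32 and readInt32 are invented for this example, and the bytes are still written in the host's byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: always store exactly 4 bytes for an integer field,
// whatever sizeof(int) or sizeof(long) is on the host.
inline void writeInt32(std::ostream &out, std::int32_t value)
{
    out.write(reinterpret_cast<const char *>(&value), sizeof value);
}

// Hypothetical helper: read back a field written by writeInt32.
// Returns false if fewer than 4 bytes were available.
inline bool readInt32(std::istream &in, std::int32_t &value)
{
    in.read(reinterpret_cast<char *>(&value), sizeof value);
    return !in.fail();
}
```

A field declared as int or long in the program would be converted to std::int32_t (or std::int64_t if it needs the range) at the call site, which makes the on-disk size an explicit decision.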

Floating point should work as most compilers follow the same IEEE specs.
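
One way to turn that assumption into a build-time check, sketched here for a C++11 compiler, is a set of static_assert declarations. If a target platform ever used a different floating-point layout or size, compilation would stop instead of producing unreadable files.

```cpp
#include <cassert>
#include <limits>

// Compile-time checks (C++11): fail the build if float/double do not use
// the IEEE 754 layout and sizes that the raw-byte approach relies on.
static_assert(std::numeric_limits<float>::is_iec559, "float is not IEEE 754");
static_assert(std::numeric_limits<double>::is_iec559, "double is not IEEE 754");
static_assert(sizeof(float) == 4, "float is not 4 bytes");
static_assert(sizeof(double) == 8, "double is not 8 bytes");
```

These checks say nothing about byte order, which still has to match between writer and reader.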

See also: C - Serialization of the floating point numbers (floats, doubles)

Binary serialization really needs thorough unit testing. I would strongly recommend investing the time.

Tom Kerr
  • Thanks Tom. I'm really trying to figure out what's going on with my implementation. Will update when it's fixed. :-) – Ben J Nov 16 '11 at 20:51
  • Stupidly, and as suggested by user103749, I'd for some reason forgotten to incorporate std::ios::binary when opening the file stream. Since not bothering with this always worked under linux and mac, the bug just slipped through. Glad I stuck with this method of serialization though. Cheers. – Ben J Nov 18 '11 at 14:45

Just one more wild guess: you forgot to open the file in binary mode, and on Windows a text-mode file stream converts the byte sequence 13,10 (CR LF) to 10 (LF) when reading.
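
To make the guess concrete, here is a small sketch that opens both streams with std::ios::binary. The function name roundTripBinary and its path parameter are hypothetical, chosen only for this illustration. An int with the value 10 is a handy test case because it contains the byte 10, which a text-mode stream on Windows writes out as 13,10, shifting everything stored after it.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Hypothetical helper: write one int to `path` and read it back,
// opening both streams with std::ios::binary so no newline
// translation is applied to the raw bytes.
inline bool roundTripBinary(const std::string &path, int value, int &result)
{
    {
        std::ofstream outFile(path.c_str(), std::ios::out | std::ios::binary);
        if (!outFile)
            return false;
        outFile.write(reinterpret_cast<const char *>(&value), sizeof value);
    } // outFile is flushed and closed here

    std::ifstream inFile(path.c_str(), std::ios::in | std::ios::binary);
    if (!inFile)
        return false;
    inFile.read(reinterpret_cast<char *>(&result), sizeof result);
    return !inFile.fail();
}
```

On Linux and macOS text mode and binary mode behave the same, which is why leaving out the flag can go unnoticed there.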

fghj
  • My apologies, but I have absolutely no idea what you mean. Could you clarify a little? – Ben J Nov 16 '11 at 20:52
  • I mean, do you use something like std::ios::binary when you open the file stream on Windows? – fghj Nov 17 '11 at 04:12
  • Thanks for your suggestion. I realised that in a few places, I had indeed forgotten to employ std::ios::binary. I feel a bit silly about this, but now it all seems fine and the serialization is working as expected :-) – Ben J Nov 18 '11 at 14:42

Have you considered using a serialization library or format, for example:

  • XDR (supported by libc) or ASN1
  • s11n (a C++ serialization library)
  • JSON, a very simple textual format with many libraries for it (e.g. JsonCpp, Jansson, Jaula, ...)
  • YAML, a more powerful textual format, with many libraries
  • or even XML, which is often used for serialization purposes

(For serialization of scalars, htonl and its companion routines should help.)
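
As a sketch of the htonl route: htonl converts a 32-bit value from host byte order to network (big-endian) byte order, and ntohl converts it back. The wrapper names toWire and fromWire are invented for this example, and the header selection assumes either a POSIX system or Windows with Winsock.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_WIN32)
#include <winsock2.h>   // htonl/ntohl on Windows (link with ws2_32)
#else
#include <arpa/inet.h>  // htonl/ntohl on POSIX systems
#endif

// Hypothetical wrapper: host byte order -> big-endian, before writing.
inline std::uint32_t toWire(std::uint32_t hostValue)
{
    return htonl(hostValue);
}

// Hypothetical wrapper: big-endian -> host byte order, after reading.
inline std::uint32_t fromWire(std::uint32_t wireValue)
{
    return ntohl(wireValue);
}
```

The value returned by toWire is what would be passed to a raw write, and fromWire would be applied right after the matching raw read.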

Basile Starynkevitch
  • Thanks for the suggestions. I'd not heard of these ideas, except for XML. Thanks for opening my eyes. I'm still going to try the way I've been doing it first though, because in my eyes it should be working. Will update when I've fixed it :-) – Ben J Nov 16 '11 at 20:51