1

I have simple file save/load functionality, but because it is a plugin, the host API requires everything to be written to a std::ostream in binary format and read back from a std::istream.

I use

out.write((char *)&value,sizeof(type));
in.read((char *)&value,sizeof(type));

for reading and writing, where type is "unsigned int", "double", etc.

I was thinking about the possible consequences of this: what happens when a file is saved on one platform and loaded on another? (Due to host limitations, this will be 32/64-bit Windows, 64-bit Linux and 64-bit Mac, x86 CPUs only.) If I do not use a variable-size type like size_t (which differs between 32-bit and 64-bit systems), can I be certain that "unsigned int" or "double" will stay the same length? Are there any best practices to handle this?

uiron
  • There's an interesting discussion on [encodings and portability](http://stackoverflow.com/questions/6300804/wchars-encodings-standards-and-portability) that might interest you. It's about strings, not numbers, but still very interesting (and related). – André Caron Dec 07 '11 at 05:45

2 Answers

1

If I do not use a variable-size type like size_t (which differs between 32-bit and 64-bit systems), can I be certain that "unsigned int" or "double" will stay the same length?

No. Even the size of unsigned int and double could vary across platforms.


Are there any best practices to handle this?

Yes. Serialize the data.

For example, you could follow these steps:

  • Write the size of the variable first, as a single byte.
  • Then split the variable into N bytes, where N = sizeof(value), and write each byte one by one, either from the least significant byte to the most significant byte or vice versa.
  • On the other machine, read the size first, then read the bytes one by one and merge them to get the value. Merging means reversing the process described in step 2.
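The three steps above could look roughly like this for unsigned integers. This is only a sketch: the helper names write_uint, read_uint and round_trip are made up for illustration, the byte order is fixed as least significant byte first, and 8-bit bytes are assumed.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: writes one size byte, then the value's bytes from
// the least significant to the most significant.
inline void write_uint(std::ostream& out, unsigned long long value, unsigned char size) {
    assert(size >= 1 && size <= 8);  // only 1..8 byte integers are handled here
    out.write(reinterpret_cast<const char*>(&size), 1);
    for (int i = 0; i < size; ++i) {
        const unsigned char b = static_cast<unsigned char>((value >> (8 * i)) & 0xFF);
        out.write(reinterpret_cast<const char*>(&b), 1);
    }
}

// Hypothetical helper: reads the size byte, then merges the bytes back in the
// same order. Returns false if the stream ends early or the size is unsupported.
inline bool read_uint(std::istream& in, unsigned long long& value) {
    const int size = in.get();
    if (!in || size < 1 || size > 8) return false;
    unsigned long long result = 0;
    for (int i = 0; i < size; ++i) {
        const int byte = in.get();
        if (!in) return false;
        result |= static_cast<unsigned long long>(byte & 0xFF) << (8 * i);
    }
    value = result;
    return true;
}

// Usage sketch: round-trip an unsigned int through an in-memory stream.
inline bool round_trip(unsigned int v) {
    std::stringstream buf;
    write_uint(buf, v, static_cast<unsigned char>(sizeof v));
    unsigned long long back = 0;
    return read_uint(buf, back) && back == v;
}
```

Because the bytes are produced with shifts rather than by copying memory, the file layout does not depend on the endianness of the machine that writes or reads it.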

If you're writing lots of values, you may want to improve on the steps above. First and foremost, you would not want to write the size for each value, since that is pure repetition. Instead, you can write a header of sorts that contains all the information that would otherwise be repeated.
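As an illustration of the header idea, here is a sketch for an array of doubles. The layout (1 byte element size, 4 byte little-endian count, then the elements) and the names put_le, get_le, write_doubles and read_doubles are invented for this example. It also assumes that double is a 64-bit IEEE 754 type whose byte order matches that of the integer types, which holds on x86 but is not guaranteed in general.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical helper: writes the low n bytes of v, least significant first.
inline void put_le(std::ostream& out, unsigned long long v, int n) {
    for (int i = 0; i < n; ++i) {
        const unsigned char b = static_cast<unsigned char>((v >> (8 * i)) & 0xFF);
        out.write(reinterpret_cast<const char*>(&b), 1);
    }
}

// Hypothetical helper: reads n bytes, least significant first. False on short read.
inline bool get_le(std::istream& in, unsigned long long& v, int n) {
    unsigned long long result = 0;
    for (int i = 0; i < n; ++i) {
        const int byte = in.get();
        if (!in) return false;
        result |= static_cast<unsigned long long>(byte & 0xFF) << (8 * i);
    }
    v = result;
    return true;
}

// Header is written once: [element size: 1 byte][count: 4 bytes], then the data.
inline void write_doubles(std::ostream& out, const std::vector<double>& values) {
    // Assumption: 64-bit double and 64-bit unsigned long long.
    assert(sizeof(double) == 8 && sizeof(unsigned long long) == 8);
    assert(values.size() <= 0xFFFFFFFFull);  // count must fit in 4 bytes
    put_le(out, 8, 1);
    put_le(out, values.size(), 4);
    for (std::size_t i = 0; i < values.size(); ++i) {
        unsigned long long bits = 0;
        std::memcpy(&bits, &values[i], 8);  // reinterpret the double's bit pattern
        put_le(out, bits, 8);
    }
}

inline bool read_doubles(std::istream& in, std::vector<double>& values) {
    unsigned long long size = 0, count = 0;
    if (!get_le(in, size, 1) || size != 8) return false;  // reject unknown element size
    if (!get_le(in, count, 4)) return false;
    values.clear();
    for (unsigned long long i = 0; i < count; ++i) {
        unsigned long long bits = 0;
        if (!get_le(in, bits, 8)) return false;
        double d = 0.0;
        std::memcpy(&d, &bits, 8);
        values.push_back(d);
    }
    return true;
}
```

The reader checks the element size from the header, so a file written with a different double width is rejected instead of being silently misread.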

Nawaz
  • Moreover, strictly speaking, the size of the data type is not the only issue. There is also endianness and the actual integer and floating-point representation (not all integers are two's complement, not every `double` is IEEE 754), etc. – André Caron Dec 07 '11 at 05:31
  • Well, "read bytes one by one, then merge them into one value" might be a real pain when my data is large chunks of double[] arrays, not to mention the hell of recreating a double from its binary representation. – uiron Dec 07 '11 at 05:50
  • @uiron: That is the sort of thing programmers do in serialization. And it isn't a pain; it is real fun playing with bytes, and it makes you feel cool that you know the very low-level details of how the software works. As for doing it elegantly, you can write small utilities to split and merge the bytes. – Nawaz Dec 07 '11 at 05:52
0

On all the systems you said this will run on, the sized types such as uint32_t will be the same. double and float will also be the same.
The best practice is still what Nawaz's answer describes.
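If you rely on the sizes being the same, one option is to make the build fail when that stops being true. The following is a sketch that assumes a C++11 compiler for static_assert (older compilers such as VC 2008 would need something like BOOST_STATIC_ASSERT instead); is_little_endian is a hypothetical helper name.

```cpp
#include <climits>
#include <limits>

// Compile-time checks of the assumptions the raw write()/read() approach relies on.
static_assert(CHAR_BIT == 8, "bytes must be 8 bits");
static_assert(sizeof(unsigned int) == 4, "unsigned int must be 32 bits");
static_assert(sizeof(double) == 8, "double must be 64 bits");
static_assert(std::numeric_limits<double>::is_iec559, "double must be IEEE 754");

// Hypothetical runtime check: true if the lowest-addressed byte of an integer
// holds its least significant bits, as on x86.
inline bool is_little_endian() {
    const unsigned int one = 1;
    return *reinterpret_cast<const unsigned char*>(&one) == 1;
}
```

These checks do not make the format portable. They only turn a silent mismatch into an error on a platform where the assumptions do not hold.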

Daniel
  • Sadly, uint32_t is not available to me, as the plugin on Windows has to be compiled with VC 2008, but that put me on the trail to boost_integer, thanks. – uiron Dec 07 '11 at 06:01