
I have a structure with the following format:

#include <cstdint>

struct Serializable {
    uint64_t value1;
    uint32_t value2;
    uint16_t value3;
    uint8_t  value4;

    // Returns the raw data after converting the values to big-endian format
    // if the current architecture is little-endian. Otherwise, if the current
    // architecture is big-endian, the return expression will be
    // "return (char*) (this);"
    char* convert_all_to_bigendian();

    // Checks whether the architecture is little-endian or big-endian.
    // If it is little-endian, then after the contents of rawdata are copied
    // back into the structure, the integer fields are converted back to
    // little-endian format (serialized data is big-endian by default).
    char* get_and_restructure_serialized_data(char* rawdata);
    uint64_t size();


} __attribute__ ((__packed__)); 

The implementation of the size() member:

uint64_t Serializable::size() {
     return sizeof(uint64_t) + sizeof(uint32_t) +
            sizeof(uint16_t) + sizeof(uint8_t);
}
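A hedged sketch of how the size assumption could be checked at compile time rather than trusted. It assumes `#pragma pack(push, 1)`, which GCC, Clang, and Visual C++ all accept, in place of the GCC-specific `__attribute__ ((__packed__))`. The names `PackedFields` and `kWireSize` are hypothetical and not part of the code above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the data members of Serializable.
// #pragma pack is used here because GCC, Clang, and MSVC all understand it.
#pragma pack(push, 1)
struct PackedFields {
    std::uint64_t value1;
    std::uint32_t value2;
    std::uint16_t value3;
    std::uint8_t  value4;
};
#pragma pack(pop)

// Number of bytes the four fields should occupy in the file.
constexpr std::size_t kWireSize = sizeof(std::uint64_t) + sizeof(std::uint32_t) +
                                  sizeof(std::uint16_t) + sizeof(std::uint8_t);

// The build fails if the compiler inserted padding or the field sizes differ.
static_assert(sizeof(PackedFields) == kWireSize,
              "unexpected padding in PackedFields");
static_assert(kWireSize == 15, "unexpected field sizes");
```

If either assertion fails on a given compiler, the mismatch shows up at build time instead of as a corrupted file.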

If I write an object of the above structure to a file using std::fstream, as in the following code:

std::fstream fWrite ("dump.dat", std::ios_base::out | std::ios_base::binary);
// obj is an object of the structure Serializable.
fWrite.write (obj.convert_all_to_bigendian(), obj.size()); 

Will the contents written to the file dump.dat be cross-platform?

Assuming that I write a comparable class and structure to work with Visual C++, will the Windows-side application interpret the dump.dat file the same way the Linux side does?

If not, can you please explain what factors I should consider, other than padding and differences in endianness (which depends on the processor architecture), to make this cross-platform?

I understand that there are many serialization libraries out there, all well tested and used extensively. But I'm doing this purely for learning purposes.

Sreram
  • 1
    You need to show `convert_all_to_bigendian()`, and I am guessing you have an opposite `Serializable convert_all_from_bigendian(const char*)`? And how cross-platform do you need to be, or is it only Windows/Linux on x86/x64? The fixed-width integer types are not guaranteed to exist, see http://en.cppreference.com/w/cpp/types/integer – Fire Lancer Aug 29 '17 at 10:53
  • @FireLancer My exact implementation is not structured like this. What I had given was just an example to explain the problem. Yes, an opposite `convert_all_from_bigendian(const char*)` will be needed! – Sreram Aug 29 '17 at 10:57
  • Apart from needing a function to convert from big-endian to a native format, you also need to allow for the results of `sizeof()` being implementation-defined (not guaranteed to produce the same results for anything other than `char` types, and arrays of them) and for the possibility of machines that are neither big-endian nor little-endian (there are also cases of middle-endian and mixed-endian). Lastly, you are working with byte endianness; there is also bit endianness. Such considerations are relevant if you ever go beyond the Windows and Linux worlds. – Peter Aug 29 '17 at 11:01
  • @Peter, can `sizeof(int8_t)*CHAR_BIT` etc. ever not be 8 where `int8_t` etc. is defined at all? But yes, sizes change for many types if you need to be fully cross-platform. – Fire Lancer Aug 29 '17 at 11:04
  • How cross-platform do you need to be? [Here is one example](https://stackoverflow.com/a/6972551/597607) where `uint32_t` doesn't exist (for several reasons - no 32-bit integer and not using 2's complement). – Bo Persson Aug 29 '17 at 11:04
  • So, if you don't care about platforms that don't have those integer sizes, and if `convert_all_to_bigendian` and `convert_all_from_bigendian` are correct, then yes, writing chars/bytes to a file and reading them back will work. But those are some big ifs, which makes it hard to really answer, and I also don't think you were really just asking whether reading/writing `char` on common systems allows file transfers. – Fire Lancer Aug 29 '17 at 11:05
  • @Peter `(there are also cases of middle-endian, mixed-endian)` I never thought there were so many formats... So to sum it up, there are two problems to solve: endianness, and padding (with too many sub problems). Am I right? – Sreram Aug 29 '17 at 11:07
  • I know the exact set of binary values I would like to write to a file. But the problem is, I want to write them the exact way I see them (which makes it cross-platform). Is there a simple way to solve this problem without using libraries other than the standard ones? – Sreram Aug 29 '17 at 11:11
  • 1
    I don't think so, really, unless you specify particular platforms. E.g. just supporting MSVC/Windows and GCC/Linux on x86 and x64 is easy (especially if you just use little-endian files). As you add more possible platforms it gets harder. – Fire Lancer Aug 29 '17 at 11:16
  • @Fire Lancer - there are implementations which have `CHAR_BIT` not equal to `8`. Unusual, yes. Non-existent, no. – Peter Aug 29 '17 at 11:18
  • @Peter, yes, but can they also actually have `int8_t` etc.? I was under the impression such implementations could not have such a type at all (and `int_fast8_t`, `int_least8_t` etc. would then be some type larger than 8 bits). So `sizeof(int8_t)*CHAR_BIT == 8` is either always true, or a compile error (`int8_t` undefined). – Fire Lancer Aug 29 '17 at 11:20
  • 1
    @Sreram - there are a number of sub-problems. They can be worked through - the number might be more than you expect, but is finite. You may also wish to impose constraints (e.g. on platform, etc) to reduce the problem space (albeit, that means your code will need to be modified if the code is ever ported to a platform which doesn't fit your constraints). – Peter Aug 29 '17 at 11:21
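The constraints discussed in these comments can be stated in code. The following is a rough sketch, not a definitive implementation: it rejects platforms where `CHAR_BIT` is not 8 at compile time, and decodes a 15-byte big-endian buffer back into native values using shifts, which is the role the comments give to `convert_all_from_bigendian`. The names `DecodedRecord`, `get_be`, and `from_big_endian` are hypothetical.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Constraint made explicit: this sketch only targets platforms with 8-bit bytes.
// The std::uintN_t types used below also fail to compile where they do not exist.
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

// Hypothetical struct holding the decoded, native-order values.
struct DecodedRecord {
    std::uint64_t value1;
    std::uint32_t value2;
    std::uint16_t value3;
    std::uint8_t  value4;
};

// Reads `width` bytes, most significant byte first, into a native integer.
// Works identically on little-endian, big-endian, and mixed-endian hosts.
inline std::uint64_t get_be(const unsigned char* in, std::size_t width) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < width; ++i) {
        v = (v << 8) | in[i];
    }
    return v;
}

// Decodes a buffer of at least 15 bytes laid out as 8 + 4 + 2 + 1 bytes.
inline DecodedRecord from_big_endian(const unsigned char* in) {
    DecodedRecord r{};
    r.value1 = get_be(in, 8);
    r.value2 = static_cast<std::uint32_t>(get_be(in + 8, 4));
    r.value3 = static_cast<std::uint16_t>(get_be(in + 12, 2));
    r.value4 = static_cast<std::uint8_t>(get_be(in + 14, 1));
    return r;
}
```

Because the decoder never reinterprets memory as an integer, the reader needs no endianness detection; the only assumptions left are the ones the `static_assert` and the fixed-width types enforce.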

0 Answers