A rather simple question about safe ways to write fixed-size structs (containing uint8_t, uint32_t, etc.) to a binary file so that the file stays readable.

  1. Is it acceptable to use #pragma pack (similar to BITMAPFILEHEADER) and then write the entire struct to the file? So far this has worked with bitmaps. Or should I rather use a simple serialization to single bytes (as shown here: Serialization of struct)?

  2. What about endianness? How should one prepare for a switch to a different one? Is it acceptable to force e.g. little endian only and require an application running on a big-endian (BE) machine to byte-swap each element?

My current project is a rather simple one, but I would like to expand it in the future, so I would rather avoid any pitfalls. I know that Boost offers serialization, but for now I would like to handle things manually.

johnyyonehand
  • One way is to do like Unicode and explicitly include endian information with the data. It's only 1 bit after all :) It makes it much easier on the decoding side as well as long as you understand that bit. – Michael Dorgan Jan 04 '17 at 01:01
  • Before I write out a long answer, let me ask this. Why wouldn't you want to just write out your data in XML or JSON with an appropriate library? Binary file formats are often harder to debug and have all the issues with endianness and packing you refer to. Text file formats, like XML and JSON, have none of these issues. – selbie Jan 04 '17 at 01:08
  • Hi selbie, we are currently doing a small project for uni that requires us to store condensed data as a binary file (hence the questions). The project basically boils down to storing data in a binary file and then transforming it into different files that could represent different formats (like PNG, BMP, etc.) – johnyyonehand Jan 04 '17 at 01:23
  • [TIFF files](https://en.wikipedia.org/wiki/TIFF#Byte_order) began with `"MM"` or `"II"` to indicate endian-ness. A file was typically written in the code's native endian while reading a file was obliged to read according to the indicated endian. There are many solutions to endian. – chux - Reinstate Monica Jan 04 '17 at 04:47
  • Some food for thought: [Type traits in C](https://github.com/alexfru/TypeTraitsInC). – Alexey Frunze Jan 04 '17 at 05:33

1 Answer

Is it accepted to use #pragma pack?

That's something frowned upon. Packing a struct can violate the alignment requirements of its members. On some architectures unaligned access is merely slower, on others it's outright forbidden. In the latter case the compiler is forced to generate code that reassembles your data members from bytes every time you access them.

Instead of struct packing, you should be writing custom serialization code. That is, design your classes like you normally would, with encapsulation and stuff, and then just provide serialization / deserialization methods.
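A minimal sketch of what such a serialization function might look like. The `Record` type, its fields, and the helper names `put_u32_le` and `serialize` are made up for illustration, and the on-disk layout (one byte followed by a little-endian 32-bit value) is an assumption, not something prescribed by the question.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical record, laid out in memory however the compiler likes.
struct Record {
    std::uint8_t  kind;
    std::uint32_t length;
};

// Assumed on-disk size: 1 byte + 4 bytes, with no padding in the file.
constexpr std::size_t kRecordSize = 5;

// Store a 32-bit value least significant byte first.
inline void put_u32_le(std::uint8_t* out, std::uint32_t v) {
    out[0] = static_cast<std::uint8_t>(v);
    out[1] = static_cast<std::uint8_t>(v >> 8);
    out[2] = static_cast<std::uint8_t>(v >> 16);
    out[3] = static_cast<std::uint8_t>(v >> 24);
}

// Copy the fields one by one into a byte buffer that can be written to a file.
inline std::array<std::uint8_t, kRecordSize> serialize(const Record& r) {
    std::array<std::uint8_t, kRecordSize> buf{};
    buf[0] = r.kind;
    put_u32_le(buf.data() + 1, r.length);
    return buf;
}
```

The in-memory struct keeps its natural alignment, and only the byte buffer has the fixed file layout, so no packing pragma is needed.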

What about endianness? How should one prepare for switch to different one? Is forcing eg little endian only and requiring application (in BE) to byteswap each element accepted?

That's perfectly acceptable and in fact widely used. The alternative of encoding the endianness in the data format itself is a bad idea IMHO, as it complicates your code for no good reason. When doing I/O, byte swapping is not going to be the performance bottleneck anyway.
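As a rough illustration of a fixed little-endian format, a reader could assemble each value from individual bytes with shifts. The helper name `get_u32_le` is hypothetical. Written this way, the same code should work on both little-endian and big-endian hosts, so no explicit byte-swap branch is needed.

```cpp
#include <cstdint>

// Assemble a 32-bit value from four bytes stored least significant byte first.
// The shifts operate on values, not on memory layout, so the result does not
// depend on the byte order of the machine running the code.
inline std::uint32_t get_u32_le(const std::uint8_t* in) {
    return  static_cast<std::uint32_t>(in[0])
         | (static_cast<std::uint32_t>(in[1]) << 8)
         | (static_cast<std::uint32_t>(in[2]) << 16)
         | (static_cast<std::uint32_t>(in[3]) << 24);
}
```

For example, the file bytes `44 33 22 11` decode to `0x11223344` regardless of the host.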

Joseph Artsimovich