
So I'm trying to write the value of a long as raw binary to a file with fwrite(), here's a test file:

#include <stdio.h>

int main(void) {
    FILE *stream = fopen("write", "wb");
    if (stream == NULL) {
        perror("fopen");
        return 1;
    }
    long number = 0xfff;
    fwrite(&number, sizeof(long), 1, stream);
    fclose(stream);
    return 0;
}

When opening the written file in a text editor, this is the output:

ff0f 0000 0000 0000 

while I was expecting something like 0000 0000 0000 0fff.

How do I write the desired result correctly as raw binary?

Mossmyr
  • Search for "marshalling"/"serialisation". There is enough to be found on here and the rest of the web about that. Use shifts/masking to serialise data, not just write the memory area. – too honest for this site Sep 13 '16 at 13:33
  • [endianness](https://en.wikipedia.org/wiki/Endianness)...dark magic – LPs Sep 13 '16 at 13:35
  • It probably has more to do with how hex dumps are displayed by whatever you are using to display them. – Ian Abbott Sep 13 '16 at 13:35
  • This is how little-endian works. Least-significant byte first. – tkausl Sep 13 '16 at 13:36
  • If you always want to write in a consistent way, you'll need to convert the byte order first using `htonl`, which always produces the same **output** byte order. After loading, convert back with `ntohl` to get your machine's native byte order. – Mgetz Sep 13 '16 at 13:47
  • Welcome to the Big Endian camp! :) – Lundin Sep 13 '16 at 13:57

3 Answers


Read about endianness, which describes how bytes are arranged in a word (or double/quad word, et cetera) in a computer system.

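I'm assuming you've coded and compiled this example on an x86 system, which is little-endian, so the least significant byte comes first. That is why 0xfff, stored as an 8-byte long, shows up in the file as ff 0f 00 00 00 00 00 00. The opposite arrangement, with the most significant byte first, is called big-endian.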

Now, it is clear that your objective in this exercise is to marshal (or pickle, depending on how you prefer your jargon) some bytes to be retrieved later, possibly by another program.

If you develop a program that uses fread() to read the data back in the same way (using sizeof(long) so you don't read too much data) on a machine with the same endianness, it will just work, and you will get back the number you expect. But if you compile and run the "read" tool on a machine with the opposite endianness, reading the same input file, your number will be garbled.
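To make the same-machine case concrete, here is a minimal sketch of such a write/read pair. The helper names write_native_long() and read_native_long() are made up for this example, and the file is only portable between machines that share the same endianness and the same sizeof(long).

```c
#include <stdio.h>

/* Hypothetical helper: writes `value` in the machine's native byte order.
 * Returns 0 on success, -1 on failure. */
int write_native_long(const char *path, long value)
{
    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;
    size_t written = fwrite(&value, sizeof value, 1, f);
    if (fclose(f) != 0 || written != 1)
        return -1;
    return 0;
}

/* Hypothetical helper: reads back a long written by write_native_long().
 * Returns 0 on success, -1 on failure. */
int read_native_long(const char *path, long *out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    size_t got = fread(out, sizeof *out, 1, f);
    fclose(f);
    return got == 1 ? 0 : -1;
}
```

Calling write_native_long("write", 0xfff) and then read_native_long("write", &n) on the same machine should leave n equal to 0xfff, whatever the bytes look like in a hex viewer.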

If your objective is to marshal data, you would be better off with a tool that helps you marshal your bytes in an endianness-agnostic way, that is, a library that always puts the data in a well-defined order. There are libraries out there that take care of that for you.
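If you would rather not pull in a library, the shift-and-mask approach suggested in the comments can be sketched by hand. This is only an illustration: the function names are invented here, and it assumes you are happy to store the value as a fixed 64-bit quantity (a long is not 64 bits on every platform, so cast it to uint64_t before encoding).

```c
#include <stdint.h>
#include <stdio.h>

/* Stores `value` into buf[0..7], most significant byte first (big-endian),
 * regardless of the byte order of the machine running the code. */
void encode_u64_be(uint64_t value, unsigned char buf[8])
{
    for (int i = 0; i < 8; i++)
        buf[i] = (unsigned char)((value >> (56 - 8 * i)) & 0xff);
}

/* Rebuilds the value from 8 big-endian bytes. */
uint64_t decode_u64_be(const unsigned char buf[8])
{
    uint64_t value = 0;
    for (int i = 0; i < 8; i++)
        value = (value << 8) | buf[i];
    return value;
}

/* Writes `value` to an already opened binary stream as 8 big-endian bytes.
 * Returns 0 on success, -1 on a short write. */
int write_u64_be(FILE *stream, uint64_t value)
{
    unsigned char buf[8];
    encode_u64_be(value, buf);
    return fwrite(buf, 1, sizeof buf, stream) == sizeof buf ? 0 : -1;
}
```

With this, write_u64_be(stream, 0xfff) puts the bytes 00 00 00 00 00 00 0f ff into the file, which is the layout the question was expecting, and decode_u64_be() recovers the value on any machine.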

Cadu

There's no problem. You're seeing ff0f 0000 0000 0000 because of the endianness of the machine. Try reading the value back with fread() and you'll get the number you wrote.

Willy

As others have pointed out in the comments, this is an endianness "issue". That is, it is only an issue if you are going to run your software on systems with a different endianness.

This very useful resource is the first result for the "c endian" Google search, at least for me. I hope it helps you.


Edit: I will dig a bit more into the details below.

To determine the endianness of the machine you are currently running on, write a known value (for example, 0xAABBCCDD) into memory, then take a pointer to that value and cast it to an unsigned char* or another 1-byte data type. Read the value back byte by byte and compare the ordering with what you wrote. If the bytes come back in the same order (AA, BB, CC, DD), you are on a big-endian machine. If not, you could be on a little-endian (more probable) or middle-endian (less probable) machine. In either case you can generate a "swap map" for reordering the bytes, which is why I chose four different byte values in the example above.
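The check described above could look roughly like this. The enum and function names are just placeholders for this sketch, and memcpy() is used in place of the pointer cast only to copy the bytes out; the idea is the same.

```c
#include <stdint.h>
#include <string.h>

/* Placeholder names for this sketch. */
enum byte_order { ORDER_BIG, ORDER_LITTLE, ORDER_OTHER };

enum byte_order detect_byte_order(void)
{
    const uint32_t probe = 0xAABBCCDDu;
    unsigned char bytes[sizeof probe];

    /* Copy the in-memory representation so it can be inspected byte by byte. */
    memcpy(bytes, &probe, sizeof probe);

    if (bytes[0] == 0xAA && bytes[1] == 0xBB &&
        bytes[2] == 0xCC && bytes[3] == 0xDD)
        return ORDER_BIG;     /* same order as written: big-endian */
    if (bytes[0] == 0xDD && bytes[1] == 0xCC &&
        bytes[2] == 0xBB && bytes[3] == 0xAA)
        return ORDER_LITTLE;  /* fully reversed: little-endian */
    return ORDER_OTHER;       /* some mixed ("middle-endian") layout */
}
```

On the x86 machine from the question this should return ORDER_LITTLE.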

But how do you swap those values? An easy way, as indicated here, is to use bit shifts with masks. You can also use tables and swap their values around.
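As a rough illustration of the shift-and-mask variant, a full byte reversal of a 32-bit value can be written as below. swap_u32() is a name picked for this sketch, not a standard function.

```c
#include <stdint.h>

/* Reverses the byte order of a 32-bit value using masks and shifts. */
uint32_t swap_u32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |  /* lowest byte moves to the top  */
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);   /* highest byte moves to the bottom */
}
```

Applied to the probe value from above, swap_u32(0xAABBCCDD) gives 0xDDCCBBAA, and applying it twice returns the original value.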

MayeulC