2

Say I have a binary file containing positive integers, each written as a 32-bit big-endian value.

How do I read this file? This is what I have right now:

int main() {
    FILE * fp;
    char buffer[4];
    int num = 0;
    fp=fopen("file.bin","rb");
    while ( fread(&buffer, 1, 4,fp) != 0) {

        // I think buffer should be 32 bit integer I read,
        // how can I let num equal to 32 bit big endian integer?
    }
    return 0;
}
William Seemann
  • 3
    Just curious - why is your code identical to this one: http://stackoverflow.com/questions/13001183/how-to-read-little-endian-integers-from-file-in-c ? ;-) – Artur Nov 03 '13 at 09:30
  • 1
    Avoid `fopen` and use proper RAII-aware classes such as `std::fstream`. Your code is potentially exception-unsafe, and you don’t call `fclose` so you’re leaking resources. –  Nov 03 '13 at 10:49

3 Answers

8

Declare your buffer as:

unsigned char buffer[4];

and you can use this to convert the endianness:

int num = (int)buffer[0] | (int)buffer[1]<<8 | (int)buffer[2]<<16 | (int)buffer[3]<<24;

BTW

Of course, this applies to x86 architectures, which are little endian; otherwise your platform's endianness may match your file's endianness, so no conversion is needed. In that case you could read directly into your int without conversions.

Artur
  • 6
    This answer is for little endian. For big endian, use this. `int num = (int)buffer[3] | (int)buffer[2]<<8 | (int)buffer[1]<<16 | (int)buffer[0]<<24;` – takasoft Jan 31 '18 at 02:07
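A minimal sketch of the big-endian expression from the comment above, wrapped so it can be used in the question's read loop. It assumes the file holds whole 4-byte records; `be32_to_host` and `read_be32` are hypothetical helper names, not part of any library. Because the value is built with shifts rather than by reinterpreting memory, the result does not depend on the host's byte order.

```c
#include <stdint.h>
#include <stdio.h>

/* Assemble a 32-bit value from four bytes stored most significant first.
   The casts to uint32_t keep the shifts unsigned, so a first byte >= 0x80
   cannot overflow a signed int. */
static uint32_t be32_to_host(const unsigned char b[4])
{
    return ((uint32_t)b[0] << 24) |
           ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |
            (uint32_t)b[3];
}

/* Read one big-endian 32-bit integer from fp (hypothetical helper).
   Returns 1 on success, 0 on end of file or a short read. */
static int read_be32(FILE *fp, uint32_t *out)
{
    unsigned char buffer[4];
    if (fread(buffer, 1, sizeof buffer, fp) != sizeof buffer)
        return 0;
    *out = be32_to_host(buffer);
    return 1;
}
```

With these helpers, the question's loop could become `while (read_be32(fp, &num)) { ... }`, with `num` declared as `uint32_t`.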
6

You need to find out your endianness first:

How can I find Endian-ness of my PC programmatically using C?
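One common way to do such a check at run time is to look at the byte stored at the lowest address of a known value. This is only a sketch: `host_is_big_endian` is a hypothetical name, and it assumes the host is either plain big endian or plain little endian (no mixed byte orders).

```c
#include <stdint.h>
#include <string.h>

/* Returns 1 if the host stores the most significant byte first, 0 otherwise. */
static int host_is_big_endian(void)
{
    const uint32_t probe = 0x01020304u;
    unsigned char first;
    memcpy(&first, &probe, 1);   /* copy the byte at the lowest address */
    return first == 0x01;
}
```

The result does not change while the program runs, so it can be computed once before the read loop rather than on every iteration.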

Then act accordingly. If your machine's endianness matches the file's, you can read the value as is; if it differs, you need to reorder the bytes:

union Num 
{
    char buffer[4];
    int num;
} num ;

void swapChars(char* pChar1, char* pChar2)
{
    char temp = *pChar1;
    *pChar1 = *pChar2;
    *pChar2 = temp;
}

int swapOrder(Num num)
{
    swapChars( &num.buffer[0], &num.buffer[3]);
    swapChars( &num.buffer[1], &num.buffer[2]);

    return num.num; 
}

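while ( fread(num.buffer, 1, 4, fp) == 4)   // stop on EOF or a short read
{
    int convertedNum;
    // amIBigEndian is assumed to hold the result of the endianness check linked above
    if (1 == amIBigEndian) 
    {
        convertedNum = num.num;
    } 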
    else
    {
        convertedNum = swapOrder(num);
    }
    // Do what ever you want with convertedNum here...
}
Community
selalerer
  • 2
    That's so slow. Swap in one expression, and do not check endianness on every loop iteration. In most cases endianness is known even before the program is started. – Artur Nov 03 '13 at 09:35
  • 3
    @Artur I have to agree! Why not simply [`ntohl()`](http://man7.org/linux/man-pages/man3/htonl.3.html) if we **know** it's a big endian format. This is often implemented as a small `asm {}` snippet optimized for the MCU. – πάντα ῥεῖ Nov 03 '13 at 10:06
  • For the actual solution I'm writing, I'll probably use something the OS or my libraries provide (e.g. ACE_SWAP()). For a generic question, this is a generic answer. Regarding performance, I wouldn't write off a solution without benchmarking it. – selalerer Nov 03 '13 at 10:18
3

It is operating system and processor architecture specific.

You could use routines such as htonl(3) or ntohl,

but you really should have serialized the data in a well-defined format.
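For the `ntohl` route, a minimal sketch might look like the following. It assumes a POSIX system, where `ntohl` is declared in `<arpa/inet.h>`; `be32_via_ntohl` is a hypothetical helper name. The four bytes are copied into a `uint32_t` exactly as they appear in the file, and `ntohl` then converts from network (big-endian) order to whatever the host uses.

```c
#include <arpa/inet.h>  /* ntohl (POSIX) */
#include <stdint.h>
#include <string.h>

/* Convert four bytes in big-endian (network) order to a host-order value. */
static uint32_t be32_via_ntohl(const unsigned char bytes[4])
{
    uint32_t raw;
    memcpy(&raw, bytes, sizeof raw);  /* raw holds the bytes in file order */
    return ntohl(raw);                /* network order to host order */
}
```

On a big-endian host `ntohl` leaves the value unchanged, so the same code works on either kind of machine without an explicit endianness check.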

On current machines (where I/O is very slow relative to CPU speed), I favor textual serialization formats such as JSON or YAML. You could also use binary serialization formats and libraries such as BSON, XDR, ASN.1, or the s11n library.

If possible, improve the producer code (the one writing your file.bin file), and the consumer code accordingly.

Binary data is inherently brittle, because it is system and architecture specific. At the very least, document its format extremely well, and preferably provide tools to convert it to and from textual formats.

There are several JSON libraries for C++, such as jsoncpp and rapidjson, and for C, such as jansson.

Basile Starynkevitch