
I have simple C and Java code that reads an integer from a binary file. When I print the result in hexadecimal, the byte order in the Java output differs from the byte order in the C output. Example:

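unsigned int MagicNumber = 0;
int fd = open("MyFile", O_RDONLY);
int rc = read(fd, &MagicNumber, 4);  /* raw copy of 4 bytes; the CPU interprets them in host byte order */
printf("MagicNumber is 0x%x\n", MagicNumber);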

This prints: MagicNumber is 0xff017ffe

The Java code is:

FileInputStream fin = new FileInputStream("MyFile");
DataInputStream din = new DataInputStream(fin);
int MagicNumber = din.readInt();
System.out.printf("MagicNumber is 0x%x\n", MagicNumber);

It prints: MagicNumber is 0xfe7f01ff

The output is similar, but the bytes are swapped. Am I doing file I/O incorrectly? Do I need to do something to read bytes consistently between C and Java?

JB_User

2 Answers


The byte order in a DataInputStream is "big endian" (aka "network order"). This is used by almost every binary network protocol and many file formats, even though little-endian architectures are now dominant.

In your C code you can use ntohl to portably convert from "network order" back into "host order" without needing to know whether the host order is little-endian or big-endian.

#include <arpa/inet.h>

...

printf("MagicNumber is 0x%x\n", ntohl(MagicNumber));
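
For readers more at home in Java, what ntohl does can be modelled in a few lines. This is only a sketch: `ntohlModel` is a hypothetical helper, not a JDK method, and Java code rarely needs it because the I/O classes already state their byte order explicitly.

```java
import java.nio.ByteOrder;

public class NtohlModel {
    // Hypothetical helper (not part of the JDK) that mimics C's ntohl:
    // on a little-endian host the four bytes are swapped,
    // on a big-endian host the value is returned unchanged.
    static int ntohlModel(int networkOrderValue) {
        return ByteOrder.nativeOrder() == ByteOrder.LITTLE_ENDIAN
                ? Integer.reverseBytes(networkOrderValue)
                : networkOrderValue;
    }

    public static void main(String[] args) {
        // The printed result depends on the byte order of the machine running this.
        System.out.println("native order: " + ByteOrder.nativeOrder());
        System.out.printf("ntohlModel(0xff017ffe) = 0x%x%n", ntohlModel(0xff017ffe));
    }
}
```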
Alnitak
  • Network order? I just want to read an integer from a file and not have to spend an hour digging through documentation. – JB_User May 09 '20 at 20:01
  • big endian, aka network order, is how 99%+ of all documentation is written, and what 99%+ of all network protocols use. Java guarantees that all operations are in a defined order (and that is network order (big endian), unless it explicitly says otherwise). In C, on the other hand, you get whatever order your architecture prefers. If you're using x86/x64, that'd be little endian. It's C that's the wonky one, not Java. Open your file in a hex editor. You will find that the first byte in it is 0xfe, the one Java reports first. – rzwitserloot May 09 '20 at 20:02
  • So this really is going to take an hour to figure out? Reading an integer from a file in Java. – JB_User May 09 '20 at 20:09
  • Nope, you just need the value of `ntohl(MagicNumber)` – Alnitak May 09 '20 at 20:10
  • p.s. please cut out the attitude! – Alnitak May 09 '20 at 20:15
  • Okay, sorry. I'm thankful for the help (but a bit frustrated). Where do I find ntohl in Java? – JB_User May 09 '20 at 20:19
  • you don't - stick to big-endian format as your default, and only use `ntohl` in your C code. Writing binary files _without_ normalising the endianness makes them non-portable. – Alnitak May 13 '20 at 19:34

DataInputStream reads multi-byte values using big-endian byte order: the most significant byte comes first and the least significant byte comes last.

As Wikipedia says:

Big-endianness is the dominant ordering in networking protocols (IP, TCP, UDP). Conversely, little-endianness is the dominant ordering for processor architectures (x86, most ARM implementations, base RISC-V implementations) and their associated memory. File formats can use either ordering; some formats use a mixture of both.

Since DataInputStream is intended for data exchange, it uses the "network order", aka big-endian.
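
To make the difference concrete, here is a minimal sketch that decodes the same four bytes both ways. The byte values in `SAMPLE` are an assumption inferred from the two outputs printed in the question (the file itself is not shown), and the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class EndianDemo {
    // Assumed first four bytes of "MyFile", inferred from the question's output.
    static final byte[] SAMPLE = {(byte) 0xfe, 0x7f, 0x01, (byte) 0xff};

    // Big-endian: the first byte becomes the most significant byte.
    // This is what DataInputStream.readInt() does.
    static int readBigEndian(byte[] bytes) {
        try (DataInputStream din = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return din.readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Little-endian: the first byte becomes the least significant byte.
    // This matches what the C code sees on an x86 machine.
    static int readLittleEndian(byte[] b) {
        return (b[0] & 0xff)
                | ((b[1] & 0xff) << 8)
                | ((b[2] & 0xff) << 16)
                | ((b[3] & 0xff) << 24);
    }

    public static void main(String[] args) {
        System.out.printf("big-endian:    0x%x%n", readBigEndian(SAMPLE));    // 0xfe7f01ff
        System.out.printf("little-endian: 0x%x%n", readLittleEndian(SAMPLE)); // 0xff017ffe
    }
}
```

Neither result is wrong; the two programs simply disagree about which end of the byte sequence is the most significant.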

To control the byte order, you should use ByteBuffer. Here is a simple example.

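import java.io.FileInputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// readNBytes(int) needs Java 11+; getInt() throws BufferUnderflowException if the file has fewer than 4 bytes
try (FileInputStream fin = new FileInputStream("MyFile")) {
    byte[] buffer = fin.readNBytes(Integer.BYTES);
    int magicNumber = ByteBuffer.wrap(buffer).order(ByteOrder.LITTLE_ENDIAN).getInt();
}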
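
If you would rather keep the DataInputStream code from the question, another option is to swap the bytes after reading with `Integer.reverseBytes`. The sketch below reads from an in-memory array so that it runs without a file. The sample bytes are an assumption inferred from the question's output, and `readLittleEndianInt` is a hypothetical helper name.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ReverseBytesDemo {
    // Reads four bytes big-endian via DataInputStream, then reverses them,
    // which yields the little-endian interpretation of the same bytes.
    static int readLittleEndianInt(InputStream in) {
        try {
            return Integer.reverseBytes(new DataInputStream(in).readInt());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed file contents; replace with new FileInputStream("MyFile") for a real file.
        byte[] sample = {(byte) 0xfe, 0x7f, 0x01, (byte) 0xff};
        int magicNumber = readLittleEndianInt(new ByteArrayInputStream(sample));
        System.out.printf("MagicNumber is 0x%x%n", magicNumber); // MagicNumber is 0xff017ffe
    }
}
```

This hard-codes the assumption that the file is little-endian, just as the ByteBuffer version does. It only matches the C output on hosts where the C program also runs little-endian.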
Andreas