13

I am writing an ELF analyzer, but I'm having some trouble converting endianness properly. I have functions to determine the endianness of the analyzer and the endianness of the object file.

Basically, there are four possible scenarios:

  1. A big endian compiled analyzer run on a big endian object file
    • nothing needs to be converted
  2. A big endian compiled analyzer run on a little endian object file
    • the byte order needs to be swapped, but ntohs/l() and htons/l() are both no-op macros on a big endian machine, so they won't swap the byte order. This is the problem
  3. A little endian compiled analyzer run on a big endian object file
    • the byte order needs to be swapped, so use htons() to swap the byte order
  4. A little endian compiled analyzer run on a little endian object file
    • nothing needs to be converted

Is there a function I can use to explicitly swap byte order/change endianness, since ntohs/l() and htons/l() take the host's endianness into account and sometimes don't convert? Or do I need to find/write my own swap byte order function?

xdumaine

5 Answers

14

I think it's worth raising The Byte Order Fallacy article here, by Rob Pike (one of Go's authors).

If you do things right -- i.e. you do not assume anything about your platform's byte order -- then it will just work. All you need to care about is whether the ELF file is in little-endian or big-endian format.

From the article:

Let's say your data stream has a little-endian-encoded 32-bit integer. Here's how to extract it (assuming unsigned bytes):

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);

If it's big-endian, here's how to extract it:

i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | (data[0]<<24);

And just let the compiler worry about optimizing the heck out of it.
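Applied to an ELF analyzer, the idea could be sketched in C as follows. This is a minimal sketch, not code from the article: the function names `read_le32`, `read_be32`, and `read_elf32` are hypothetical, and the casts to `uint32_t` are added so the shifts are well defined for bytes of 0x80 and above. The `EI_DATA`, `ELFDATA2LSB`, and `ELFDATA2MSB` values follow the ELF specification.

```c
#include <stdint.h>

/* ELF identification constants (values from the ELF specification). */
enum { EI_DATA = 5, ELFDATA2LSB = 1, ELFDATA2MSB = 2 };

/* Decode a 32-bit little-endian value; works on any host byte order. */
static uint32_t read_le32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 0)  | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode a 32-bit big-endian value; works on any host byte order. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[3] << 0)  | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[1] << 16) | ((uint32_t)p[0] << 24);
}

/* Hypothetical helper: choose the decoder from the file's EI_DATA byte.
   Anything other than ELFDATA2MSB is treated as little-endian here. */
static uint32_t read_elf32(const unsigned char *e_ident, const unsigned char *p)
{
    return e_ident[EI_DATA] == ELFDATA2MSB ? read_be32(p) : read_le32(p);
}
```

Because the host's byte order never appears in the code, the same source handles all four scenarios from the question.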

Matthieu M.
8

On Linux there are several conversion functions in endian.h that convert between the host's byte order and an explicitly named endianness (big or little):

uint16_t htobe16(uint16_t host_16bits);
uint16_t htole16(uint16_t host_16bits);
uint16_t be16toh(uint16_t big_endian_16bits);
uint16_t le16toh(uint16_t little_endian_16bits);

uint32_t htobe32(uint32_t host_32bits);
uint32_t htole32(uint32_t host_32bits);
uint32_t be32toh(uint32_t big_endian_32bits);
uint32_t le32toh(uint32_t little_endian_32bits);

uint64_t htobe64(uint64_t host_64bits);
uint64_t htole64(uint64_t host_64bits);
uint64_t be64toh(uint64_t big_endian_64bits);
uint64_t le64toh(uint64_t little_endian_64bits);
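For the analyzer, these could be used roughly as in the sketch below. It assumes a glibc-style `<endian.h>` (it is not part of standard C, and other platforms may place these functions elsewhere or not provide them). The helper name `load_u32` and the `file_is_big_endian` flag are hypothetical; the flag is assumed to come from the ELF header.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* ask glibc to expose the <endian.h> functions */
#endif
#include <endian.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a 32-bit field from raw file bytes and
   convert it from the file's byte order to the host's byte order. */
static uint32_t load_u32(const unsigned char *p, int file_is_big_endian)
{
    uint32_t raw;
    memcpy(&raw, p, sizeof raw);   /* copy instead of casting: no alignment issues */
    return file_is_big_endian ? be32toh(raw) : le32toh(raw);
}
```

The function named for the file's byte order is always called; whether it swaps or does nothing depends on the host, which is exactly the case analysis from the question.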

Edited, less reliable solution. You can use a union to access the bytes in any order. It's quite convenient:

union {
    short number;
    unsigned char bytes[sizeof(short)];  /* byte-wise view of number */
} u;
Rafał Rawicki
5

Do I need to find/write my own swap byte order function?

Yes, you do. But to make it easy, I refer you to this question: How do I convert between big-endian and little-endian values in C++? It gives a list of compiler-specific byte order swap functions, as well as some implementations of byte order swap functions.

Community
David Heffernan
1

The ntoh functions can swap between more than just big and little endian. Some systems are also 'middle endian' where the bytes are scrambled up rather than just ordered one way or another.

Anyway, if all you care about are big and little endian, then all you need to know is whether the host's and the object file's endianness differ. You'll have your own function which unconditionally swaps byte order and you'll call it or not based on whether or not host_endianess()==objectfile_endianess().
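A minimal sketch of that approach in C, assuming only big and little endian hosts. All names here (`swap16`, `swap32`, `host_is_big_endian`, `fix32`) are hypothetical stand-ins, not functions from any library:

```c
#include <stdint.h>

/* Unconditionally reverse the bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Unconditionally reverse the bytes of a 32-bit value. */
static uint32_t swap32(uint32_t v)
{
    return (v << 24) | ((v << 8) & 0x00FF0000u) |
           ((v >> 8) & 0x0000FF00u) | (v >> 24);
}

/* Returns 1 if the first byte in memory is the most significant one. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    return *(const unsigned char *)&probe == 0x01;
}

/* Swap only when the file's byte order differs from the host's. */
static uint32_t fix32(uint32_t v, int file_is_big_endian)
{
    return host_is_big_endian() == (file_is_big_endian != 0) ? v : swap32(v);
}
```

Here `fix32` expects a value that was copied out of the file unchanged, for example with `memcpy` into a `uint32_t`.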

bames53
0

For a cross-platform solution that works on Windows or Linux, I would write something like:

#include <algorithm>

// dataSize is the number of bytes to convert.
char le[dataSize];// little-endian
char be[dataSize];// big-endian

// Fill contents in le here...
std::reverse_copy(le, le + dataSize, be);
Mohammed Safwat