Endianness swap without ntohs

Question

I am writing an ELF analyzer, but I'm having some trouble converting endianness properly. I have functions to determine the endianness of the analyzer and the endianness of the object file.

Basically, there are four possible scenarios:

A big-endian compiled analyzer run on a big-endian object file
- nothing needs to be converted
A big-endian compiled analyzer run on a little-endian object file
- the byte order needs to be swapped, but ntohs/l() and htons/l() are both no-op macros on a big-endian machine, so they won't swap the byte order. This is the problem.
A little-endian compiled analyzer run on a big-endian object file
- the byte order needs to be swapped, so use htons() to swap the byte order
A little-endian compiled analyzer run on a little-endian object file
- nothing needs to be converted

Is there a function I can use to explicitly swap byte order/change endianness, since ntohs/l() and htons/l() take the host's endianness into account and sometimes don't convert? Or do I need to find/write my own swap byte order function?

score 14 · Answer 1 · answered Apr 27 '12 at 06:49

I think it's worth raising The Byte Order Fallacy article here, by Rob Pike (one of Go's authors).

If you do things right, i.e. you do not assume anything about your platform's byte order, then it will just work. All you need to care about is whether the ELF file is in little-endian or big-endian format.

From the article:

Let's say your data stream has a little-endian-encoded 32-bit integer. Here's how to extract it (assuming unsigned bytes):

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);

If it's big-endian, here's how to extract it:

i = (data[3]<<0) | (data[2]<<8) | (data[1]<<16) | (data[0]<<24);

And just let the compiler worry about optimizing the heck out of it.
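Applied to an ELF reader, the idea might look like the sketch below. The names read_u16 and read_u32 and the `little` flag are made up for illustration; in a real ELF file the flag would come from e_ident[EI_DATA]. Each byte is widened to an unsigned 32-bit value before shifting, so the shift by 24 cannot overflow a signed int.

```cpp
#include <cstdint>

// Hypothetical helpers: decode an integer from raw file bytes.
// 'little' describes how the FILE is encoded; the host's byte order
// never enters the picture.
inline std::uint16_t read_u16(const unsigned char* p, bool little)
{
    return little
        ? static_cast<std::uint16_t>(p[0] | (p[1] << 8))
        : static_cast<std::uint16_t>(p[1] | (p[0] << 8));
}

inline std::uint32_t read_u32(const unsigned char* p, bool little)
{
    // Widen first so that the shifts are done on unsigned 32-bit values.
    const std::uint32_t b0 = p[0], b1 = p[1], b2 = p[2], b3 = p[3];
    return little
        ? (b0 | (b1 << 8) | (b2 << 16) | (b3 << 24))
        : (b3 | (b2 << 8) | (b1 << 16) | (b0 << 24));
}
```

The same two functions work unchanged on big-endian and little-endian hosts, which is why none of the four scenarios in the question needs special handling.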

AFAIK compilers will only use an optimised byte-order swap if you start with a word in the first place. — Randy the Dev, Feb 08 '17 at 14:12
@AndrewDunn: Quite possibly, but as usual, measure twice, optimize once. — Matthieu M., Feb 08 '17 at 14:23

Rafał Rawicki · Answer 2 · 2012-04-26T21:21:19.053

8

On Linux, endian.h provides several conversion functions that convert between the host's byte order and an explicitly named big- or little-endian order:

uint16_t htobe16(uint16_t host_16bits);
uint16_t htole16(uint16_t host_16bits);
uint16_t be16toh(uint16_t big_endian_16bits);
uint16_t le16toh(uint16_t little_endian_16bits);

uint32_t htobe32(uint32_t host_32bits);
uint32_t htole32(uint32_t host_32bits);
uint32_t be32toh(uint32_t big_endian_32bits);
uint32_t le32toh(uint32_t little_endian_32bits);

uint64_t htobe64(uint64_t host_64bits);
uint64_t htole64(uint64_t host_64bits);
uint64_t be64toh(uint64_t big_endian_64bits);
uint64_t le64toh(uint64_t little_endian_64bits);
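A minimal sketch of how these could be used for the case in the question, assuming a system that ships endian.h (glibc does; other Unix systems may not). The helper name load_u32 and its flag are hypothetical. Unlike ntohl(), le32toh() does swap on a big-endian host.

```cpp
#include <endian.h>   // glibc header; not available on every Unix
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit field whose on-disk byte order is known.
inline std::uint32_t load_u32(const unsigned char* p, bool file_is_big_endian)
{
    std::uint32_t raw;
    std::memcpy(&raw, p, sizeof raw);          // raw still has the file's byte order
    return file_is_big_endian ? be32toh(raw)   // no-op on a big-endian host
                              : le32toh(raw);  // no-op on a little-endian host
}
```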

Edited: a less reliable solution. You can use a union to access the bytes in any order. It's quite convenient:

union {
    short number;
    unsigned char bytes[sizeof(short)]; // same storage as number, viewed byte by byte
};
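Reading a union member other than the one last written is undefined behavior in C++, so a copy through std::memcpy is a safer way to get at the bytes. The sketch below is one possible version; the function name swap16_via_bytes is invented for illustration.

```cpp
#include <array>
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical helper: swap the two bytes of a 16-bit value by copying it
// into a byte array instead of reading an inactive union member.
inline std::uint16_t swap16_via_bytes(std::uint16_t number)
{
    std::array<unsigned char, sizeof number> bytes;
    std::memcpy(bytes.data(), &number, sizeof number);  // value -> bytes
    std::swap(bytes[0], bytes[1]);                      // reverse the byte order
    std::memcpy(&number, bytes.data(), sizeof number);  // bytes -> value
    return number;
}
```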

edited Apr 26 '12 at 21:21

answered Apr 26 '12 at 21:12

Rafał Rawicki

Technically undefined behavior in C++ though. – bames53 Apr 26 '12 at 21:12
But how do we know the proper order? – Bo Persson Apr 26 '12 at 21:13
@BoPersson OP knows, when he wants to swap bytes. I've edited my answer to expose the more proper solution. – Rafał Rawicki Apr 26 '12 at 21:24
This is really useful, but unfortunately (my fault for not saying) my program needs to work on *nix machines that may not have this (read: solaris) available. Upvote for most simple, but I've accepted the other, as it is most portable. – xdumaine May 02 '12 at 12:52

score 5 · Accepted Answer · edited May 23 '17 at 10:29

Do I need to find/write my own swap byte order function?

Yes, you do. To make it easy, I refer you to this question: "How do I convert between big-endian and little-endian values in C++?", which gives a list of compiler-specific byte order swap functions, as well as some implementations of byte order swap functions.
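As a rough illustration of what such compiler-specific functions look like, the sketch below wraps the GCC/Clang builtin and the MSVC intrinsic behind one name, with a shift-and-mask fallback for other compilers. The wrapper name byteswap32 is an assumption of this example, not something taken from the linked question.

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <stdlib.h>   // declares _byteswap_ulong
#endif

// Hypothetical wrapper: unconditional 32-bit byte swap.
inline std::uint32_t byteswap32(std::uint32_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(v);              // GCC and Clang builtin
#elif defined(_MSC_VER)
    return _byteswap_ulong(v);                // MSVC intrinsic
#else
    // Portable fallback using shifts and masks.
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
#endif
}
```

Because the fallback branch needs no particular compiler or header, this should also build on systems such as Solaris where endian.h is missing.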

answered Apr 26 '12 at 21:12

David Heffernan

score 1 · Answer 4 · answered Apr 26 '12 at 21:17

The ntoh functions can swap between more than just big and little endian. Some systems are also 'middle endian' where the bytes are scrambled up rather than just ordered one way or another.

Anyway, if all you care about is big and little endian, then all you need to know is whether the host's and the object file's endianness differ. Write your own function that unconditionally swaps the byte order, and call it only when host_endianess() != objectfile_endianess().
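A minimal sketch of that approach, assuming only big-endian and little-endian hosts (middle-endian machines are ignored). The Endian type and all function names here are hypothetical stand-ins for the detection functions the question says it already has.

```cpp
#include <cstdint>
#include <cstring>

enum class Endian { Little, Big };   // hypothetical type for this sketch

// Detect the host's byte order by looking at the first byte of a known value.
inline Endian host_endianness()
{
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);
    return first == 1 ? Endian::Little : Endian::Big;
}

// Unconditional swaps: these always reverse the bytes.
inline std::uint16_t swap16(std::uint16_t v)
{
    return static_cast<std::uint16_t>((v << 8) | (v >> 8));
}

inline std::uint32_t swap32(std::uint32_t v)
{
    return (v << 24) | ((v & 0x0000FF00u) << 8) |
           ((v & 0x00FF0000u) >> 8) | (v >> 24);
}

// Swap only when the file's byte order differs from the host's.
inline std::uint32_t file_to_host32(std::uint32_t raw, Endian file)
{
    return file == host_endianness() ? raw : swap32(raw);
}
```

Here `raw` is the value as it was copied straight out of the file, so it covers all four scenarios from the question with a single comparison.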

score 0 · Answer 5 · answered Apr 27 '12 at 02:27

If I wanted a cross-platform solution that works on Windows or Linux, I would write something like:

#include <algorithm>
#include <cstddef>

// dataSize is the number of bytes to convert.
// le holds the little-endian input; be receives the big-endian output.
// The two buffers must not overlap.
void reverse_bytes(const char* le, char* be, std::size_t dataSize)
{
    // Fill contents in le before calling this...
    std::reverse_copy(le, le + dataSize, be);
}
