237

Is there a programmatic way to detect whether or not you are on a big-endian or little-endian architecture? I need to be able to write code that will execute on an Intel or PPC system and use exactly the same code (i.e., no conditional compilation).

Peter Mortensen
Jay T
  • 4
    For the sake of completeness, here is a link to someone else's question about trying to gauge endianness (at compile time): http://stackoverflow.com/questions/280162/is-there-a-way-to-do-a-c-style-compile-time-assertion-to-determine-machines-en – Faisal Vali Jun 16 '09 at 13:52
  • 23
    Why not determine endianness at compile-time? It can't possibly change at runtime. – ephemient Jun 20 '09 at 20:31
  • 4
    AFAIK, there's no reliable and universal way to do that. http://gcc.gnu.org/ml/gcc-help/2007-07/msg00342.html – user48956 Aug 13 '10 at 21:35
  • You can't run the same code on Intel and PPC - you'll definitely have to compile separate binaries for each platform. – Toby Speight Feb 22 '23 at 09:52

29 Answers

181

I don't like the method based on type punning through a pointer cast, since compilers often warn about it. That's exactly what unions are for!

#include <stdbool.h>
#include <stdint.h>

bool is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    return bint.c[0] == 1;
}

The principle is equivalent to the type cast suggested by others, but this is clearer and, according to C99, is guaranteed to be correct. GCC prefers this to the direct pointer cast.

This is also much better than fixing the endianness at compile time: on OSes that support multiple architectures (fat binaries on Mac OS X, for example), this will work for both ppc and i386, whereas it is very easy to mess things up otherwise.
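
For C++ code, where several comments below point out that reading an inactive union member is undefined, the same byte inspection can be sketched with std::memcpy instead. This is a minimal illustration and not part of the answer above; the function name is made up, and it assumes 8-bit bytes (which uint32_t already implies).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the object representation of a 32-bit value
// into a byte array and checks which byte sits at the lowest address.
inline bool is_big_endian_memcpy()
{
    const std::uint32_t value = 0x01020304u;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);
    return bytes[0] == 0x01;  // most significant byte stored first
}
```

Optimizing compilers will often reduce a function like this to a constant, much as the comments describe for the union version.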

Peter Mortensen
David Cournapeau
  • 60
    I don't recommend naming a variable "bint" :) – Matt K Jun 16 '09 at 13:22
  • 57
    are you sure this is well defined? In C++ only one member of the union can be active at one time - i.e you can not assign using one member-name and read using another (although there is an exception for layout compatible structs) – Faisal Vali Jun 16 '09 at 13:46
  • 1
    It is well defined in C99 AFAIK. On older platforms, it is implementation-dependent. But so is type punning through pointer cast, which is not defined in C99 either. – David Cournapeau Jun 16 '09 at 14:33
  • 37
    @Matt: I looked into Google, and bint seems to have a meaning in English that I was not aware of :) – David Cournapeau Jun 16 '09 at 14:34
  • 3
    @Faisal: In C++ only one member is *guaranteed* to work at a time, but in practice all compilers implement the extension that you can read from unions "as if" they had been assigned to with the value having the same storage representation that the value you actually assigned has. Assuming of course that the type you read does have a value with that storage representation. Certainly if all the questioner cares about is intel and PPC, and he's using normal compilers, then this is fine. – Steve Jessop Jun 16 '09 at 14:35
  • @David, @Matt: "two cultures separated by a common language" seems apropos here... – RBerteig Jun 16 '09 at 17:30
  • 19
    I've tested this, and in both gcc 4.0.1 and gcc 4.4.1 the result of this function can be determined at compile time and treated as a constant. This means that the compiler will drop if branches that depend solely on the result of this function and will never be taken on the platform in question. This is likely not true of many implementations of htonl. – Omnifarious Sep 10 '09 at 05:25
  • @Omnifarious Hi, what do u mean by 'This means that the compiler will drop if branches that depend solely on the result of this function and will never be taken on the platform in question. ..'. ? Is this function still working? – user1559625 Mar 07 '13 at 04:32
  • 1
    @user1559625: Yes. It's never even called. The compiler figures out the result of calling it at compile time and hardcodes that result into the resulting program. In fact, if it notices that this causes dead code because a certain path isn't taken, it drops the dead code. All this happens at compile time. It's optimization the compiler is doing. And the program is still just as correct. – Omnifarious Mar 07 '13 at 06:20
  • I don't understand how this works. How will the first byte be 1? – David G May 01 '13 at 12:29
  • @0x499602D2 0x01020304, the last digit is 4(hex)=100(binary). – towry Jul 22 '14 at 15:24
  • 2
    using unions like that is completely valid, even if non-standard. Both GCC and VS guarantee that accessing parts of a value is okay. union means same memory address, different access methods, so union{ int i; char c[4];} girl; means that setting girl.c[0] will affect girl.i, BUT the "undefinedness" of behavior is, WHICH BYTE does c[0] corresponds of i? :} And this is endianess part, that is left undefined, but every game developer is aware and uses this. – Петър Петров Dec 09 '14 at 22:24
  • 1
    just need uint16_t to differ between small endian and big endian though – hanshenrik Feb 14 '15 at 03:44
  • 9
    Is this solution really portable? What if `CHAR_BIT != 8` ? – zorgit May 30 '15 at 20:12
  • 4
    @zorgit makes a good point. You should at least use `uint8_t` instead of `char`. – nwellnhof Apr 12 '16 at 09:27
  • 9
    This only works in C (by Standard), and is UB in C++. GNUC++ does support it though. – Ruslan Feb 11 '19 at 09:31
  • 3
    Wait a minute, isn't this technically undefined behaviour because you're reading from a union field that wasn't the last field assigned to? https://stackoverflow.com/a/11996970 – Pharap May 17 '19 at 17:21
  • 1
    @Пет: Another word for *"non-standard"* is *"invalid"*. That's the opposite of *"completely valid"*. The code, as posted, exhibits undefined behavior. It is reading from a union member that is not the active member. [std::bit_cast](https://en.cppreference.com/w/cpp/numeric/bit_cast) was introduced, in part, so that we no longer are tempted to write code as recommended in this answer. – IInspectable Jan 01 '20 at 22:35
112

You can use std::endian if you have access to a C++20 compiler, such as GCC 8+ or Clang 7+.

Note: std::endian began in <type_traits>, but it was moved to <bit> at the 2019 Cologne meeting. GCC 8, Clang 7, 8 and 9 have it in <type_traits> while GCC 9+ and Clang 10+ have it in <bit>.

#include <bit>

if constexpr (std::endian::native == std::endian::big)
{
    // Big-endian system
}
else if constexpr (std::endian::native == std::endian::little)
{
    // Little-endian system
}
else
{
    // Something else
}
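
Because the header that provides std::endian changed during standardization, a build that must also compile on older toolchains could test the library feature macro and fall back to a byte inspection. The sketch below assumes the standard feature-test macro __cpp_lib_endian is how availability is detected; the function name is hypothetical, and toolchains that only shipped std::endian in <type_traits> simply take the fallback path.

```cpp
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>      // defines __cpp_lib_endian when std::endian is available
#  endif
#endif
#include <cstring>

// Hypothetical helper: returns "big", "little" or "mixed".
inline const char* native_byte_order_name()
{
#if defined(__cpp_lib_endian)
    if (std::endian::native == std::endian::big)    return "big";
    if (std::endian::native == std::endian::little) return "little";
    return "mixed";
#else
    // Fallback for pre-C++20 builds: look at the lowest-addressed byte.
    // This path can only tell big from little.
    const unsigned int one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1 ? "little" : "big";
#endif
}
```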
Peter Mortensen
  • 8
    As everyone I have access to C++17 and 20 drafts/proposals, but, as of now, does any C++20 compiler ever exist? – Xeverous Oct 29 '17 at 21:35
  • @Xeverous It only requires scoped enumerations so I suspect most vendors will add it to their stdlib implementation as one of their earlier changes. – Pharap Mar 25 '18 at 03:53
  • @Xeverous GCC 8 was released and supports it. –  May 08 '18 at 08:38
  • 2
    Out of the 30+ answers to the question, this appears to be the only one, that's completely accurate (with another answer that's at least correct in part). – IInspectable Jan 01 '20 at 22:43
89

You can do it by setting an int and masking off bits, but probably the easiest way is to use the built-in network byte order conversion functions (since network byte order is always big-endian).

if ( htonl(47) == 47 ) {
  // Big endian
} else {
  // Little endian.
}

Bit fiddling could be faster, but this way is simple, straightforward and nearly impossible to mess up.
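
If the cost of calling htonl matters, the check can be done once and the result reused, as one of the comments below suggests. A minimal sketch, assuming a POSIX system where htonl is declared in <arpa/inet.h> (on Windows it lives in the Winsock headers); the function name is hypothetical.

```cpp
#include <arpa/inet.h>  // htonl on POSIX systems

// Hypothetical helper: the function-local static is initialized once,
// on first use, so htonl is not called again on later calls.
inline bool host_is_big_endian()
{
    static const bool big = (htonl(47u) == 47u);
    return big;
}
```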

Eric Petroelje
  • 1
    The network conversion ops can also be used to convert everything to big endian, thus solving other problems Jay may be encountering. – Brian Jun 16 '09 at 13:15
  • 1
    Care should be taken - htonl implementation can be slow - its speed needs to be measured so its misuse doesn't introduce a bottleneck. – sharptooth Jun 16 '09 at 13:15
  • 8
    @sharptooth - slow is a relative term, but yes, if speed is really an issue, use it once at the start of the program and set a global variable with the endianness. – Eric Petroelje Jun 16 '09 at 13:23
  • 7
    htonl has another problem: on some platforms (windows ?), it does not reside in the C runtime library proper, but in additional, network related libraries (socket, etc...). This is quite an hindrance for just one function if you don't need the library otherwise. – David Cournapeau Jul 07 '09 at 05:00
  • 9
    Note that on Linux (gcc), htonl is subject to constant folding at compile time, so an expression of this form has no runtime overhead at all (ie it is constant-folded to 1 or 0, and then dead-code elimination removes the other branch of the if) – bdonlan Dec 06 '11 at 20:16
  • 2
    Also, on x86 htonl can be (and is, on Linux/gcc) implemented very efficiently using inline assembler, particularly if you target a micro-architecture with support for the `BSWAP` operation. – bdonlan Dec 06 '11 at 20:18
73

Please see this article:

Here is some code to determine the endianness of your machine:

int num = 1;
if(*(char *)&num == 1)
{
    printf("\nLittle-Endian\n");
}
else
{
    printf("Big-Endian\n");
}
Andrew Hare
  • 30
    Bear in mind that it depends on int and char being different lengths, which is almost always the case but not guaranteed. – David Thornley Jun 16 '09 at 13:45
  • 1
    @David - very true but I would be surprised to learn of any architecture that would have ints and chars be the same size. Still, it is an important point to never make assumptions about stuff like this. – Andrew Hare Jun 16 '09 at 13:51
  • Does this method rely on the fact that the code is always compiled on the same architecture? – Janusz Jun 16 '09 at 15:11
  • 12
    I've worked on embedded systems where short int and char were the same size... I can't remember if regular int was also that size (2 bytes) or not. – rmeador Jun 16 '09 at 15:14
  • @Janusz: Yes and no. This will work on any architecture in which an int and a char are different sizes. If you are concerned about, or know of, an architecture where they are the same size, then this solution will not work for you. – Andrew Hare Jun 16 '09 at 15:46
  • An example of identical ints and chars: http://leo.sprossenwanne.at/dsp/Entwicklungsprogramme/Entpack/CC56/DSP/INCLUDE/LIMITS.H – Dingo Jun 17 '09 at 02:03
  • @Andrew Hare: I want to test this code on a Big-endian machine. I dont have one. [Codepad](http://codepad.org/) and [Ideone](http://ideone.com/) are both Little Endian. Do you know some way? – Lazer May 15 '10 at 05:42
  • If you're using gcc, you can use `typeof` to ensure size issues don't get in the way: `typeof(1L) num = 1; if( *(char*)&num == 1L ) {...}` I just tested on Solaris/sparc with gcc 3.4.3, and Linux/x86 with GCC 4.4.5. Note, setting `-std=c99` with this generates an error since `typeof` is not part of C99. – Brian Vandenberg Sep 30 '11 at 18:16
  • 7
    why is THIS answer pretty much THE ONLY ANSWER that is NOT making me think "dude, wtf are you doing?", which is the case of most of the answers here :o – hanshenrik Feb 14 '15 at 03:46
  • 1
    @David @Andrew Hare I found this C standard from 2007 (http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf) which states that the min/max values for an int must be at least `-(2^15 - 1)` & `2^15-1`, respectively (pdf page 34). So for any system implementing the standard from about the past decade onwards (and possibly further), it should be guaranteed that `sizeof(int)>sizeof(char)`. (Any system that doesn't would have to be pretty dated.) – CrepeGoat Sep 29 '16 at 20:37
  • 3
    @Shillard int must be at least that large, but there is no requirement in the standard for char being restricted to less! If you have a look at TI F280x family, you will discover that CHAR_BIT is 16 and sizeof(int) == sizeof(char) while the limits you mention are kept absolutely fine... – Aconcagua Nov 25 '16 at 12:03
  • 14
    Why not use uint8_t and uint16_t? – Rodrigo Nov 29 '18 at 00:29
  • 1
    @DavidThornley if on your platform `sizeof(int)==sizeof(char)`, you don't have endianness issues at all. – Ruslan Feb 11 '19 at 09:34
  • @Ruslan until someone decides to use int16_t – UKMonkey Aug 12 '20 at 23:10
  • @UKMonkey in that case `CHAR_BIT==16` (being bounded from above by the existence of `int16_t` and from below by range requirements for `int`), and you don't have an issue too. – Ruslan Aug 13 '20 at 07:14
  • [Here](https://stackoverflow.com/a/12792301/6306190)'s an illustration explaining how this works. – johan Jan 24 '22 at 05:24
43

This is normally done at compile time (especially for performance reasons) by using the header files available from the compiler, or by creating your own. On Linux you have the header file "/usr/include/endian.h".
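
As a rough sketch of what using that header might look like: glibc's <endian.h> defines __BYTE_ORDER together with __LITTLE_ENDIAN and __BIG_ENDIAN. The HOST_LITTLE_ENDIAN macro and the helper function are made-up names for illustration, and the header itself is a Linux/glibc facility, not part of standard C or C++.

```cpp
#include <endian.h>  // Linux/glibc header; not available everywhere

#if __BYTE_ORDER == __LITTLE_ENDIAN
#  define HOST_LITTLE_ENDIAN 1   // hypothetical project macro
#elif __BYTE_ORDER == __BIG_ENDIAN
#  define HOST_LITTLE_ENDIAN 0
#else
#  error "Unsupported byte order"
#endif

// Hypothetical helper: the value is fixed when this file is compiled,
// so there is nothing left to check at run time.
inline bool host_is_little_endian() { return HOST_LITTLE_ENDIAN != 0; }
```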

Peter Mortensen
bill
  • 9
    I can't believe this hasn't been voted higher. It's not like the endianness is going to change under a compiled program, so there's never any need for a runtime test. – Dolda2000 Sep 03 '14 at 09:59
  • 1
    @Dolda2000 It potentially could, see the ARM endian modes. – Tyzoid Mar 10 '16 at 21:50
  • 18
    @Tyzoid: No, a compiled program will always run under the endian mode it was compiled for, even if the processor is capable of either. – Dolda2000 Mar 11 '16 at 00:42
  • 2
    This seems to be Linux only: https://man7.org/linux/man-pages/man3/endian.3.html – RajV Nov 30 '22 at 15:10
22

Do not use a union!

C++ does not permit type punning via unions!
Reading from a union field that was not the last field written to is undefined behaviour!
Many compilers support doing so as an extension, but the language makes no guarantee.

See this answer for more details:

https://stackoverflow.com/a/11996970


There are only two valid answers that are guaranteed to be portable.

The first answer, if you have access to a system that supports C++20,
is to use std::endian from the <bit> header.

C++20 Onwards

constexpr bool is_little_endian = (std::endian::native == std::endian::little);
constexpr bool is_big_endian = (std::endian::native == std::endian::big);

Prior to C++20, the only valid answer is to store an integer and then inspect its first byte through type punning. Unlike the use of unions, this is expressly allowed by C++'s type system.

It's also important to remember that, for optimum portability, static_cast should be used, because reinterpret_cast is implementation-defined.

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined: [...] a char or unsigned char type.

C++11 Onwards

enum class endianness
{
    little = 0,
    big = 1,
};

inline endianness get_system_endianness()
{
    const int value { 0x01 };
    const void * address { static_cast<const void *>(&value) };
    const unsigned char * least_significant_address { static_cast<const unsigned char *>(address) };

    return (*least_significant_address == 0x01) ? endianness::little : endianness::big;
}

C++11 Onwards (with bool instead of enum class)

inline bool is_system_little_endian()
{
    const int value { 0x01 };
    const void * address { static_cast<const void *>(&value) };
    const unsigned char * least_significant_address { static_cast<const unsigned char *>(address) };

    return (*least_significant_address == 0x01);
}

C++98/C++03

inline bool is_system_little_endian()
{
    const int value = 0x01;
    const void * address = static_cast<const void *>(&value);
    const unsigned char * least_significant_address = static_cast<const unsigned char *>(address);
    return (*least_significant_address == 0x01);
}
Pharap
  • 2
    Pretty sure your code would fail on targets with `sizeof (int) == 1` which was at least in past allowed for C++... :D not that you'd need endianess checks there. – Antti Haapala -- Слава Україні Mar 28 '21 at 13:43
  • "*Reading from a union field that was not the last field written to is undefined behaviour!*" Except for the common initial sequence. – 303 Dec 20 '21 at 20:03
  • 1
    @303 Which is irrelevant here because _`int` and arrays of `char` or `unsigned char` do **not** share a common initial sequence_. – Pharap Feb 16 '22 at 01:18
  • The statement is missing out on context and can be quite misleading, e.g. when linking to this answer. To make it more clear, add a reference to the union solution. – 303 Feb 17 '22 at 01:05
  • @303 In what way is it misleading? The answer says quite clearly that using a union to solve the problem relies on either undefined behaviour or non-standard compiler extensions, which is correct. If people want an example of misusing a union to solve the problem, there are plenty of other answers that demonstrate that. – Pharap Feb 17 '22 at 07:43
18

I'm surprised no one has mentioned the macros which the preprocessor defines by default. While these vary depending on your platform, they are much cleaner than having to write your own endian check.

For example, if we look at the built-in macros which GCC defines (on an x86-64 machine):

:| gcc -dM -E -x c - | grep -i endian

#define __LITTLE_ENDIAN__ 1

On a PPC machine I get:

:| gcc -dM -E -x c - | grep -i endian

#define __BIG_ENDIAN__ 1
#define _BIG_ENDIAN 1

(The :| gcc -dM -E -x c - magic prints out all built-in macros.)
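
Since the exact set of macros differs between compilers and versions, code that relies on them usually has to try several spellings. The sketch below checks the __LITTLE_ENDIAN__ and __BIG_ENDIAN__ names shown above and then the __BYTE_ORDER__ family that newer GCC and Clang releases predefine. The HOST_ORDER_* macros and the helper function are hypothetical names, and an unrecognized compiler ends up as "unknown" rather than being guessed.

```cpp
// HOST_ORDER_* are made-up names for this illustration.
#define HOST_ORDER_UNKNOWN 0
#define HOST_ORDER_LITTLE  1
#define HOST_ORDER_BIG     2

#if defined(__LITTLE_ENDIAN__)
#  define HOST_ORDER HOST_ORDER_LITTLE
#elif defined(__BIG_ENDIAN__)
#  define HOST_ORDER HOST_ORDER_BIG
#elif defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
      __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#  define HOST_ORDER HOST_ORDER_LITTLE
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
      __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#  define HOST_ORDER HOST_ORDER_BIG
#else
#  define HOST_ORDER HOST_ORDER_UNKNOWN
#endif

// Hypothetical helper exposing the preprocessor result to ordinary code.
inline int predefined_host_order() { return HOST_ORDER; }
```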

Peter Mortensen
DaveR
  • 8
    These macros do not show up consistently at all. For example, in gcc 4.4.5 from the Redhat 6 repo, running `echo "\n" | gcc -x c -E -dM - |& grep -i 'endian'` returns nothing, whereas gcc 3.4.3 (from `/usr/sfw/bin` anyway) in Solaris has a definition along these lines. I've seen similar issues on VxWorks Tornado (gcc 2.95) -vs- VxWorks Workbench (gcc 3.4.4). – Brian Vandenberg Sep 30 '11 at 18:21
16

Ehm... It surprises me that no one has realized that the compiler will simply optimize the test out, and will put a fixed result as return value. This renders all code examples in the previous answers effectively useless.

The only thing that would be returned is the endianness at compile time! And yes, I tested all of the examples in the previous answers. Here's an example with Microsoft Visual C++ 9.0 (Visual Studio 2008).

Pure C code

int32 DNA_GetEndianness(void)
{
    union
    {
        uint8  c[4];
        uint32 i;
    } u;

    u.i = 0x01020304;

    if (0x04 == u.c[0])
        return DNA_ENDIAN_LITTLE;
    else if (0x01 == u.c[0])
        return DNA_ENDIAN_BIG;
    else
        return DNA_ENDIAN_UNKNOWN;
}

Disassembly

PUBLIC    _DNA_GetEndianness
; Function compile flags: /Ogtpy
; File c:\development\dna\source\libraries\dna\endian.c
;    COMDAT _DNA_GetEndianness
_TEXT    SEGMENT
_DNA_GetEndianness PROC                    ; COMDAT

; 11   :     union
; 12   :     {
; 13   :         uint8  c[4];
; 14   :         uint32 i;
; 15   :     } u;
; 16   :
; 17   :     u.i = 1;
; 18   :
; 19   :     if (1 == u.c[0])
; 20   :         return DNA_ENDIAN_LITTLE;

    mov    eax, 1

; 21   :     else if (1 == u.c[3])
; 22   :         return DNA_ENDIAN_BIG;
; 23   :     else
; 24   :        return DNA_ENDIAN_UNKNOWN;
; 25   : }

    ret
_DNA_GetEndianness ENDP
END

Perhaps it is possible to turn off any compile-time optimization for just this function, but I don't know. Otherwise it's maybe possible to hardcode it in assembly, although that's not portable. And even then even that might get optimized out. It makes me think I need some really crappy assembler, implement the same code for all existing CPUs/instruction sets, and well.... never mind.

Also, someone here said that endianness does not change during run-time. Wrong. There are bi-endian machines out there, and their endianness can vary during execution. Also, there are not only little-endian and big-endian, but other endiannesses as well.

Peter Mortensen
Coriiander
  • 14
    Don't you have to recompile to run on a different platform anyway? – bobobobo Aug 30 '11 at 18:56
  • 3
    Although it works well for MSVC, it doesn't for all GCC version in all circumstances. Hence, a "run-time check" inside a critical loop may be correctly un-branched at compile-time, or not. There's no 100% guarantee. – Cyan Jan 28 '12 at 10:53
  • @bobobobo Not necessarily, I could compile on a weird version of Ubuntu for a big-endian x86 processor and then put the program on Ubuntu for a little-endian x86 processor. Them both being x86 means the byte code is the same to the processor and as such, integers will be interpereted differently. – Cole Tobin Nov 17 '12 at 03:35
  • 24
    There is no such thing as a big-endian x86 processor. Even if you run Ubuntu on a biendian processor (like ARM or MIPS) the ELF executables are always either big (MSB) or little (LSB) endian. No biendian executables can be created so no runtime checks are needed. – Fabel Nov 25 '12 at 15:18
  • 7
    To turn off the optimization in this method use 'volatile union ...' It tells compiler that 'u' can be changed somewhere else and data should be loaded – mishmashru Oct 08 '14 at 09:41
  • 3
    For this function to return a different value at runtime than the optimizer is calculating that it will implies that the optimizer is bugged. Are you saying that there are examples of compiled optimized binary code that may portably run on two different architectures of different endianness, despite obvious assumptions made by the optimizer (throughout the program) during compilation that would seem to be incompatible with at least one of those architectures? – Scott Sep 12 '16 at 15:11
15

Declare an int variable:

int variable = 0xFF;

Now use char* pointers to various parts of it and check what is in those parts.

char* startPart = reinterpret_cast<char*>( &variable );
char* endPart = reinterpret_cast<char*>( &variable ) + sizeof( int ) - 1;

Depending on which one points to the 0xFF byte, you can detect the endianness. This requires sizeof( int ) > sizeof( char ), which is definitely true for the discussed platforms.
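
Put together, the check might look like the sketch below. The enum and function names are invented for illustration. It reads through unsigned char rather than plain char, because on platforms where char is signed a byte holding 0xFF would compare as -1 and never equal 0xFF.

```cpp
// Hypothetical names for the three possible outcomes.
enum ByteOrder { ORDER_UNKNOWN, ORDER_LITTLE, ORDER_BIG };

inline ByteOrder detect_by_end_bytes()
{
    int variable = 0xFF;
    const unsigned char* startPart = reinterpret_cast<const unsigned char*>(&variable);
    const unsigned char* endPart = startPart + sizeof(int) - 1;

    if (sizeof(int) == 1) return ORDER_UNKNOWN;   // no byte order to speak of
    if (*startPart == 0xFF) return ORDER_LITTLE;  // low byte at lowest address
    if (*endPart == 0xFF) return ORDER_BIG;       // low byte at highest address
    return ORDER_UNKNOWN;                         // some other layout
}
```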

sharptooth
7

Unless you're using a framework that has been ported to PPC and Intel processors, you will have to do conditional compiles, since PPC and Intel platforms have completely different hardware architectures, pipelines, busses, etc. This renders the assembly code completely different between the two.

As for finding endianness, do the following:

short temp = 0x1234;
char* tempChar = (char*)&temp;

Dereferencing tempChar gives either 0x12 (big-endian) or 0x34 (little-endian), from which you will know the endianness.

samoz
7

For further details, you may want to check out this codeproject article Basic concepts on Endianness:

How to dynamically test for the Endian type at run time?

As explained in Computer Animation FAQ, you can use the following function to see if your code is running on a Little- or Big-Endian system:

#define BIG_ENDIAN      0
#define LITTLE_ENDIAN   1
int TestByteOrder()
{
   short int word = 0x0001;
   char *byte = (char *) &word;
   return(byte[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This code assigns the value 0001h to a 16-bit integer. A char pointer is then set to point at the first (lowest-addressed) byte of the integer value. If that byte is 01h, then the system is Little-Endian (the least-significant byte is stored at the lowest address). If it is 00h, then the system is Big-Endian.

none
  • 1
    That code makes several assumptions that need not be true: First - this function can and normally will be checked at compile-time only, so the result does not depend on the running architecture but only the compiling one. 2nd - this assumes that a 'short int' is 16 bit and a 'char' is 8 bit. NEITHER of which is guaranteed by the standard. They can even both be 64 bit. – ABaumstumpf Jan 21 '22 at 09:45
7

The C++ way has been to use Boost, where the preprocessor checks and casts are compartmentalized away inside very thoroughly-tested libraries.

The Predef Library (boost/predef.h) recognizes four different kinds of endianness.

The Endian Library was planned to be submitted to the C++ standard and supports a wide variety of operations on endian-sensitive data.

As stated in previous answers, endianness detection (std::endian) is part of C++20.

Peter Mortensen
fuzzyTew
6

As stated in previous answers, use union tricks.

There are a few problems with the ones advised above though. Most notably that unaligned memory access is notoriously slow for most architectures, and some compilers won't even recognize such constant predicates at all, unless word aligned.

Because a mere endian test is boring, here goes a (template) function which will flip the input/output of an arbitrary integer according to your specification, regardless of host architecture.

#include <stdint.h>

#define BIG_ENDIAN 1
#define LITTLE_ENDIAN 0

template <typename T>
T endian(T w, uint32_t endian)
{
    // This gets optimized out into if (endian == host_endian) return w;
    union { uint64_t quad; uint32_t islittle; } t;
    t.quad = 1;
    if (t.islittle ^ endian) return w;
    T r = 0;

    // Decent compilers will unroll this (GCC)
    // or even convert straight into single bswap (Clang)
    for (int i = 0; i < sizeof(r); i++) {
        r <<= 8;
        r |= w & 0xff;
        w >>= 8;
    }
    return r;
}

Usage:

To convert from given endian to host, use:

host = endian(source, endian_of_source)

To convert from host endian to given endian, use:

output = endian(hostsource, endian_you_want_to_output)

The resulting code is as fast as hand-written assembly on Clang, and on GCC it's a tad slower (unrolled &,<<,>>,| for every byte), but it is still decent.
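
The byte-reversal loop at the heart of that function can be tried in isolation. The following is a sketch under the assumption that only unsigned integer types are passed in (right-shifting a negative signed value would drag the sign bit along); reverse_bytes is a made-up name.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reverses the byte order of an unsigned integer by
// peeling bytes off the low end of w and pushing them onto the low end of r.
template <typename T>
T reverse_bytes(T w)
{
    T r = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        r = static_cast<T>((r << 8) | (w & 0xff));
        w = static_cast<T>(w >> 8);
    }
    return r;
}
```

For example, reverse_bytes<std::uint32_t>(0x01020304u) yields 0x04030201.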

Peter Mortensen
kat
4

I would do something like this:

bool isBigEndian() {
    static unsigned long x(1);
    static bool result(reinterpret_cast<unsigned char*>(&x)[0] == 0);
    return result;
}

Along these lines, you get a time-efficient function that does the calculation only once.

Christoph
Jeremy Mayhew
4
bool isBigEndian()
{
    static const uint16_t m_endianCheck(0x00ff);
    return ( *((const uint8_t*)&m_endianCheck) == 0x0); 
}
Pharap
Paolo Brandoli
3
union {
    int i;
    char c[sizeof(int)];
} x;
x.i = 1;
if(x.c[0] == 1)
    printf("little-endian\n");
else
    printf("big-endian\n");

This is another solution, similar to Andrew Hare's.

Peter Mortensen
Neeraj
3

Declare:

Non-macro, C++11 solution:

union {
  uint16_t s;
  unsigned char c[2];
} constexpr static  d {1};

constexpr bool is_little_endian() {
  return d.c[0] == 1;
}
Peter Mortensen
zhaorufei
  • 3
    Is there a particular reason you used unsigned char over uint8_t? – Kevin Nov 16 '14 at 20:46
  • 0 runtime overhead... i like it! – hanshenrik Sep 11 '15 at 01:05
  • 1
    I guess, this detects endiannes of the build machine, not the target? – hutorny Oct 23 '15 at 12:07
  • 4
    Isn't this UB in C++? – rr- Feb 29 '16 at 21:44
  • 10
    this is not legal in constexpr context. You can't access a member of a union that has not been initialised directly. There is no way to legally detect endianness at compile time without preprocessor magic. – Richard Hodges Apr 15 '16 at 00:24
  • @Richard Hodges Thanks, you direct me to rethink this topic, I checked my code snippet with clang 6, there's a warning about the constexpr function: error: constexpr function never produces a constant expression [-Winvalid-constexpr], at the moment I think preprocessor is the only way to get a compile time(actually preprocessor happens before compile), zero cost method to know whether the target system is big or little endian. – zhaorufei Mar 27 '19 at 11:06
  • @hanshenrik sorry, the constexpr is misleading, although it may pass gcc, it not always means "const" or zero cost. – zhaorufei Mar 27 '19 at 11:06
  • 1
    clang says: `constexpr function never produces a constant expression [-Winvalid-constexpr]` – kyb Feb 15 '20 at 14:34
3

This is untested, but in my mind it should work, because the first byte will be 0x01 on little-endian and 0x00 on big-endian.

bool runtimeIsLittleEndian(void)
{
    volatile uint16_t i=1;
    return ((uint8_t*)&i)[0]==0x01; // 0x01=little, 0x00=big
}
Peter Mortensen
hanshenrik
3

If you don't want conditional compilation, you can just write endian-independent code. Here is an example (taken from Rob Pike):

Reading an integer stored in little-endian order on disk, in an endian-independent manner:

i = (data[0]<<0) | (data[1]<<8) | (data[2]<<16) | (data[3]<<24);

The same code, trying to take into account the machine endianness:

i = *((int*)data);
#ifdef BIG_ENDIAN
/* swap the bytes */
i = ((i&0xFF)<<24) | (((i>>8)&0xFF)<<16) | (((i>>16)&0xFF)<<8) | (((i>>24)&0xFF)<<0);
#endif
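
Wrapped into functions, the endian-independent approach might look like the sketch below. The names load_le32 and load_be32 are hypothetical. Each byte is widened to uint32_t before shifting, which avoids shifting a promoted int into its sign bit when the top byte is 0x80 or larger. The big-endian variant covers data stored most significant byte first, the case raised in the comment below.

```cpp
#include <cstdint>

// Hypothetical helper: data[0] is the least significant byte.
inline std::uint32_t load_le32(const unsigned char* data)
{
    return  static_cast<std::uint32_t>(data[0])
         | (static_cast<std::uint32_t>(data[1]) << 8)
         | (static_cast<std::uint32_t>(data[2]) << 16)
         | (static_cast<std::uint32_t>(data[3]) << 24);
}

// Hypothetical helper: data[0] is the most significant byte.
inline std::uint32_t load_be32(const unsigned char* data)
{
    return (static_cast<std::uint32_t>(data[0]) << 24)
         | (static_cast<std::uint32_t>(data[1]) << 16)
         | (static_cast<std::uint32_t>(data[2]) << 8)
         |  static_cast<std::uint32_t>(data[3]);
}
```

Both functions give the same result on any host, because they depend only on the order of the bytes in the buffer and never on how the host lays out an integer in memory.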
fjardon
  • Thanks a lot for this however I noticed I had to reverse it for it to work (I'm on a little endian machine (Intel corei3 9100) which was weird based on the link you provided. so for me ```(data[0]<<24) | (data[1]<<16) | (data[2]<<8) | (data[3]);``` worked! – Hossein Jan 02 '22 at 07:44
2

You can also do this via the preprocessor using something like a Boost header file which can be found in Boost endian.

Peter Mortensen
2

Unless the endian header is GCC-only, it provides macros you can use.

#include "endian.h"
...
if (__BYTE_ORDER == __LITTLE_ENDIAN) { ... }
else if (__BYTE_ORDER == __BIG_ENDIAN) { ... }
else { throw std::runtime_error("Sorry, this version does not support PDP Endian!"); }
...
1

See Endianness - C-Level Code illustration.

// assuming target architecture is 32-bit = 4-Bytes
enum ENDIANNESS{ LITTLEENDIAN , BIGENDIAN , UNHANDLE };


ENDIANNESS CheckArchEndianalityV1( void )
{
    int Endian = 0x00000001; // assuming target architecture is 32-bit    

    // as Endian = 0x00000001 so MSB (Most Significant Byte) = 0x00 and LSB (Least Significant Byte) = 0x01
    // reading the int through a char pointer yields the byte stored at the lowest address

    return (*(char *) &Endian == 0x01) ? LITTLEENDIAN : BIGENDIAN;
} 
Cœur
gimel
0
int i=1;
char *c=(char*)&i;
bool littleendian=(*c==1);
Jon Bright
0

Here's another C version. It defines a macro called wicked_cast() for inline type punning via C99 union literals and the non-standard __typeof__ operator.

#include <limits.h>

#if UCHAR_MAX == UINT_MAX
#error endianness irrelevant as sizeof(int) == 1
#endif

#define wicked_cast(TYPE, VALUE) \
    (((union { __typeof__(VALUE) src; TYPE dest; }){ .src = VALUE }).dest)

_Bool is_little_endian(void)
{
    return wicked_cast(unsigned char, 1u);
}

If integers are single-byte values, endianness makes no sense and a compile-time error will be generated.

Christoph
0

The way C compilers (at least every one I know of) work, the endianness has to be decided at compile time. Even for bi-endian processors (like ARM and MIPS) you have to choose the endianness at compile time.

Furthermore, the endianness is defined in all common file formats for executables (such as ELF). Although it is possible to craft a binary blob of bi-endian code (for some ARM server exploit maybe?), it probably has to be done in assembly.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Fabel
  • 1,711
  • 14
  • 36
0

A C++20 solution:

#include <cstddef>
#include <cstdint>
#include <utility>

constexpr bool compare(auto const c, auto const ...a) noexcept
{
  return [&]<auto ...I>(std::index_sequence<I...>) noexcept
    {
      return ((std::uint8_t(c >> 8 * I) == a) && ...);
    }(std::make_index_sequence<sizeof...(a)>());
}

static constexpr auto is_big_endian_v{
  compare(std::uint32_t(0x01234567), 0x01, 0x23, 0x45, 0x67)
};

static constexpr auto is_little_endian_v{
  compare(std::uint32_t(0x01234567), 0x67, 0x45, 0x23, 0x01)
};

static constexpr auto is_pdp_endian_v{
  compare(std::uint32_t(0x01234567), 0x23, 0x01, 0x67, 0x45)
};

The task can be accomplished more easily with std::endian, but for some reason the <bit> header file is not always present. Here's a demo.

user1095108
  • 14,119
  • 9
  • 58
  • 116
  • Methinks the result could be wrong sometimes because of cross compilation. The host may have different endianness than the target. – user1095108 Aug 10 '22 at 12:12
-1

How about this?

#include <cstdio>

int main()
{
    unsigned int n = 1;
    char *p = 0;

    p = (char*)&n;
    if (*p == 1)
        std::printf("Little Endian\n");
    else 
        if (*(p + sizeof(int) - 1) == 1)
            std::printf("Big Endian\n");
        else
            std::printf("What the crap?\n");
    return 0;
}
Abhay
  • 7,092
  • 3
  • 36
  • 50
-1

As pointed out by Coriiander, most (if not all) of the code here will be optimized away at compilation time, so the generated binaries won't check endianness at run time.

It has been observed that a given executable shouldn't run in two different byte orders, but I have no idea if that is always the case, and checking at compilation time seems like a hack to me. So I coded this function:

#include <stdint.h>
#include <stdlib.h> /* malloc, free */

int* _BE = 0;

int is_big_endian() {
    if (_BE == 0) {
        uint16_t* teste = (uint16_t*)malloc(4);
        *teste = (*teste & 0x01FE) | 0x0100;
        uint8_t teste2 = ((uint8_t*) teste)[0];
        free(teste);
        _BE = (int*)malloc(sizeof(int));
        *_BE = (0x01 == teste2);
    }
    return *_BE;
}

MinGW wasn't able to optimize this code away, even though it does optimize the other code here away. I believe that is because I leave the "random" value that was allocated in the low-order byte as it was (at least seven of its bits), so the compiler can't know what that value is and doesn't optimize the function away.

I've also coded the function so that the check is only performed once, and the return value is stored for subsequent calls.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Tex Killer
  • 208
  • 3
  • 6
  • Why allocate 4 bytes to work on a 2-byte value? Why mask an indeterminate value with `0x7FE`? Why use `malloc()` at all? that is wasteful. And `_BE` is a (albeit small) memory leak and a race condition waiting to happen, the benefits of caching the result dynamically are not worth the trouble. I would do something more like this instead: `static const uint16_t teste = 1; int is_little_endian() { return (0x01 == ((uint8_t*)&teste)[0]); } int is_big_endian() { return (0x01 == ((uint8_t*)&teste)[1]); }` Simple and effective, and much less work to perform at runtime. – Remy Lebeau May 03 '18 at 00:08
  • @RemyLebeau, the whole point of my answer was to produce a code that isn't optimized away by the compiler. Sure, your code is much simpler, but with optimizations turned on it will just become a constant boolean after compiled. As I stated on my answer, I don't actually know if there is some way to compile C code in a way that the same executable runs on both byte orders, and I was also curious to see if I could make the check at runtime despite optimizations being on. – Tex Killer May 14 '19 at 03:22
  • @TexKiller then why not simply disable optimizations for the code? Using `volatile`, or `#pragma`, etc. – Remy Lebeau May 14 '19 at 04:23
  • @RemyLebeau, I didn't know those keywords at the time, and I just took it as a little challenge to prevent the compiler optimization with what I knew. – Tex Killer May 29 '19 at 13:27
-2

I was going through the textbook Computer Systems: A Programmer's Perspective, and there is a problem that asks you to determine the machine's endianness with a C program.

I used a pointer to do that, as follows:

#include <stdio.h>

int main(void){
    int i=1;
    unsigned char* ii = (unsigned char*)&i; /* explicit cast: int* does not convert to unsigned char* implicitly */

    printf("This computer is %s endian.\n", ((ii[0]==1) ? "little" : "big"));
    return 0;
}

An int takes up four bytes (on this machine) and a char takes up only one, so we can use a char pointer to point to an int with the value 1. If the computer is little-endian, the char that the pointer points to has the value 1; otherwise, its value is 0.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Archimedes520
  • 191
  • 1
  • 9