I have looked through many questions on SO about this topic, but couldn't find a solution so far. One natural solution was mentioned here: Determining endianness at compile time.
However, the comments on that same answer point out related problems with it.

With some modifications, I am able to compile a similar solution with g++ & clang++ (-std=c++11) without any warning.

static_assert(sizeof(char) == 1, "sizeof(char) != 1");
union U1
{
  int i;
  char c[sizeof(int)];
};  
union U2
{ 
  char c[sizeof(int)];
  int i;
};  

constexpr U1 u1 = {1};
constexpr U2 u2 = {{1}};
constexpr bool IsLittleEndian ()
{ 
  return u1.i == u2.c[0];  // ignore different type comparison
}   

static_assert(IsLittleEndian(), "The machine is BIG endian");


Can this be considered a deterministic method to decide the endianness, or does it run into type-punning problems or something else?

Cœur
iammilind
  • Doesn't `uint8_t(u2.i)` produce the same value on either endianness? A cast should be value preserving, not just pick the first byte. – Bo Persson Sep 29 '16 at 07:22
  • There are 24 possible orderings of bytes within a 4-byte integer. At *least* three have been used by real computers. Also, it is not entirely clear that the exception to the strict aliasing rules granted to `[[un]signed] char` applies to `uint8_t`. – Martin Bonner supports Monica Sep 29 '16 at 07:22
  • @BoPersson, I wanted to avoid any possible compiler warning related to "comparison of different size types" (as I try to claim in the Q!). Since 1 is representable in the smallest type, I found the cast acceptable. Or did I misunderstand your concern? I will modify the code a bit. – iammilind Sep 29 '16 at 07:28
  • I believe that if you actually run this on a big endian machine, it would still test if `1 == 1` and return `true`. – Bo Persson Sep 29 '16 at 07:32
  • `sizeof(char) == 1` is true by definition. `sizeof` is given in *units of `char`,* so this assertion can literally never fail. – Angew is no longer proud of SO Sep 29 '16 at 08:05
  • Quite sure there is no way to use `constexpr` to do this, since any `union`/`reinterpret_cast` approach invokes UB (which is caught at compile time inside a `constexpr`), and `memcpy` is not `constexpr`. Compiler specific macros are the only way around it (look for __BYTE_ORDER). – sbabbi Sep 29 '16 at 11:00
  • @sbabbi, not sure why `union` will cause UB. It doesn't generate any warning in either g++/clang++. BTW, regarding compiler specific macros, there is a platform specific file supported, ``, as mentioned in this answer: [C Macro definition to determine big endian or little endian machine?](http://stackoverflow.com/a/2100363/514235) – iammilind Sep 29 '16 at 11:02
  • @iammilind See http://stackoverflow.com/questions/11373203/accessing-inactive-union-member-and-undefined-behavior . IIRC gcc defines the behavior (of accessing a non-active union member, basically they promise they are not going to optimize on this), but it is UB in the standard. – sbabbi Sep 29 '16 at 11:15
  • One easy way to find endianness at compile time in C++, is to just use OS macro sniffing. There's a nice collection of OS-indicator macros over at some SourceForge project. Yes, as the site indicates it's old, so maybe all the Androids are not covered, but it should be doable. – Cheers and hth. - Alf Sep 29 '16 at 18:55
  • Planned proposal: http://howardhinnant.github.io/endian.html – Howard Hinnant Sep 29 '16 at 20:35
  • @HowardHinnant, nice to see the proposal. As an ordinary C++ coder, I find it a little complex though. Maybe you could explain in that blog why simply defining `__ORDER_LITTLE_ENDIAN__` (& big, native) is not enough? IMO, the enum trick is trivial and hence not needed. – iammilind Sep 30 '16 at 02:33
  • Afaik, what you want is not possible. If you *must* have that information at compile time, consider using a trivial, short test program that outputs either `const char* const ENDIANESS = "little";` or `const char* const ENDIANESS = "big";` into a file "endianess.h", which is then used by your actual source code. – cmaster - reinstate monica Nov 02 '18 at 14:31
  • Also note, that there are other byte orders than just little endian or big endian out there. Braindead stuff like `0x01020304` being stored as `0x03 0x04 0x01 0x02`. So, if I were you, I would write the test with `char[8] = {1, 2, 3, 4, 5, 6, 7, 8};`, copy over to at least a `uint64_t`, and then check for equality with either `0x0102030405060708` or `0x0807060504030201`. If neither test succeeds, you should probably error out *hard*. – cmaster - reinstate monica Nov 02 '18 at 14:36
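
The byte-pattern test suggested in the last comment could look roughly like the sketch below. It is a runtime check only (`std::memcpy` is not `constexpr`), and the names `ByteOrder`, `detect_byte_order` and `require_known_byte_order` are hypothetical, chosen here for illustration. It assumes 8-bit bytes, so that `std::uint64_t` occupies exactly 8 of them.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical classification; "Other" covers mixed layouts such as PDP-style ordering.
enum class ByteOrder { Little, Big, Other };

inline ByteOrder detect_byte_order() {
    const unsigned char bytes[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    std::uint64_t value = 0;
    static_assert(sizeof(value) == sizeof(bytes), "sketch assumes 8-bit bytes");

    // memcpy is the well-defined way to reinterpret the object representation.
    std::memcpy(&value, bytes, sizeof(value));

    // Big endian: the first byte in memory is the most significant one.
    if (value == 0x0102030405060708ULL) return ByteOrder::Big;
    // Little endian: the first byte in memory is the least significant one.
    if (value == 0x0807060504030201ULL) return ByteOrder::Little;
    return ByteOrder::Other;
}

// Follows the comment's advice to fail hard when neither pattern matches.
inline ByteOrder require_known_byte_order() {
    const ByteOrder order = detect_byte_order();
    assert(order != ByteOrder::Other && "unsupported byte order");
    return order;
}
```

Because the result is only known at runtime, it cannot feed a `static_assert`; it is better suited to a start-up sanity check or to the generated-header approach described in the previous comment.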

2 Answers

Since C++20 you can use std::endian from the <bit> header:

#include <bit>

int main()
{
    static_assert(std::endian::native==std::endian::big,
                  "Not a big endian platform!");
}


Ruslan

Your attempt is no different from this obviously non-working one (where IsLittleEndian() is identical to true):

constexpr char c[sizeof(int)] = {1};
constexpr int i = {1};
constexpr bool IsLittleEndian ()
{ 
  return i == c[0];  // ignore different type comparison
}   

static_assert(IsLittleEndian(), "The machine is BIG endian");

I believe that C++11 doesn't provide a means to programmatically determine the endianness of the target platform at compile time. My argument is that the only valid way to perform that check at runtime is to examine an int variable through an unsigned char pointer (since other ways of type punning inevitably involve undefined behavior):

#include <cstdint>

const uint32_t i = 0xffff0000;

// A little-endian machine stores the least significant byte (0x00 here) first.
bool isLittleEndian() {
    return 0 == *reinterpret_cast<const unsigned char*>(&i);
}

C++11 doesn't allow this function to be made constexpr (reinterpret_cast is not permitted in a constant expression), so this check cannot be performed at compile time.
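
For completeness, the compiler-specific macro route mentioned in the comments on the question can be sketched as follows. This is a minimal sketch that assumes GCC or Clang, which predefine `__BYTE_ORDER__`, `__ORDER_LITTLE_ENDIAN__` and `__ORDER_BIG_ENDIAN__`; other compilers may not provide them, and the names `kIsLittleEndian` and `kIsBigEndian` are hypothetical.

```cpp
// Assumption: GCC or Clang, which predefine these byte-order macros.
#if !defined(__BYTE_ORDER__) || !defined(__ORDER_LITTLE_ENDIAN__) || \
    !defined(__ORDER_BIG_ENDIAN__)
#  error "Byte-order macros not available; this sketch assumes GCC or Clang"
#endif

// Hypothetical names; both are usable in constant expressions under C++11.
constexpr bool kIsLittleEndian = (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__);
constexpr bool kIsBigEndian    = (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__);

// Mixed orders (for example PDP-style) are deliberately rejected here.
static_assert(kIsLittleEndian || kIsBigEndian,
              "Mixed byte order is not handled by this sketch");
```

The trade-off is portability: the check happens at compile time without any undefined behavior, but only on toolchains that supply these macros.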

Leon
  • Can you explain in more detail why the attempted solution will not work, or why it is the same as the (non-working) one at the beginning of your answer? I haven't had a chance to check on a big-endian machine, but I assume that what you said is true. The question is: why? – iammilind Sep 29 '16 at 09:47
  • @iammilind What purpose do the unions serve in your "solution"? You never access/use `U1::c` and you never access/use `U2::i`. Hence, after eliminating them we arrive at my version. – Leon Sep 29 '16 at 09:49
  • The 2 unions are just there to eliminate the compiler error that otherwise comes with 1 union. If you check `u1.i == u1.c[0]` (same for `u2`) then it works fine. This solution is in many other answers on SO, but those are limited to runtime. They don't work at compile time because of `constexpr` limitations. Here U1 & U2 act as mirrors of each other. I have attempted to trick the compiler into allowing it at compile time. Maybe it is wrong, but it would be good if someone checked it on a big-endian machine. – iammilind Sep 29 '16 at 09:56
  • @iammilind I understand that. But if you forget the prehistory of how you arrived at your version, doesn't the existence of `U1::c` and `U2::i` appear to be completely artificial? – Leon Sep 29 '16 at 09:59
  • @iammilind: The way I understand the initialization in `U2 u2 = {{1}}` is that it always sets first item of array to `1`. Doesn't it? – Cheers and hth. - Alf Sep 29 '16 at 16:36
  • @Cheers, correct. But I now understand that without a comparison against the other member of the `union`, the purpose is not served. And according to C++11, accessing a union member that was *not* the one most recently set is undefined, though many compilers support it. For that part we would need a .c file/function, which will at least give the info at runtime. – iammilind Sep 30 '16 at 02:35
  • BTW, C++ 20 finally addresses this : https://en.cppreference.com/w/cpp/types/endian – Wheezil Nov 02 '18 at 13:54