1

(This question came out of explaining the details of CHAR_BIT, sizeof, and endianness to someone yesterday. It's entirely hypothetical.)

Let's say I'm on a platform where CHAR_BIT is 32, so sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long). I believe this is still a standards-conformant environment.

The usual way to detect endianness at runtime (because there is no reliable way to do it at compile time) is to make a `union { int i; char c[sizeof(int)]; } x;`, set `x.i = 1`, and see whether `x.c[0]` or `x.c[sizeof(int)-1]` got set.
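For reference, here is a minimal sketch of that test as it looks on a conventional CHAR_BIT == 8 platform:

union {
    int i;
    char c[sizeof(int)];
} x;

x.i = 1;
if (x.c[0] == 1) {
    /* least significant byte at the lowest address: little-endian */
} else if (x.c[sizeof(int) - 1] == 1) {
    /* most significant byte at the lowest address: big-endian */
}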

But that doesn't work on this platform, as I end up with a char[1].

Is there a way to detect whether such a platform is big-endian or little-endian, at runtime? Obviously it doesn't matter inside this hypothetical system, but one can imagine it writing to a file, or to some kind of memory-mapped area, which another machine then reads and reconstructs according to its (saner) memory model.

Martijn Pieters
  • you could try autoconf's `AC_C_BIGENDIAN` macro to do this at compile time, it's fairly robust, resorting to grep on a test binary if it can't run the code directly. – Hasturkun Jan 17 '11 at 11:43

8 Answers

4

Hypothetically, in an environment where all data types are of the same size, there is no endianness.

I can see three possibilities for you:

  • If the limitation is only imposed by the C compiler and the platform technically has some smaller data types, you can use assembler to detect endianness.

  • If the platform supports the double data type, you can use it to detect endianness, because it is practically always 64 bits wide; a sketch follows after this list.

  • Also, you can write data to a file, as you suggested, and then read it back. Simply write two chars (with the file in binary mode), move the file pointer to position 1 and read back a char.
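A minimal sketch of the double-based probe, assuming IEEE 754 binary64 doubles and sizeof(double) == 2 * sizeof(unsigned) -- which holds on the hypothetical 32-bit-char platform (and on most conventional ones), but is not guaranteed by the standard:

union {
    double d;
    unsigned u[2];
} x;

x.d = 1.0;   /* IEEE 754 bit pattern 0x3FF00000 00000000: all set bits are in the high half */
if (x.u[0] == 0) {
    /* low half stored first: little-endian */
} else {
    /* high half stored first: big-endian */
}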

Al Kepp
  • 5,831
  • 2
  • 28
  • 48
  • >> Also, you can write data to a file, as you suggested, and then read it back. << ... not sure how that would be implemented if the system has no means to address anything below 32 bits. My understanding of such a system is that one byte is actually 32 bits, and a file position of 1 (zero-based) is the same as a file position of 4 on a system where bytes consist of 8 bits – bestsss Jan 17 '11 at 11:35
  • in general, there is no platform-wide concept of endianness: if you use a `double` value to check the endianness, you end up with the endianness of that type, and nothing more; also, I don't understand your algorithm for endianness detection by writing to disk: standard I/O is byte-oriented, and you won't get any more information out of it than by converting to `char *` – Christoph Jan 17 '11 at 11:37
  • I would like to see how TCP/IP sockets would be implemented on such a 32-bit platform. – Al Kepp Jan 17 '11 at 11:43
  • Also, writing to a file can be implemented in any way, i.e. the bits of that char/integer can be saved in a different order than the byte order in memory. – bestsss Jan 17 '11 at 11:44
  • @Al: My 1st guess would be TCP/IP, but then the outcome would just be network format for the outgoing bits. There is no problem bit-twiddling out to the network card without saying anything about the memory model. I'm inclined to say that detecting the endianness is not possible. – bestsss Jan 17 '11 at 11:45
  • Christoph: If you have a binary file stream, you write whole chunks of memory to it, not byte after byte, because the operating system does it in larger memory blocks. Writing byte after byte would add computational overhead. That's why I expect it can be used to detect endianness. I just expect it, I cannot be sure, as this is way too hypothetical. – Al Kepp Jan 17 '11 at 11:46
  • @Al: Who knows if TCP/IP exists on this platform at all? It's not mandated by the C standard. – Martin B Jan 17 '11 at 11:46
  • @Al: even writing whole memory chunks doesn't allow you to address 8 bits, which is essentially what you do by moving the file pointer to position 1. – bestsss Jan 17 '11 at 11:53
  • @bestsss: When you have your file written to disk, you can move the file pointer to position 1. Files consist of real 8-bit bytes, they don't consist of C-language variables. The interesting point is that files may be the only place where you can see the endianness of this hypothetical environment. Maybe the endianness of the file system is just artificially constructed, as you wrote earlier, by the implementation of disk operations, but it's still the endianness of our hypothetical system. Oh, maybe we should rather stop this discussion. I'm getting lost in it. ;-) – Al Kepp Jan 17 '11 at 16:34
4

The concept of endianness only makes sense for scalar types represented by multiple bytes. By definition, a char is single-byte, so any integer type having the same size as char has no endianness.

Also keep in mind that endianness can be different for different types: for example, the PDP-11 is a little-endian 16-bit architecture, and 32-bit integers are built from two 16-bit little-endian values with the more significant word stored first, ending up mixed-endian (the value 0x0A0B0C0D is stored as the byte sequence 0B 0A 0D 0C).

Christoph
  • Your comment on Martin's answer seems to me the best explanation. The endianness in this system only has meaning as soon as we need to translate it to another system - whatever that translation step is can't be written in the same C environment - ergo the question is meaningless for this platform, even if, physically, the bytes/bits do have some underlying order that the translation layer can check. –  Jan 17 '11 at 15:55
  • endianness (aka byte order) is a concept independent of the notion of octets - historically, bytes come in various sizes (4-bit BCD, 6-bit graphic sets, 16-bit DSPs), and still today there are special-purpose devices with `CHAR_BIT != 8`; as long as there are multibyte scalar types, endianness is a well-defined concept; serialization to 8-bit devices is irrelevant from this point of view – Christoph Jan 17 '11 at 16:24
2

I suppose the question boils down to: Is endianness even a meaningful concept on this platform, i.e. does it have a detectable influence on the behaviour of a program?

If it doesn't, then endianness (i.e. the order in which the individual bytes making up a 32-bit quantity are stored in memory) is merely an implementation detail which you can't detect, but which needn't concern you either.

If endianness does have a detectable influence on the behaviour of a certain aspect of the language... well, then construct a test based on that behaviour. Here are some examples:

  • How does the addressing system on the platform work? If you increment an address by one, to how many bits does that correspond? If the answer is eight bits and the system allows you to read from addresses that aren't a multiple of four, then you can move a pointer forward by a single byte (via a detour through intptr_t) and test for endianness that way; see the first sketch after this list. (Yes, this is implementation-defined behaviour, but so is using a union to test for endianness, and so is the whole concept of endianness in general.) If, however, the smallest addressable unit of memory is 32 bits, then you can't test for endianness this way.

  • Does the platform have a 64-bit data type (a long long)? Then you can create a union of a long long and two ints and construct your endianness test based on that; see the second sketch after this list.
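A minimal sketch of the first approach, assuming (as described above) that incrementing an address by one corresponds to eight bits and that the misaligned read is tolerated by the hardware -- both implementation-defined at best:

unsigned buf[2] = { 1u, 0u };             /* two 32-bit "chars" on the hypothetical platform */
uintptr_t base = (uintptr_t)buf;          /* uintptr_t from <stdint.h> */
unsigned *p = (unsigned *)(base + 1);     /* one 8-bit step forward, if addresses work that way */

/* The read straddles buf[0] and buf[1]: on a little-endian layout the
   window contains only zero bytes, on a big-endian layout it does not. */
if (*p == 0) {
    /* little-endian */
} else {
    /* big-endian */
}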
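And a minimal sketch of the second approach, assuming sizeof(long long) == 2 * sizeof(int) with no padding bits -- true on the hypothetical platform if long long is 64 bits wide, but not something the standard promises:

union {
    long long ll;
    int i[2];
} u;

u.ll = 1;
if (u.i[0] == 1) {
    /* low-order half stored first: little-endian */
} else if (u.i[1] == 1) {
    /* high-order half stored first: big-endian */
}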

Martin B
  • Of course you can detect it. Set a 2 byte integer to 1 and see which of the bytes is non-zero. – David Heffernan Jan 17 '11 at 11:14
  • >> Let's say I'm on a platform where CHAR_BIT is 32, so sizeof(char) == sizeof(short) == sizeof(int) == sizeof(long). I believe this is still a standards-conformant environment. << The byte/char is exactly as big as an integer. – bestsss Jan 17 '11 at 11:18
  • @David Heffernan> you cannot see a "byte". There are no "bytes" in this C environment. :-) – Al Kepp Jan 17 '11 at 11:22
  • @Al Kepp: sure there are bytes - it's just that all integer types are single-byte... – Christoph Jan 17 '11 at 11:39
  • It seems to me that endianness must be a meaningful concept at some level, if this architecture is ever going to send information over the wire to another computer with standard-sized bytes. I'm curious if a program on this system could conceivably choose whether to send (what is to the other computer) "little endian" or "big endian", or if the receiving computer would need to do the translation. –  Jan 17 '11 at 11:52
  • @Joe: an architecture with 32-bit bytes can't do the necessary conversions to feed an 8-bit wire in C; it needs an external translation unit, and the endianness you derive from this is not really a property of the C execution environment – Christoph Jan 17 '11 at 12:07
0

Structure bitfields should still work properly on any platform, even this one:

union {
   int i;
   struct {
       int b1 : 8;
       int b2 : 8;
       int b3 : 8;
       int b4 : 8;
   } s;
} u;

u.i = 1;
if (u.s.b1 != 0)
   ....
SoapBox
  • Will not work, because compilers (at least the ones I've worked with) use the bytes for the bitfields in memory order. Thus, on both little-endian and big-endian platforms, `u.s.b1 == 1` will be true. (This solution will not work even with bitfields of 1 bit.) – Didier Trosset Jan 17 '11 at 11:26
  • The order of bits is machine dependent IIRC, so I'm not quite sure that would work – Hasturkun Jan 17 '11 at 11:28
0

The question doesn't offer enough info but w/ what it does offer I'd say: not possible.

bestsss
0

htonl should work:

#include <arpa/inet.h>   /* htonl() is declared here on POSIX systems */

if (htonl(1) == 1) {
   /* network order (big-endian) */
} else {
   /* little-endian */
}

(I can't see any reason not to implement htonl and friends in the usual fashion, even for this annoying hypothetical system -- though I'm not sure how much help they would turn out to be in practice.)

0

You could shift the variable in question and see which end the 1 shifts off of.

Sparr
-1

If your code relies on knowing whether the architecture is BE or LE, then it should have a safety net and refuse to compile on an unknown platform. Something like `#if !defined(ARCH_IS_LE) && !defined(ARCH_IS_BE)` followed by `#error unknown architecture` should do; a sketch follows below.
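A minimal sketch of that guard, assuming the build system defines ARCH_IS_LE or ARCH_IS_BE for known targets (the macro names are only illustrative):

#if !defined(ARCH_IS_LE) && !defined(ARCH_IS_BE)
#error "unknown architecture: define ARCH_IS_LE or ARCH_IS_BE"
#endif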

wilx