58

(I guess this question could apply to many typed languages, but I chose to use C++ as an example.)

Why is there no way to just write:

struct foo {
    little int x;   // little-endian
    big long int y; // big-endian
    short z;        // native endianness
};

to specify the endianness for specific members, variables and parameters?

Comparison to signedness

I understand that the type of a variable not only determines how many bytes are used to store a value but also how those bytes are interpreted when performing computations.

For example, these two declarations each allocate one byte, and for both bytes, every possible 8-bit sequence is a valid value:

signed char s;
unsigned char u;

but the same binary sequence might be interpreted differently, e.g. 11111111 would mean -1 when assigned to s but 255 when assigned to u. When signed and unsigned variables are involved in the same computation, the compiler (mostly) takes care of proper conversions.
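For instance, this is easy to demonstrate (a minimal sketch; it prints -1 and 255 assuming two's complement, which C++20 mandates and virtually all earlier implementations use):

#include <cstring>
#include <iostream>

int main() {
    unsigned char pattern = 0xFF; // the bit sequence 11111111

    signed char s;
    unsigned char u;
    std::memcpy(&s, &pattern, 1); // same byte, two interpretations
    std::memcpy(&u, &pattern, 1);

    std::cout << static_cast<int>(s) << '\n'  // -1
              << static_cast<int>(u) << '\n'; // 255
}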

In my understanding, endianness is just a variation of the same principle: a different interpretation of a binary pattern based on compile-time information about the memory in which it will be stored.

It seems an obvious feature to have in a typed language that allows low-level programming. However, it is not part of C, C++ or any other language I know, and I did not find any discussion about this online.

Update

I'll try to summarize some takeaways from the many comments that I got in the first hour after asking:

  1. signedness is strictly binary (either signed or unsigned) and always will be, in contrast to endianness, which has two well-known variants (big and little) but also lesser-known variants such as mixed/middle-endian. New variants might be invented in the future.
  2. endianness matters when accessing multiple-byte values byte-wise. There are many aspects beyond just endianness that affect the memory layout of multi-byte structures, so this kind of access is mostly discouraged.
  3. C++ aims to target an abstract machine and minimize the number of assumptions about the implementation. This abstract machine does not have any endianness.

Also, now I realize that signedness and endianness are not a perfect analogy, because:

  • endianness only defines how something is represented as a binary sequence, but not what can be represented. Both big int and little int would have the exact same value range.
  • signedness defines how bits and actual values map to each other, but also affects what can be represented, e.g. -3 can't be represented by an unsigned char and (assuming that char has 8 bits) 130 can't be represented by a signed char.

So changing the endianness of some variables would never change the behavior of the program (except for byte-wise access), whereas a change of signedness usually would.

Farzad Karimi
Lena Schimmel
  • 53
    Because C++ describes the behaviour of an abstract machine which has no concept of endianness? – YSC Nov 28 '17 at 12:04
  • 1
    I don't think _"low-level programming"_ as used in the context of C/C++ means what you think it means. For instance, you assume a `char` has 8 bits. That's only a minimum, not a fixed requirement. You also assume the signed version is 2's complement, but that's not required at all. C/C++ are only "low level" to the extent that's useful for spec-writers and practical for implementors; beyond that, they can - and must - make as much use of abstraction as any other language. Most programmers will never know or care about endianness, sign representation, etc.; they just want to make things happen. – underscore_d Nov 28 '17 at 12:05
  • 1
    If you do want to play around with endianness, you can check out Boost's [Endian](http://www.boost.org/doc/libs/1_59_0/libs/endian/doc/index.html) library – AndyG Nov 28 '17 at 12:08
  • 19
    Ok, I'm going to introduce a new concept of Endianness - I'm calling it 'reverse Big Endian' in that it's big Endian, but the bit order is reversed, not the byte order. You want the whole language to change just for my new architecture? – UKMonkey Nov 28 '17 at 12:11
  • @UKMonkey: you cannot address bits in C++, so I don't really see where your reverse big endian differs from big endian. – geza Nov 28 '17 at 12:29
  • I'd say it would have limited usefulness. Endian conversions are only needed very rarely; there's no real need for them in the language. – geza Nov 28 '17 at 12:31
  • 15
    @geza UKMonkey was being sarcastic. His point: endianness depends on the architecture and everybody, including UKMonkey on acid, can design a new architecture. The C++ language should not take SO users on acid into consideration. – YSC Nov 28 '17 at 12:34
  • 1
    Possible duplicate of [Is C Endian neutral?](https://stackoverflow.com/questions/35371745/is-c-endian-neutral) – underscore_d Nov 28 '17 at 12:35
  • @YSC: Unnecessary sarcasm. We haven't had many endiannesses to choose from in the last 20 years. – geza Nov 28 '17 at 12:38
  • 7
    I can't see how this is an obvious feature. What problem would it solve? – molbdnilo Nov 28 '17 at 12:39
  • 8
    I think it's safe to say that the concept of a sign could be considered abstract, while endianness is very much implementation-specific. A better comparison would be with alignment specifications, I reckon. – StoryTeller - Unslander Monica Nov 28 '17 at 12:42
  • 4
    @geza any time someone has made an assumption about what will happen in the future, they've been burnt. Y2K date stamp was a perfect example of what happens when you say "but it works for now". With quantum computers constantly improving now, how can you possibly predict how they'll want to store data? – UKMonkey Nov 28 '17 at 12:46
  • @molbdnilo: it is an obvious feature. The standard could have defined a (maybe optional) feature for how the underlying bytes should be laid out. It doesn't affect the language too much. It could help processing data coming from another endianness. And I bet that if this feature were needed a lot, there would be support for it. – geza Nov 28 '17 at 12:47
  • @UKMonkey: so? How are these relevant? I'm just saying that we could have support for "big_endian"/"little_endian", which specifies how a number should be stored in memory. It is a platform-neutral thing; it can be implemented on all machines. – geza Nov 28 '17 at 12:49
  • @geza How is it "platform-neutral"? Endianness is predicated on the existence of bytes, i.e. data being divided into defined units. What if they bring out a ternary architecture that doesn't have bytes, just strings of bits, separated by a new 3rd digit? How can one talk about, much less implement endianness in that case? Doing so would be a ball and chain for the design and evolution of the language into other spaces. – underscore_d Nov 28 '17 at 12:53
  • @underscore_d: Would that machine be compatible with C++ at all? If yes, then I think that with bit operations, it could process current little/big endian data. – geza Nov 28 '17 at 12:58
  • @geza Sure, it could, but that would be requiring the language to support specific representations of data, which I thought was the thing we were trying to avoid by defining only an abstract machine - not to mention the fact that we would then be requiring it to support specific _and non-native_ representations, which is another bridge beyond that. – underscore_d Nov 28 '17 at 13:00
  • 2
    @underscore_d: yes, exactly. On the other hand, when we need to handle actual low-level stuff, this abstract machine gets in the way. So C++ could have support for this; it would not hurt at all. But as I've said, this feature is rarely needed, so it isn't worth it (in my opinion). If this were an often-used feature, C++ compilers could have it as an extension. But no C++ compiler I know of has implemented this. – geza Nov 28 '17 at 13:05
  • 2
    @geza I see your point. I don't feel that the abstract machine concept really gets in the way though; in all cases I've seen, it's near-trivial to code our own routines to read and write data with specific endiannesses or any other implementation details, and I don't think that small amount of work for the relatively few programmers who need such things justifies the hassle it would incur on the committee, vendors, etc. – underscore_d Nov 28 '17 at 13:09
  • @underscore_d: absolutely, I agree with you (basically that's what I tried to say). – geza Nov 28 '17 at 13:12
  • Wow, this led to more forum-like discussion than I had anticipated. Maybe this is about the boundary between hardware specifics that the language / compiler hide completely from the programmer, and details that cannot be hidden and therefore need some representation in the language. I think I understand why signedness falls on one side of that boundary and endianness falls on the other. – Lena Schimmel Nov 28 '17 at 13:30
  • @LenaSchimmel In what sense is the representation of signedness any less of an abstract, opaque implementation detail? (Of course, I exclude the optional `[u]intN_t` exact-width types, which must be 2's complement, _if_ they exist) – underscore_d Nov 28 '17 at 13:36
  • @underscore_d related: [How does std::cout print negative zero in a ones-complement system?](https://stackoverflow.com/q/33151068/5470596). Sometimes, the standard is unclear about where it stops to specify the representation of a signed integer. – YSC Nov 28 '17 at 13:40
  • I've read your update. In my opinion, this is informative enough to be an answer. I'll upvote it. – YSC Nov 28 '17 at 15:09
  • @geza: Why can't you address bits in C? Admittedly I'd rather use C for that sort of thing, but doesn't C++ support all of C's bitwise operators? Including bit fields, apparently: https://stackoverflow.com/questions/4240974/when-is-it-worthwhile-to-use-bit-fields – jamesqf Nov 28 '17 at 18:58
  • @jamesqf: "reverse big endian" only would make sense on a machine, where you can address individual bits. Bitwise operators are not for addressing bits. "Addressing bits" means "give me the **indexth** bit". For example, if you write 128 into a byte, "0th bit" is 1 for a "reverse big endian" machine, and 0 for a "normal" machine (or the other way around). – geza Nov 28 '17 at 19:07
  • @geza: Getting a bit off track, but IIRC back in my bit-fiddling days I used to do exactly that, including swapping stuff around depending on the endianness of the machine. – jamesqf Nov 28 '17 at 23:33
  • I don't know if this contributes to the discussion(s) but it hasn't been mentioned anywhere on this page. There is some proposal about endianness by Howard Hinnant : http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0463r1.html – engf-010 Nov 29 '17 at 02:14
  • 3
    C has signed and unsigned integers because PDP-11 assembly supports signed and unsigned integer instructions. C doesn't support multiple byte orders because PDP-11 assembly doesn't support multiple byte orders. The abstract machine is a model of some very concrete machines –  Nov 29 '17 at 05:19
  • @molbdnilo : Problems like this one: https://stackoverflow.com/questions/6732127/is-there-a-way-to-enforce-specific-endianness-for-a-c-or-c-struct – vsz Nov 29 '17 at 10:56
  • 2
    I'd love to have types like `big uint16_t`: This would allow sending structs over a network without worrying about portability. No need to serialize the struct first, just let the compiler do its thing. – cmaster - reinstate monica Nov 29 '17 at 12:24
  • @LenaSchimmel are you still unsatisfied with any answer? If it is so, maybe you should suggest improvements or sub-problems still to be answered. – YSC Apr 08 '18 at 22:28

9 Answers

53

What the standard says

[intro.abstract]/1:

The semantic descriptions in this document define a parameterized nondeterministic abstract machine. This document places no requirement on the structure of conforming implementations. In particular, they need not copy or emulate the structure of the abstract machine. Rather, conforming implementations are required to emulate (only) the observable behavior of the abstract machine as explained below.

C++ could not define an endianness qualifier since it has no concept of endianness.

Discussion

About the difference between signedness and endianness, the OP wrote:

In my understanding, endianness is just a variation of the same principle [(signedness)]: a different interpretation of a binary pattern based on compile-time information about the memory in which it will be stored.

I'd argue signedness has both a semantic and a representational aspect1. What [intro.abstract]/1 implies is that C++ only cares about the semantics, and never addresses the way a signed number should be represented in memory2. In fact, "sign bit" appears only once in the C++ specs and refers to an implementation-defined value.
Endianness, on the other hand, has only a representational aspect: endianness conveys no meaning.

With C++20, std::endian appears. It is still implementation-defined, but it lets us test the endianness of the host without depending on old tricks based on undefined behaviour.
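A minimal sketch of using it (C++20, header <bit>):

#include <bit>
#include <iostream>

int main() {
    // std::endian::native is a compile-time constant describing the host.
    if constexpr (std::endian::native == std::endian::little)
        std::cout << "little-endian host\n";
    else if constexpr (std::endian::native == std::endian::big)
        std::cout << "big-endian host\n";
    else
        std::cout << "mixed-endian host\n";
}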


1) Semantic aspect: a signed integer can represent values below zero; representational aspect: one needs, for example, to reserve a bit to convey the positive/negative sign.
2) In the same vein, C++ never describes how a floating-point number should be represented. IEEE-754 is often used, but this is a choice made by the implementation, not something enforced by the standard: [basic.fundamental]/8 "The value representation of floating-point types is implementation-defined".

YSC
36

In addition to YSC's answer, let's take your sample code and consider what it might aim to achieve:

struct foo {
    little int x;   // little-endian
    big long int y; // big-endian
    short z;        // native endianness
};

You might hope that this would exactly specify the layout for architecture-independent data interchange (file, network, whatever).

But this can't possibly work, because several things are still unspecified:

  • data type size: you'd have to use little int32_t, big int64_t and int16_t respectively, if that's what you want
  • padding and alignment, which cannot be controlled strictly within the language: use #pragma or __attribute__((packed)) or some other compiler-specific extension (see the sketch after this list)
  • actual format (1s- or 2s-complement signedness, floating-point type layout, trap representations)
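For instance, pinning down even the first two points already requires stepping outside the standard language (a sketch assuming a GCC/Clang-style packing attribute; the hypothetical little/big qualifiers are left out because they don't exist, and wire_foo is an illustrative name):

#include <cstdint>

// Sizes pinned by fixed-width types; padding suppressed by a
// compiler-specific extension. Byte order remains unspecified and
// would still have to be handled separately.
struct __attribute__((packed)) wire_foo {
    std::int32_t x;
    std::int64_t y;
    std::int16_t z;
};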

Alternatively, you might simply want to reflect the endianness of some specified hardware - but big and little don't cover all the possibilities here (just the two most common).

So, the proposal is incomplete (it doesn't distinguish all reasonable byte-ordering arrangements), ineffective (it doesn't achieve what it sets out to), and has additional drawbacks:

  • Performance

    Changing the endianness of a variable from the native byte ordering should either disable arithmetic, comparisons etc (since the hardware cannot correctly perform them on this type), or must silently inject more code, creating natively-ordered temporaries to work on.

    The argument here isn't that manually converting to/from native byte order is faster, it's that controlling it explicitly makes it easier to minimise the number of unnecessary conversions, and much easier to reason about how code will behave, than if the conversions are implicit.

  • Complexity

    Everything overloaded or specialized for integer types now needs twice as many versions, to cope with the rare event that it gets passed a non-native-endianness value. Even if that's just a forwarding wrapper (with a couple of casts to translate to/from native ordering), it's still a lot of code for no discernible benefit.

The final argument against changing the language to support this is that you can easily do it in code. Changing the language syntax is a big deal, and doesn't offer any obvious benefit over something like a type wrapper:

#include <algorithm>  // std::reverse
#include <cstring>    // std::memcpy

// store T with reversed byte order
template <typename T>
class Reversed {
    T val_;
    static T reverse(T t) {  // portable byte swap; platform intrinsics may be faster
        unsigned char bytes[sizeof(T)];
        std::memcpy(bytes, &t, sizeof t);
        std::reverse(bytes, bytes + sizeof t);
        std::memcpy(&t, bytes, sizeof t);
        return t;
    }
public:
    explicit Reversed(T t) : val_(reverse(t)) {}
    Reversed(Reversed const &other) : val_(other.val_) {}
    // assignment, move, arithmetic, comparison etc. etc.
    operator T () const { return reverse(val_); }
};
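A hypothetical usage sketch (assuming the elided members above are filled in; header and payload_bytes are illustrative names):

#include <cstdint>

// Fields stored byte-reversed relative to native order, e.g. for a
// fixed-endianness file format on a host of the opposite endianness.
struct header {
    Reversed<std::uint32_t> length;
    Reversed<std::uint16_t> type;
};

std::uint32_t payload_bytes(const header &h) {
    return h.length; // operator T() converts back to native order here
}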
Useless
  • 3
    I don't really see how Performance is an argument. If you need to use another endianness, then you do it for a reason. If there is no in-language support for it, then one needs to program it. Performance will be equal. Or even, the in-language version will be faster, as the compiler can implement optimized code for endian conversions. About complexity: yes, if it were a new type, then it's complex. But what if the type is the same? Just like a const qualifier. It could work without serious complexities. – geza Nov 28 '17 at 18:02
  • 1
    In particular, note that there have been real machines for which 4-byte integers have been neither big- nor little-endian (the PDP-11's mixed endianness, and Prime S-mode, where a 4-byte integer was 31 bits; there may be others) – Martin Bonner supports Monica Nov 28 '17 at 18:10
  • 2
    @MartinBonner: but why would this fact stop C++ from having little/big endian qualifiers? It would help people interacting with these representations. It doesn't mean that we must have all possible integer representations in the language. If you want to interpret a PNG picture on a PDP-11, then you have to write code for reading big-endian numbers. Instead of this, you could just have `big_endian int x;`, and the compiler would generate the code for you. – geza Nov 28 '17 at 18:15
  • I think it'd be hard to make a persuasive case for extending the core language to add explicit support for a non-exhaustive group of architecture details. Compare atomics - choosing two of the most popular memory orders and ignoring other genuinely-used ones would undermine the point of standardising them in the first place. – Useless Nov 28 '17 at 22:19
  • Of course that's not a reason why C++ _couldn't_ have these qualifiers, but it's a reason why it isn't a very convincing idea. You can already do everything required in a platform-specific way, so there's no real benefit to standardising unless you genuinely cover the whole problem space. I think you're under-stating the work involved, over-stating the likely benefits, and frankly just being argumentative. – Useless Nov 28 '17 at 22:23
  • At least, the class Reversed could be proposed for inclusion in the standard library with better integration. It is not as easy as it seems to make a class type behave like a built-in type; there are many pitfalls to avoid (it should be trivial and standard-layout; care should be taken so that it is always promoted to a sufficiently sized integer if possible, but arithmetic operators on two `Reversed` shall still produce a `Reversed`, etc.). Moreover, the function `reverse` must be implemented in assembly to be efficient. And most importantly, how to integrate it with streams (binary/formatted)? – Oliv Nov 29 '17 at 11:22
  • I don't get this obsession (admittedly from geza rather than yourself) with building it into the language. Byte order is platform-specific, most platforms already have efficient intrinsics for changing it, it isn't broadly useful without also standardising platform-specific control over structure layout etc. etc. This is a big change adding lots of complexity to the type system for something you can already do. – Useless Nov 29 '17 at 11:27
  • 1
    If C++ retracted its choice not to bother with the representation of values, it would have to define not only endianness, but also the representation of floating-point numbers, signed integers, struct memory layout, ... That's a whole lot! – YSC Nov 29 '17 at 13:15
  • @Useless Standardized library functions exist for a byte order used in multiple standardized wire formats, so there is precedent for standardizing features to support one specific wire format. So the argument that it isn't an exhaustive list of formats used by CPU architectures doesn't seem like a valid argument to me. There is also precedent for extending the type system with features to control the layout of data structures, such as bitfields and packed structs. I am not sure if those are standardized though. – kasperd Nov 29 '17 at 20:24
  • Which functions are those? And which part of the type system controls packing? – Useless Nov 29 '17 at 23:00
4

Integers (as a mathematical concept) have the concept of positive and negative numbers. This abstract concept of sign has a number of different implementations in hardware.

Endianness is not a mathematical concept. Little-endian is a hardware implementation trick to improve the performance of multi-byte twos-complement integer arithmetic on a microprocessor with 16 or 32 bit registers and an 8-bit memory bus. Its creation required using the term big-endian to describe everything else that had the same byte-order in registers and in memory.

The C abstract machine includes the concept of signed and unsigned integers, without details -- without requiring two's-complement arithmetic, 8-bit bytes, or a particular way to store a binary number in memory.

PS: I agree that binary data compatibility on the net or in memory/storage is a pain.

2

That's a good question, and I have often thought something like this would be useful. However, you need to remember that C aims for platform independence, and endianness only matters when a structure like this is converted into an underlying memory layout. This conversion can happen, for example, when you cast a uint8_t buffer into an int. While an endianness modifier looks neat, the programmer still needs to consider other platform differences, such as int sizes and structure alignment and packing. For defensive programming, when you want fine-grained control over how some variables or structures are represented in a memory buffer, it is best to write explicit conversion functions and then let the compiler's optimiser generate the most efficient code for each supported platform.
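For instance, a minimal sketch of such an explicit conversion function (load_be32 is an illustrative name):

#include <cstdint>

// Read a 32-bit big-endian value from a byte buffer. Written in terms
// of values rather than memory layout, so it is correct on hosts of any
// endianness; a good optimiser typically reduces it to a single load
// (plus a byte swap where needed).
std::uint32_t load_be32(const std::uint8_t *p) {
    return (std::uint32_t(p[0]) << 24) |
           (std::uint32_t(p[1]) << 16) |
           (std::uint32_t(p[2]) <<  8) |
            std::uint32_t(p[3]);
}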

D Dowling
  • That's not a bad answer, and it provides a point of view complementary to mine. But it could be reworked a bit. – YSC Nov 28 '17 at 12:18
  • 1
    _"This conversion can happen when you cast a uint8_t buffer into an int for example."_ Just casting is undefined behaviour due to violating aliasing rules; a `memcpy()` is the only well-defined way to perform that conversion. Then, yes, whether the result is meaningful does depend on the corresponding byte layouts of the source buffer and destination type. – underscore_d Nov 28 '17 at 12:34
2

Short Answer: if it should not be possible to use these objects in arithmetic expressions (without overloaded operators) involving ints, then they should not be integer types in the first place. And there is no point in allowing addition and multiplication of big-endian and little-endian ints in the same expression.

Longer Answer:

As someone mentioned, endianness is processor-specific. Which really means that this is how numbers are represented when they are used as numbers in the machine language (as addresses and as operands/results of arithmetic operations).

The same is "sort of" true of signedness. But not to the same degree. Conversion from language-semantic signedness to processor-accepted signedness is something that needs to be done to use numbers as numbers. Conversion from big-endian to little-endian and the reverse is something that needs to be done to use numbers as data (send them over the network, or represent metadata about data sent over the network, such as payload lengths).

Having said that, this decision appears to be mostly driven by use cases. The flip side is that there is a good pragmatic reason to ignore certain use cases. The pragmatism arises out of the fact that endianness conversion is more expensive than most arithmetic operations.

If a language had semantics for keeping numbers as little-endian, it would allow developers to shoot themselves in the foot by forcing little-endianness of numbers in a program which does a lot of arithmetic. If developed on a little-endian machine, this enforced endianness would be a no-op. But when ported to a big-endian machine, there would be a lot of unexpected slowdowns. And if the variables in question were used both for arithmetic and as network data, it would make the code completely non-portable.

Not having these endian semantics, or forcing them to be explicitly compiler-specific, forces developers to go through the mental step of thinking of the numbers as being "read" or "written" to/from the network format. This makes code which converts back and forth between network and host byte order in the middle of arithmetic operations cumbersome, and less likely to be the preferred style of a lazy developer.
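For instance, with the POSIX byte-order functions the conversion is an explicit, visible boundary in the code (a sketch; next_sequence is an illustrative name):

#include <arpa/inet.h> // POSIX htonl/ntohl
#include <cstdint>

std::uint32_t next_sequence(std::uint32_t wire_seq) {
    std::uint32_t seq = ntohl(wire_seq); // wire to host: now a number
    ++seq;                               // arithmetic in host order only
    return htonl(seq);                   // host to wire: data again
}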

And since development is a human endeavor, making bad choices uncomfortable is a Good Thing(TM).

Edit: here's an example of how this can go badly: Assume that little_endian_int32 and big_endian_int32 types are introduced. Then little_endian_int32(7) % big_endian_int32(5) is a constant expression. What is its result? Do the numbers get implicitly converted to the native format? If not, what is the type of the result? Worse yet, what is the value of the result (which in this case should probably be the same on every machine)?

Again, if multi-byte numbers are used as plain data, then char arrays are just as good. Even if they are "ports" (which are really lookup values into tables or their hashes), they are just sequences of bytes rather than integer types (on which one can do arithmetic).

Now if you limit the allowed arithmetic operations on explicitly-endian numbers to only those operations allowed for pointer types, then you might have a better case for predictability. Then myPort + 5 actually makes sense even if myPort is declared as something like little_endian_int16 on a big endian machine. Same for lastPortInRange - firstPortInRange + 1. If the arithmetic works as it does for pointer types, then this would do what you'd expect, but firstPort * 10000 would be illegal.
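A sketch of what such a restricted type might look like (hypothetical; for brevity the byte swap assumes a little-endian host):

#include <cstdint>

class be_uint16 {
    std::uint16_t be_; // stored big-endian
    static std::uint16_t swap(std::uint16_t v) {
        return std::uint16_t((v << 8) | (v >> 8));
    }
public:
    explicit be_uint16(std::uint16_t host) : be_(swap(host)) {}
    be_uint16 operator+(int n) const {      // offsetting, like ptr + n
        return be_uint16(std::uint16_t(swap(be_) + n));
    }
    int operator-(be_uint16 other) const {  // distance, like ptr - ptr
        return int(swap(be_)) - int(swap(other.be_));
    }
    // no operator*: firstPort * 10000 would not compile
};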

Then, of course, you get into the argument of whether the feature bloat is justified by any possible benefit.

Dmitry Rubanovich
2

Endianness is not inherently a part of a data type but rather of its storage layout.

As such, it would not be really akin to signed/unsigned but rather more like bit field widths in structs. Similar to those, they could be used for defining binary APIs.

So you'd have something like

int ip : big 32;

which would define both storage layout and integer size, leaving it to the compiler to do the best job of matching use of the field to its access. It's not obvious to me what the allowed declarations should be.
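For comparison, today's bit-fields already let a declaration pin a width in a similar syntax, though not a byte order, and their layout remains implementation-defined (a sketch with illustrative field names):

struct packet {
    unsigned version    : 4;
    unsigned header_len : 4;
    unsigned total_len  : 16;
};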

1

From the perspective of a pragmatic programmer searching Stack Overflow, it's worth noting that the spirit of this question can be answered with a utility library. Boost has such a library:

http://www.boost.org/doc/libs/1_65_1/libs/endian/doc/index.html

The feature of the library most like the language feature under discussion is a set of arithmetic types such as big_int16_t.
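A sketch against that library (record and next_offset are illustrative names):

#include <boost/endian/arithmetic.hpp>
#include <cstdint>

struct record {
    boost::endian::big_int16_t  count;  // 2 bytes, big-endian storage
    boost::endian::big_uint32_t offset; // 4 bytes, big-endian storage
};

std::uint32_t next_offset(const record &r) {
    return r.offset + 4u; // converts to native order for the arithmetic
}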

Peter
0

Because nobody has proposed it for the standard, and/or because compiler implementers have never felt a need for it.

Maybe you could propose it to the committee. I do not think it would be difficult to implement in a compiler: compilers already provide fundamental types that are not fundamental types of the target machine.

The development of C++ is an affair of all C++ coders.

@Schimmel: Do not listen to people who justify the status quo! All the cited arguments to justify this absence are more than fragile. A student logician could find their inconsistency without knowing anything about computer science. Just propose it, and don't care about pathological conservatives. (Advice: propose new types rather than a qualifier, because the unsigned and signed keywords are considered mistakes.)

Ronny Brendel
Oliv
  • 2
    It would be a surprise if such a proposal succeeded. Endianness issues are very rare and can easily be solved by utility functions. There's no real need for them in the language. – geza Nov 28 '17 at 18:23
  • @geza Rejected because people think it is not useful would indeed be a good reason to reject it. This is always the same debate about C++: the usefulness of what the language provides. C programmers have their opinion about that, C++ coders another. I think the latter group is more focused on productivity. – Oliv Nov 28 '17 at 18:45
  • 1
    The only way to get something into the C standard is to point to a successful implementation in an existing compiler, so the way to start is to get gcc/clang/someone to implement it. Then ping the committee. – pipe Nov 28 '17 at 19:44
  • @pipe, Stroustrup's "SELL" paper advocates for libraries which add semantics to languages as tests of whether new semantic constructs should be included into the language (rather than custom language extensions built into compiler implementations). – Dmitry Rubanovich Nov 29 '17 at 02:24
  • @Oliv, are you sure it hasn't been proposed? I would expect it would get rejected because it would enable a lot of code which would complicate optimization. Any code which has network-byte order integers involved in computations would require either a lot of conversions or optimizer which would remove those conversions. The way types are right now, it is cumbersome to write code which would mix host-byte and net-byte ints in the same computations. Having net-byte ints become basic types would make it too easy. – Dmitry Rubanovich Nov 29 '17 at 02:47
  • @DmitryRubanovich I have no idea if it has been proposed, but if it was rejected, that does not mean it will be again; the context has changed, and it seems the committee has recognized that networking is crucial now! Types with non-fundamental endianness would need to be promoted; this promotion would indeed involve one machine instruction (ROR on x86). But this is already the case for bitfields, which are (or should be) extracted using BEXTR. – Oliv Nov 29 '17 at 10:43
  • @DmitryRubanovich I have been surprised by optimizers: they are unable to simplify, to single instructions, code that is actually performing some common bit manipulation, like counting the number of trailing zero bits (BSF). The conclusion is that we must always write entire functions in assembly. That is proof that there are holes in the C/C++ languages. – Oliv Nov 29 '17 at 10:55
  • @Oliv, re: "Types with non-fundamental endianness would need to be promoted". I assume you mean non-native endianness when you say non-fundamental. Please correct me if I am wrong about that. Please see the part of my answer https://stackoverflow.com/a/47543039/1219722 after the "**Edit**". Constant expressions have to be evaluated at compile time. How would the constant-expression issues I brought up be resolved in a portable way? Because if results of constant expressions (in which all data sizes are well-defined rather than impl.-defined) are not portable, this would be a step back. – Dmitry Rubanovich Nov 30 '17 at 07:15
  • @DmitryRubanovich No, I meant "fundamental"; in any case, the term "native" is not defined in the C++ standard, while the term "fundamental type" is. I have read your answer. The point is to define types whose set of values is the same as the set of values of a fundamental arithmetic type, but with a representation (sequence of unsigned char) which is portable. By portable I mean a portable (communicable) value: I send the representation of a value A as a stream of `unsigned char` to a second machine, and this second machine must interpret this sequence as the value A. – Oliv Nov 30 '17 at 08:20
  • @DmitryRubanovich Look at the `class Reversed` in Useless's answer and think about what would happen. On one machine, only the value is important. But for communication we must establish a standardized representation, as in the real world. – Oliv Nov 30 '17 at 08:30
  • @Oliv, I am aware of the upside. But the question wasn't "what are the advantages of... ?" It was "why isn't it there?" So the natural place to look is the downside. The reversed class in that answer would not produce consistent results for the integer modulo operator `%` for constant expressions (evaluated at compile time). Integer types must produce consistent results when doing arithmetic. Anything sent over the wire is just data. It may as well be bytes. – Dmitry Rubanovich Nov 30 '17 at 09:02
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/160174/discussion-between-oliv-and-dmitry-rubanovich). – Oliv Nov 30 '17 at 09:29
  • _"The conclusion is that, we must always write entire functions in assembly."_ is it sarcasm o0? – YSC Nov 30 '17 at 09:36
  • @YSC Yes, I am really disappointed to have to write any piece of code in assembly! But often I have no other choice. And I have to write much more than one instruction in asm. I am really frustrated by the compiler not using even 1% of the instructions, or not being able to recognize basic patterns that can be simplified to one instruction or a few optimal ones. Believing that "the compiler will be able to use the CPU optimally" was a purely utopian thought; we should revisit this ALGOL paradigm and provide the language a proper way to plug asm into C++. – Oliv Nov 30 '17 at 14:09
  • @Oliv, that's strange; I've been astonished by the quality of GCC optimizations more than once. If the compiler _does_ produce low-quality binaries, we're f*cked anyway: all of our inference and high-level tools (std containers, algorithms, etc.) depend heavily on the ability of the compiler to optimize them away. – YSC Nov 30 '17 at 14:19
  • @YSC For example, the worst trouble with GCC optimization is its inability to transform a switch-case into a table lookup. But OK, clang does, and it can be implemented without assembly. Just try to write an algorithm that computes the integral part of log2 of an int: you just need to subtract from 31 the number of leading zero bits. In assembly it is 2 instructions, but neither GCC nor clang produces them. And it is worse if you try to use the fact that an instruction can set the target operand to zero in some cases, or the zero flag in others, like BSF. – Oliv Nov 30 '17 at 16:46
  • @YSC But for the purpose of this question, both GCC and Clang recognize a rotation when it is expressed in term of shifts and bitwise or. – Oliv Nov 30 '17 at 16:54
-1

Endianness is compiler-specific as a result of being machine-specific, not as a support mechanism for platform independence. The standard is an abstraction that has no regard for imposing rules that make things "easy" -- its task is to create enough similarity between compilers that the programmer can create "platform independence" for their code, if they choose to do so.

Initially, there was a lot of competition between platforms for market share, and compilers were most often written as proprietary tools by microprocessor manufacturers to support operating systems on specific hardware platforms. Intel was likely not very concerned about writing compilers that supported Motorola microprocessors.

C was -- after all -- invented by Bell Labs to rewrite Unix.

jinzai