34

In this answer, zwol made this claim:

The correct way to convert two bytes of data from an external source into a 16-bit signed integer is with helper functions like this:

#include <stdint.h>

int16_t be16_to_cpu_signed(const uint8_t data[static 2]) {
    uint32_t val = (((uint32_t)data[0]) << 8) | 
                   (((uint32_t)data[1]) << 0);
    return ((int32_t) val) - 0x10000u;
}

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    uint32_t val = (((uint32_t)data[0]) << 0) | 
                   (((uint32_t)data[1]) << 8);
    return ((int32_t) val) - 0x10000u;
}

Which of the above functions is appropriate depends on whether the array contains a little endian or a big endian representation. Endianness is not the issue at question here, I am wondering why zwol subtracts 0x10000u from the uint32_t value converted to int32_t.

Why is this the correct way?

How does it avoid the implementation defined behavior when converting to the return type?

Since you can assume 2's complement representation, how would this simpler cast fail: return (uint16_t)val;

What is wrong with this naive solution:

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    return (uint16_t)data[0] | ((uint16_t)data[1] << 8);
}
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • The exact behavior when casting to `int16_t` is implementation-defined, so the naive approach isn't portable. – nwellnhof Mar 26 '20 at 09:56
  • @nwellnhof there is no cast to `int16_t` – M.M Mar 26 '20 at 09:57
  • The question in the title can't be answered without specifying which mapping to use – M.M Mar 26 '20 at 10:17
  • 4
    Both approaches rely on implementation defined behavior (converting an unsigned value to a signed type that can't represent the value). Eg. in the first approach, `0xFFFF0001u` can't be represented as `int16_t`, and in the second approach `0xFFFFu` can't be represented as `int16_t`. – Sander De Dycker Mar 26 '20 at 10:25
  • 1
    "Since you can assume 2's complement representation" [citation needed]. C89 and C99 certainly did not deny 1s complement and sign-magnitude representations. Q.v., https://stackoverflow.com/questions/12276957/are-there-any-non-twos-complement-implementations-of-c – Eric Towers Mar 26 '20 at 21:55
  • @EricTowers: `int16_t` and `int32_t` are mandated to use 2's complement representation without padding bits. We are in known territory :) – chqrlie Mar 26 '20 at 23:10
  • Specifying the behavior of conversions between `intxx_t` and `uintxx_t` would be a welcome improvement. I cannot imagine a downside for this. – chqrlie Mar 26 '20 at 23:12
  • Could use a table lookup `int16_t table[256][256];` – stark Mar 30 '20 at 15:04
  • Why are you shifting by zero? Just a regular cast will do. – Leslie Satenstein Apr 01 '20 at 01:43
  • @LeslieSatenstein: I assume zwol added a dummy shift by zero to underscore the symmetry of the operation. It is useless but harmless and does not cause the compiler to generate any code (except maybe with optimisations disabled). – chqrlie Apr 01 '20 at 07:13

7 Answers7

20

If int is 16-bit then your version relies on implementation-defined behaviour if the value of the expression in the return statement is out of range for int16_t.

However the first version also has a similar problem; for example if int32_t is a typedef for int, and the input bytes are both 0xFF, then the result of the subtraction in the return statement is UINT_MAX which causes implementation-defined behaviour when converted to int16_t.

IMHO the answer you link to has several major issues .

M.M
  • 138,810
  • 21
  • 208
  • 365
  • @idmean the question needs clarification before that can be answered, I have requested in a comment under the question but OP hasn't responded – M.M Mar 27 '20 at 08:04
  • 1
    @M.M: I edited the question the specify that endianness is not the issue. IMHO the issue zwol is trying to solve is the implementation defined behavior when converting to the destination type, but I agree with you: I believe he is mistaken as his method has other problems. How would you solve the implementation defined behavior efficiently? – chqrlie Mar 28 '20 at 12:32
  • @chqrlieforyellowblockquotes I wasn't referring to endianness specifically. Do you just want to put the exact bits of the two input octets into the `int16_t` ? – M.M Mar 28 '20 at 13:19
  • @M.M: yes, that's exactly the question. I wrote *bytes* but the correct word should indeed be *octets* as the type is `uchar8_t`. – chqrlie Mar 28 '20 at 13:35
  • @chqrlieforyellowblockquotes OK, then use one of the solutions that just writes the two bytes (e.g. memcpy, or assignment to `uint8_t`) – M.M Mar 28 '20 at 22:26
9

This should be pedantically correct and work also on platforms that use sign bit or 1's complement representations, instead of the usual 2's complement. The input bytes are assumed to be in 2's complement.

int le16_to_cpu_signed(const uint8_t data[static 2]) {
    unsigned value = data[0] | ((unsigned)data[1] << 8);
    if (value & 0x8000)
        return -(int)(~value) - 1;
    else
        return value;
}

Because of the branch, it will be more expensive than other options.

What this accomplishes is that it avoids any assumption on how int representation relates to unsigned representation on the platform. The cast to int is required to preserve arithmetic value for any number that will fit in target type. Because the inversion ensures top bit of 16-bit number will be zero, the value will fit. Then the unary - and subtraction of 1 apply the usual rule for 2's complement negation. Depending on platform, INT16_MIN could still overflow if it doesn't fit in the int type on the target, in which case long should be used.

The difference to the original version in the question comes at the return time. While the original just always subtracted 0x10000 and 2's complement let signed overflow wrap it to int16_t range, this version has the explicit if that avoids signed wrapover (which is undefined).

Now in practice, almost all platforms in use today use 2's complement representation. In fact, if the platform has standard-compliant stdint.h that defines int32_t, it must use 2's complement for it. Where this approach sometimes comes handy is with some scripting languages that don't have integer data types at all - you can modify the operations shown above for floats and it will give the correct result.

jpa
  • 10,351
  • 1
  • 28
  • 45
  • The C Standard specifically mandates that `int16_t` and any `intxx_t` and their unsigned variants must use 2's complement representation without padding bits. It would take a purposely perverse architecture to host these types and use another representation for `int`, but I guess the DS9K could be configured this way. – chqrlie Mar 27 '20 at 07:24
  • @chqrlieforyellowblockquotes Good point, I changed to use `int` to avoid the confusion. Indeed if the platform defines `int32_t` it must be 2's complement. – jpa Mar 27 '20 at 07:30
  • These types were standardized in C99 this way: **C99 7.18.1.1 Exact-width integer types** *The typedef name `intN_t `designates a signed integer type with width `N`, no padding bits, and a two’s complement representation. Thus, `int8_t` denotes a signed integer type with a width of exactly 8 bits.* Other representations are still supported by the standard, but for other integer types. – chqrlie Mar 27 '20 at 07:30
  • With your updated version, `(int)value` has implementation defined behavior if type `int` has just 16 bits. I'm afraid you need to use `(long)value - 0x10000`, but on non 2's complement architectures, the value `0x8000 - 0x10000` cannot be represented as a 16-bit `int`, so the problem stays. – chqrlie Mar 27 '20 at 07:32
  • @chqrlieforyellowblockquotes Yeah, just noticed the same, I fixed with ~ instead, but `long` would work equally well. – jpa Mar 27 '20 at 07:34
  • I think it's high time to deprecate non 2's complement integer representations from the C Standard. Who opposes that? – chqrlie Mar 27 '20 at 07:36
  • Deprecating it wouldn't change behavior of processors that support nothing else, and no-one in their right mind would use one's complement for anything new anyway. – jpa Mar 27 '20 at 07:38
  • Deprecating it from new versions of the Standard would not prevent maintainers of antique systems from using older compilers, which they all do already. – chqrlie Mar 27 '20 at 07:44
  • Your code should work for non 2's complement architectures, except for the undefined behavior when subtracting 1 from `INT_MIN` if `INT_MIN` is defined as `(-32767)`. Note however that such a platform might not define the type `uint8_t`, because `unsigned char` may have more than 8 bits (but cannot have padding bits). To improve portability, you could use `unsigned char[]` for the source and `long` for the cast and return value, which would be inefficient for most architectures :( – chqrlie Mar 28 '20 at 12:08
6

The arithmetic operators shift and bitwise-or in expression (uint16_t)data[0] | ((uint16_t)data[1] << 8) don't work on types smaller than int, so that those uint16_t values get promoted to int (or unsigned if sizeof(uint16_t) == sizeof(int)). Still though, that should yield the correct answer, since only the lower 2 bytes contain the value.

Another pedantically correct version for big-endian to little-endian conversion (assuming little-endian CPU) is:

#include <string.h>
#include <stdint.h>

int16_t be16_to_cpu_signed(const uint8_t data[2]) {
    int16_t r;
    memcpy(&r, data, sizeof r);
    return __builtin_bswap16(r);
}

memcpy is used to copy the representation of int16_t and that is the standard-compliant way to do so. This version also compiles into 1 instruction movbe, see assembly.

Maxim Egorushkin
  • 131,725
  • 17
  • 180
  • 271
  • 1
    @M.M One reason `__builtin_bswap16` exists is because byte-swapping in ISO C cannot be implemented as efficiently. – Maxim Egorushkin Mar 26 '20 at 10:21
  • 1
    Not true; the compiler could detect that the code implements byte swapping and translate it as an efficient builtin – M.M Mar 26 '20 at 10:24
  • @M.M I am willing to agree with that if you demonstrate an example. – Maxim Egorushkin Mar 26 '20 at 10:24
  • The built-in function is declared as `uint16_t __builtin_bswap16 (uint16_t x)` so the implicit conversion from `uint16_t` to signed `int16_t` still poses the same problem. – chqrlie Mar 26 '20 at 10:41
  • @chqrlieforyellowblockquotes What exactly is the problem in `int16_t` -> `uint16_t` -> `int16_t`? IIRC, this conversion is implementation-defined. – Maxim Egorushkin Mar 26 '20 at 10:43
  • 1
    Converting `int16_t` to `uint16_t` is well defined: negative values convert to values greater than `INT_MAX`, but converting these values back to `uint16_t` is implementation defined behavior: **6.3.1.3 Signed and unsigned integers** *1. When a value with integer type is converted to another integer type other than_Bool, if the value can be represented by the new type, it is unchanged.* ... *3. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.* – chqrlie Mar 26 '20 at 10:49
  • @MaximEgorushkin M.M is correct, newer (gcc >= 5 and clang >= 5) compilers are capable of pattern matching `|`/`<<`-based code into byteswaps https://stackoverflow.com/a/60337532/1084774 – Petr Skocik Mar 26 '20 at 11:15
  • @PSkocik Nope, none of those examples is as efficient as `movbe`. The examples in that link are `mov` followed by `bswap` instruction. And they also rely on `ntohl` which isn't ansi C. However, the compiler does recognise sequence `memcpy` followed by `__builtin_bswap` and generates most efficient code. – Maxim Egorushkin Mar 26 '20 at 11:18
  • 1
    @MaximEgorushkin gcc doesn't seem to do so good in the 16-bit version, but clang generates the same code for `ntohs`/`__builtin_bswap` and the `|`/`<<` pattern: https://gcc.godbolt.org/z/rJ-j87 – Petr Skocik Mar 26 '20 at 11:32
  • @PSkocik clang generates much better code with `movbe` in ansi C, I agree. gcc is struggling and needs `ntohs` or `__builtin_bswap16`. – Maxim Egorushkin Mar 26 '20 at 11:38
  • 1
    So far as I can tell, clang and gcc can only generate efficient code from sequences of shifts when targeting platforms which support unaligned loads and stores. When using popular embedded platforms like the Cortex-M0, when accessing storage which the programmer knows to be aligned, using the `-fno-strict-aliasing` extension is more efficient than any other technique I've seen. – supercat Mar 26 '20 at 19:18
  • 3
    @M.M: I think Maxim is saying "can't *in practice* with current compilers". Of course a compiler could not suck for once and recognize loading contiguous bytes into an integer. GCC7 or 8 did finally re-introduce load/store coalescing for cases where byte-reverse *isn't* needed, after GCC3 dropped it decades ago. But in general compilers tend to need help in practice with a lot of stuff that CPUs can do efficiently but which ISO C neglected / refused to portably expose. Portable ISO C is not a good language for efficient code bit / byte-manipulation. – Peter Cordes Mar 26 '20 at 20:35
  • Both the `memcpy` version and the explicit `return (unsigned)data[1] | ((unsigned)data[0] << 8);` compile to a single instruction: https://gcc.godbolt.org/z/kaprqk , but neither avoids the implementation defined behavior when converting the `uint16_t` (resp `unsigned`) to the return type `int16_t` for values greater than `0x7fff`. – chqrlie Mar 29 '20 at 09:38
6

Another method - using union:

union B2I16
{
   int16_t i;
   byte    b[2];
};

In program:

...
B2I16 conv;

conv.b[0] = first_byte;
conv.b[1] = second_byte;
int16_t result = conv.i;

first_byte and second_byte can be swapped according to little or big endian model. This method is not better but is one of alternatives.

i486
  • 6,491
  • 4
  • 24
  • 41
  • 2
    Isn't union type punning [unspecified behaviour](https://en.wikipedia.org/wiki/Type_punning#Use_of_union)? – Maxim Egorushkin Mar 26 '20 at 10:31
  • 2
    @MaximEgorushkin: Wikipedia is not an authoritative source for interpreting the C standard. – Eric Postpischil Mar 26 '20 at 11:46
  • 2
    @EricPostpischil Focusing on the messenger rather than the message is unwise. – Maxim Egorushkin Mar 26 '20 at 13:57
  • @Maxim: The text in that Wikipedia passage is not a message conveyed verbatim from an authoritative source. It is somebody’s opinion and rewording. – Eric Postpischil Mar 26 '20 at 14:01
  • @MaximEgorushkin: No, it isn't UB. C99 defines the behaviour of union type punning, unlike ISO C++ which still doesn't as of C++17. But ISO C doesn't strictly define struct/union layout, or guarantee that a `byte` type has 8 bits. But an implementation that has `int16_t` at all (2's complement exactly 16 bits, no padding) probably isn't weird. IDK if we care about word-addressable machines where `char` is a 16-bit type - the question might not even make sense there unless something unpacked bytes into one per `uint_least8_t` or something. – Peter Cordes Mar 26 '20 at 20:10
  • @PeterCordes You are quite right. However, I was taking about _unspecified_ rather than _undefined_ behaviour. As in (from most portable to least): defined/required, implementation-defined, unspecified, undefined. – Maxim Egorushkin Mar 26 '20 at 20:15
  • 1
    @MaximEgorushkin: oh yes, oops I misread your comment. Assuming `byte[2]` and `int16_t` are the same size, it is one or the other of the two possible orderings, not some arbitrary shuffled bitwise place values. So you can at least detect at compile time what endianness the implementation has. – Peter Cordes Mar 26 '20 at 20:29
  • @PeterCordes Right, and as I use C and C++ for performance only, I insist on the best possible assembly (on the platforms I target for): `mov` or `movbe`. – Maxim Egorushkin Mar 26 '20 at 21:00
  • @MaximEgorushkin: I think you meant to reply to my other comment, under your answer. But yes, agreed. – Peter Cordes Mar 26 '20 at 21:02
  • 1
    The standard clearly states that the value of the union member is the result of interpreting the stored bits in the member as a value representation of that type . There are implementation-defined aspects insofaras the representation of types is implementation-defined. – M.M Mar 26 '20 at 22:42
4

Here is another version that relies only on portable and well-defined behaviours (header #include <endian.h> is not standard, the code is):

#include <endian.h>
#include <stdint.h>
#include <string.h>

static inline void swap(uint8_t* a, uint8_t* b) {
    uint8_t t = *a;
    *a = *b;
    *b = t;
}
static inline void reverse(uint8_t* data, int data_len) {
    for(int i = 0, j = data_len / 2; i < j; ++i)
        swap(data + i, data + data_len - 1 - i);
}

int16_t be16_to_cpu_signed(const uint8_t data[2]) {
    int16_t r;
#if __BYTE_ORDER == __LITTLE_ENDIAN
    uint8_t data2[sizeof r];
    memcpy(data2, data, sizeof data2);
    reverse(data2, sizeof data2);
    memcpy(&r, data2, sizeof r);
#else
    memcpy(&r, data, sizeof r);
#endif
    return r;
}

The little-endian version compiles to single movbe instruction with clang, gcc version is less optimal, see assembly.

Maxim Egorushkin
  • 131,725
  • 17
  • 180
  • 271
  • @chqrlieforyellowblockquotes Your main concern seems to have been `uint16_t` to `int16_t` conversion, this version doesn't have that conversion, so here you go. – Maxim Egorushkin Mar 30 '20 at 15:15
2

I want to thank all contributors for theirs answers. Here is what the collective works boils down to:

  1. As per the C Standard 7.20.1.1 Exact-width integer types: types uint8_t, int16_t and uint16_t must use two's complement representation without any padding bits, so the actual bits of the representation are unambiguously those of the 2 bytes in the array, in the order specified by the function names.
  2. computing the unsigned 16 bit value with (unsigned)data[0] | ((unsigned)data[1] << 8) (for the little endian version) compiles to a single instruction and yields an unsigned 16-bit value.
  3. As per the C Standard 6.3.1.3 Signed and unsigned integers: converting a value of type uint16_t to signed type int16_t has implementation defined behavior if the value is not in the range of the destination type. No special provision is made for types whose representation is precisely defined.
  4. to avoid this implementation defined behavior, one can test if the unsigned value is larger than INT_MAX and compute the corresponding signed value by subtracting 0x10000. Doing this for all values as suggested by zwol may produce values outside the range of int16_t with the same implementation defined behavior.
  5. testing for the 0x8000 bit explicitly causes the compilers to produce inefficient code.
  6. a more efficient conversion without implementation defined behavior uses type punning via a union, but the debate regarding the definedness of this approach is still open, even at the C Standard's Committee level.
  7. type punning can be performed portably and with defined behavior using memcpy.

Combining points 2 and 7, here is a portable and fully defined solution that compiles efficiently to a single instruction with both gcc and clang:

#include <stdint.h>
#include <string.h>

int16_t be16_to_cpu_signed(const uint8_t data[2]) {
    int16_t r;
    uint16_t u = (unsigned)data[1] | ((unsigned)data[0] << 8);
    memcpy(&r, &u, sizeof r);
    return r;
}

int16_t le16_to_cpu_signed(const uint8_t data[2]) {
    int16_t r;
    uint16_t u = (unsigned)data[0] | ((unsigned)data[1] << 8);
    memcpy(&r, &u, sizeof r);
    return r;
}

64-bit Assembly:

be16_to_cpu_signed(unsigned char const*):
        movbe   ax, WORD PTR [rdi]
        ret
le16_to_cpu_signed(unsigned char const*):
        movzx   eax, WORD PTR [rdi]
        ret
chqrlie
  • 131,814
  • 10
  • 121
  • 189
  • I am not a language lawyer, but only `char` types can alias or contain the object representation of any other type. `uint16_t` isn't one of `char` types, so that `memcpy` of `uint16_t` to `int16_t` is not well-defined behaviour. The standard only requires `char[sizeof(T)] -> T > char[sizeof(T)]` conversion with `memcpy` to be well defined. – Maxim Egorushkin Apr 01 '20 at 03:24
  • `memcpy` of `uint16_t` to `int16_t` is implementation-defined at best, not portable, not well-defined, exactly as assignment of one to the other, and you cannot magically circumvent that with `memcpy`. It doesn't matter whether `uint16_t` uses two's complement representation or not, or padding bits are present or not - that isn't behaviour defined or required by C standard. – Maxim Egorushkin Apr 01 '20 at 03:43
  • With so many words, your "solution" boils down to replacing `r = u` to `memcpy(&r, &u, sizeof u)` but the latter is no better than the former, is it? – Maxim Egorushkin Apr 01 '20 at 03:53
0

Why not just use your "naive solution," but cast each element to int16_t instead of uint16_t?

int16_t le16_to_cpu_signed(const uint8_t data[static 2]) {
    return (int16_t)data[0] | ((int16_t)data[1] << 8);
}

Then you would not have to deal with casting unsigned ints to signed ints (and possibly being out of the signed int range).

Tyler Streeter
  • 1,214
  • 12
  • 14