Signed extension from 24 bit to 32 bit in C++

Question

I have 3 unsigned bytes that are coming over the wire separately.

[byte1, byte2, byte3]

I need to convert these to a signed 32-bit value but I am not quite sure how to handle the sign of the negative values.

I thought of copying the bytes to the upper 3 bytes in the int32 and then shifting everything to the right but I read this may have unexpected behavior.

Is there an easier way to handle this?

The representation is using two's complement.

harold · Accepted Answer · 2017-03-01T16:01:43.077

11

You could use:

uint32_t sign_extend_24_32(uint32_t x) {
    const int bits = 24;
    uint32_t m = 1u << (bits - 1);
    return (x ^ m) - m;
}

This works because:

if the old sign was 1, then the XOR makes it zero and the subtraction will set it and borrow through all higher bits, setting them as well.
if the old sign was 0, the XOR will set it, the subtract resets it again and doesn't borrow so the upper bits stay 0.
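To make the two cases concrete, here is a minimal sketch that packs three wire bytes and applies the same XOR/subtract step. The helper `pack24` and the assumption that the most significant byte comes first are illustrative only and are not part of the answer; the sketch also assumes C++14 or later.

```cpp
#include <cstdint>

// Hypothetical helper: pack three bytes (most significant first, an assumption)
// into the low 24 bits of a 32-bit value.
constexpr std::uint32_t pack24(std::uint8_t hi, std::uint8_t mid, std::uint8_t lo) {
    return (std::uint32_t{hi} << 16) | (std::uint32_t{mid} << 8) | std::uint32_t{lo};
}

// Same XOR/subtract trick as above, kept in unsigned arithmetic so that
// the wraparound on subtraction is well defined.
constexpr std::uint32_t sign_extend_24_32(std::uint32_t x) {
    constexpr std::uint32_t m = std::uint32_t{1} << 23;  // 0x00800000, the 24-bit sign bit
    return (x ^ m) - m;
}

// Sign bit set: 0xFFFFFE ^ m = 0x7FFFFE, and subtracting m wraps to 0xFFFFFFFE.
static_assert(sign_extend_24_32(pack24(0xFF, 0xFF, 0xFE)) == 0xFFFFFFFEu, "24-bit -2");
// Sign bit clear: 0x000005 ^ m = 0x800005, and subtracting m gives 0x00000005.
static_assert(sign_extend_24_32(pack24(0x00, 0x00, 0x05)) == 0x00000005u, "24-bit 5");
```

If a signed result is needed, the return value can be converted with `static_cast<std::int32_t>`. That conversion is implementation-defined before C++20, but two's complement implementations wrap as expected.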

Templated version

template<class T>
T sign_extend(T x, const int bits) {
    T m = 1;
    m <<= bits - 1;
    return (x ^ m) - m;
}

edited Mar 01 '17 at 16:01

answered Mar 01 '17 at 15:46

harold



Another benefit of bit-twiddling in this way is that you're not limited to a 32-bit int - it works just as well on a 64-bit int for example. I'd change the type, perhaps to a template parameter, and make `bits` a function parameter as well. – Mark Ransom Mar 01 '17 at 15:54
@MarkRansom good points, is that approximately what you meant? – harold Mar 01 '17 at 16:02
I need a signed 32 not unsigned though – Beto Mar 01 '17 at 16:04
@Beto you can just use signed types here, at least I see no way for it to break (unless `bits` is something unreasonable). Makes the rest of the code more dangerous though. – harold Mar 01 '17 at 16:14

Perfect. I like the way you split `m` assignment into two parts to make sure the shifting occurs on the proper type. – Mark Ransom Mar 01 '17 at 16:37

score 2 · Answer 2 · answered Mar 01 '17 at 14:49

Assuming both representations are two's complement, simply

upper_byte = (Signed_byte(incoming_msb) >= 0? 0 : Byte(-1));

where

using Signed_byte = signed char;
using Byte = unsigned char;

and upper_byte is a variable representing the missing fourth byte.

The conversion to Signed_byte is formally implementation-dependent, but a two's complement implementation doesn't have a choice, really.
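As a rough sketch of how that fourth byte might be combined with the three incoming ones: the function name `assemble_24_to_32` and its parameter names are hypothetical, and a two's complement implementation with C++14 or later is assumed.

```cpp
#include <cstdint>

using Signed_byte = signed char;
using Byte = unsigned char;

// Hypothetical helper: derive the missing upper byte from the most significant
// incoming byte, then assemble all four bytes into one 32-bit pattern.
constexpr std::uint32_t assemble_24_to_32(Byte incoming_msb, Byte middle, Byte lsb) {
    // 0x00 when the 24-bit sign bit is clear, 0xFF when it is set.
    const Byte upper_byte = (Signed_byte(incoming_msb) >= 0 ? Byte(0) : Byte(-1));
    return (std::uint32_t{upper_byte} << 24)
         | (std::uint32_t{incoming_msb} << 16)
         | (std::uint32_t{middle} << 8)
         | std::uint32_t{lsb};
}

static_assert(assemble_24_to_32(0xFF, 0xFF, 0xFE) == 0xFFFFFFFEu, "24-bit -2");
static_assert(assemble_24_to_32(0x00, 0x00, 0x05) == 0x00000005u, "24-bit 5");
```

The result is left as an unsigned bit pattern here; converting it to `std::int32_t` is the same formally implementation-dependent step mentioned above.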

score 1 · Answer 3 · answered Mar 01 '17 at 15:11

You could let the compiler handle the sign extension itself. Assuming that the least significant byte is byte1 and the most significant byte is byte3:

int val = (signed char) byte3;                // C guarantees the sign extension
// note: left-shifting a negative value is well-defined only since C++20
val <<= 16;                                   // shift the byte to its final place
val |= ((int) (unsigned char) byte2) << 8;    // place the second byte
val |= (int) (unsigned char) byte1;           // and the least significant one

I have used C-style casts here where static_cast would have been more C++ish, but as an old dinosaur (and Java programmer) I find C-style casts more readable for integer conversions.

thanks, worked fine! val <<= 16, not val << 16 – BjornW Jan 03 '22 at 14:55

score 1 · Answer 4 · answered Apr 29 '19 at 23:45

This is a pretty old question, but I recently had to do the same (while dealing with 24-bit audio samples), and wrote my own solution for it. It's using a similar principle as this answer, but more generic, and potentially generates better code after compiling.

template <size_t Bits, typename T>
inline constexpr T sign_extend(const T& v) noexcept {
    static_assert(std::is_integral<T>::value, "T is not integral");
    static_assert((sizeof(T) * 8u) >= Bits, "T is smaller than the specified width");
    if constexpr ((sizeof(T) * 8u) == Bits) return v;
    else {
        using S = struct { signed Val : Bits; };
        return reinterpret_cast<const S*>(&v)->Val;
    }
}

This has no hard-coded math, it simply lets the compiler do the work and figure out the best way to sign-extend the number. With certain widths, this can even generate a native sign-extension instruction in the assembly, such as MOVSX on x86.

This function assumes you copied your N-bit number into the lower N bits of the type you want to extend it to. So for example:

int16_t a = -42;
int32_t b{};
memcpy(&b, &a, sizeof(a));
b = sign_extend<16>(b);

Of course it works for any number of bits, extending it to the full width of the type that contained the data.

phuclv · Answer 5 · 2019-03-20T04:39:33.163

0

You can use a bitfield

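template<size_t L>
inline int32_t sign_extend_to_32(const char *x)
{
  struct {int32_t i: L;} s;
  memcpy(&s, x, 3);  // assume little endian
  return s.i;
  // or, building the value from the individual bytes instead of memcpy
  // (cast to unsigned char so the lower bytes are not sign-extended):
  // return s.i = ((unsigned char)x[2] << 16) | ((unsigned char)x[1] << 8) | (unsigned char)x[0];
}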

Easy and no undefined behavior invoked

int32_t r = sign_extend_to_32<24>(your_3byte_array);

Of course, copying the bytes to the upper 3 bytes of the int32 and then shifting everything to the right, as you thought, is also a good idea. There's no undefined behavior if you use memcpy as above. An alternative is reinterpret_cast in C++ or a union in C, which avoids the memcpy. However, the result is implementation-defined, because a right shift of a negative value is not guaranteed to be a sign-extending shift (although almost all modern compilers implement it that way).
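The shifting approach could look roughly like the sketch below. It builds the value with shifts rather than memcpy, so it does not depend on the host byte order. The function name `sign_extend_by_shift` and the parameter order (least significant byte first) are assumptions made for illustration.

```cpp
#include <cstdint>

// Sketch: place the three bytes in the upper 24 bits, then shift right by 8
// so that the 24-bit sign bit is replicated into the top byte.
// b0 is assumed to be the least significant byte, b2 the most significant.
constexpr std::int32_t sign_extend_by_shift(std::uint8_t b0, std::uint8_t b1, std::uint8_t b2) {
    const std::uint32_t raw = (std::uint32_t{b2} << 24)
                            | (std::uint32_t{b1} << 16)
                            | (std::uint32_t{b0} << 8);
    // Both the unsigned-to-signed conversion and the right shift of a negative
    // value are implementation-defined before C++20; mainstream compilers wrap
    // on conversion and shift arithmetically.
    return static_cast<std::int32_t>(raw) >> 8;
}
```

For example, the bytes FE FF FF (least significant first) give raw = 0xFFFFFE00, which is -512 as a signed value, and shifting right by 8 yields -2 on implementations with an arithmetic right shift.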

edited Mar 20 '19 at 04:39

answered Mar 01 '17 at 15:05

phuclv


Placing a value in a bit field so small that the extracted value is not equal, must surely be implementation-defined behavior. Still I like this. :) – Cheers and hth. - Alf Mar 01 '17 at 15:09
How do you compile this? I get some "error: address of bit-field requested". Works if I remove that `.i24` in the memcpy, maybe that's what you meant? – harold Mar 01 '17 at 15:34
@harold yes. This was made up without compiling – phuclv Mar 01 '17 at 15:43

score 0 · Answer 6 · answered Mar 01 '17 at 15:24


Here's a method that works for any bit count, even if it's not a multiple of 8. This assumes you've already assembled the 3 bytes into an integer value.

const int bits = 24;
int mask = (1 << bits) - 1;
bool is_negative = (value & ~(mask >> 1)) != 0;
value |= -is_negative & ~mask;
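To illustrate the "any bit count" claim, here is a sketch that wraps the same steps in a function and tries a 12-bit field. The name `sign_extend_masked` is hypothetical, and the sketch assumes C++14 or later, 0 < bits < 31, and that every bit of `value` above the field is zero on entry.

```cpp
#include <cstdint>

// Hypothetical wrapper around the snippet above, with the bit count as a parameter.
constexpr std::int32_t sign_extend_masked(std::int32_t value, int bits) {
    const std::int32_t mask = (std::int32_t{1} << bits) - 1;   // low `bits` bits set
    // ~(mask >> 1) covers the field's sign bit and everything above it, so with
    // the upper bits zero on entry this tests only the sign bit.
    const bool is_negative = (value & ~(mask >> 1)) != 0;
    // -1 (all ones) when negative, 0 otherwise; keep only the bits above the field.
    return value | (-static_cast<std::int32_t>(is_negative) & ~mask);
}

// 12-bit field: 0xFFE has its sign bit (0x800) set, so the upper 20 bits are filled.
static_assert(sign_extend_masked(0xFFE, 12) == -2, "12-bit -2");
// 12-bit field: 0x7FF has a clear sign bit and is returned unchanged.
static_assert(sign_extend_masked(0x7FF, 12) == 2047, "12-bit 2047");
```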

answered Mar 01 '17 at 15:24

Mark Ransom


Why so complicated though? You could just `(value ^ m) - m` with `m = 1 << (bits - 1)` – harold Mar 01 '17 at 15:27
@harold if you think you have a better answer go ahead and answer the question yourself. I'm having a hard time convincing myself that it works, but if it does you'll get a +1 from me. – Mark Ransom Mar 01 '17 at 15:40
Fair enough, I just thought maybe there's a reason for it – harold Mar 01 '17 at 15:47

score 0 · Answer 7 · answered Dec 07 '21 at 07:17


Assuming your 24-bit value is stored in a variable int32_t val, you can easily extend the sign as follows:

val = (val << 8) >> 8;

answered Dec 07 '21 at 07:17

anicic

