Bit-shifting left and discarding bits

Question

Let's consider the function (one of possible implementations of it) which would zero out right N bits of an unsigned short value (or any other unsigned integral type). The possible implementation could look like following:

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
  using type = unsigned short;

  constexpr type mask = ~(type(0));
  constexpr type right_zeros = mask << shift; // <-- error here
  return arg & right_zeros;
}

int check() {
  return zero_right<4>(16);
}

With this code, all compilers I have access to complain, in one way or another, about possible overflow. CLang is the most explicit one, with following clear message:

error: implicit conversion from 'int' to 'const type' (aka 'const unsigned short') changes value from 1048560 to 65520 [-Werror,-Wconstant-conversion]

This code looks well defined and clear as day to me, yet when 3 compilers complain, I am becoming very nervous. Am I missing something here? Is there really a chance something fishy is happening?

P.S. While alternative implementations of zeriong out left X bits might be welcome and interesting, the primary focus of this question is of validity of code as posted.

@TavianBarnes, they can't be promoted to signed ints for unsigned arguments. — SergeyA, Apr 28 '16 at 21:01
Not what you are asking, but something you may want to be aware of (and don't guard against) is that if you left shift an unsigned integer by 'n' bits where 'n' is >= the number of bits in the type you shift, then that's Undefined Behaviour. — Jesper Juhl, Apr 28 '16 at 21:06
@SergeyA But that is still the problem: thew result of `<<` is int, not short. Converting the result of `mask << shift` back to `type` before assignment makes the error go away. — Mr Lister, Apr 28 '16 at 21:07
@JesperJuhl, I have this covered, just didn't want to overcumber the question. — SergeyA, Apr 28 '16 at 21:07
@MrLister, I'd rather not 'make the error go away' with implicit cast. In my view, this code is either right or wrong without the cast :) — SergeyA, Apr 28 '16 at 21:08
@SergeyA Turns out I was right in the first place. In `mask << shift`, mask is promoted to `signed int` and the type of `shift` doesn't affect that. — Tavian Barnes, Apr 28 '16 at 21:13
@TavianBarnes, from cppreference: `For unsigned and positive a, the value of a << b is the value of a * 2b, reduced modulo maximum value of the return type plus 1...` - doesn't say it is promoted to signed. — SergeyA, Apr 28 '16 at 21:15
@SergeyA cppreference also says "The return type is the type of the left operand **after integral promotions.**" (emphasis mine) — Mr Lister, Apr 28 '16 at 21:17

score 3 · Answer 1 · edited Jun 20 '20 at 09:12

3

From the C++11 Standard:

5.8 Shift operators [expr.shift]

1 ...

The operands shall be of integral or unscoped enumeration type and integral promotions are performed. The type of the result is that of the promoted left operand.

The expression

mask << shift;

is evaluated after integral promotion is applied to mask. Hence, it evaluates to 1048560 if sizeof(unsigned short) is 2, which explains the message from clang.

One way to avoid the overflow problem is to right shift first before performing a left shift, and move that to a function of its own.

template <typename T, unsigned int shift>
constexpr T right_zero_bits()
{
   // ~(T(0)) performs integral promotion, if needed
   // T(~(T(0))) truncates the number to T, if needed.
   return (T(~(T(0))) >> shift ) << shift;
}

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
   return arg & right_zero_bits<unsigned short, shift>();
}

edited Jun 20 '20 at 09:12

Community

1
1

answered Apr 28 '16 at 21:15

R Sahu

204,454
14
159
270

Yes, it explains the message, but doesn't explain the warning :) May be I am not clear enough with my question. The question is: should I be worried? :) – SergeyA Apr 28 '16 at 21:18
@SergeyA The warning is the same one you would get with `unsigned short mask = 1048560;`, i.e. no you shouldn't be worried, but you should probably suppress it with an explicit cast. – Tavian Barnes Apr 28 '16 at 21:19
@TavianBarnes, I'd love to belive you - but to be a proper LanguageLawyer answer it needs something to substantiate the claim :) – SergeyA Apr 28 '16 at 21:21
@SergeyA Haha that's why I'm commenting, not answering :). But there's a sourced answer here that corroborates me: http://stackoverflow.com/a/6752688/502399 – Tavian Barnes Apr 28 '16 at 21:23
The first one causes UB for `unsigned short`: `~T(0)` is `int `because of integer promotions, it has negative value; `>>` will probably left, fill and then `<<` causes UB (left-shift of signed int) – M.M Apr 28 '16 at 22:09
@M.M, you're right. I think the updated answer is right. – R Sahu Apr 28 '16 at 22:22
Updated still shifts `1` into the sign bit, although this is implementation-defined since C++14 and is likely to work – M.M Apr 28 '16 at 22:31

score 3 · Answer 2 · 2016-04-28T22:03:14.727

Yes, as you suspect, even after suppressing the compiler diagnostics, your code is strictly speaking not fully portable because of the promotion from unsigned short to signed int, bit arithmetic being done in signed int, and then signed int being converted back to unsigned short. You've managed to avoid undefined behaviour (I think, after a quick look), but the result is not guaranteed to be what you are hoping for. (type)~(type)0 is not required to correspond to "all bits one" in type type; it's already iffy before the shift.

To get something fully portable, simply make sure you do all your arithmetic in at least unsigned int (wider types if necessary, but never narrower). Then there won't be any promotions to signed types to worry about.

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
  using type = unsigned short;

  constexpr auto mask = ~(type(0) + 0U);
  constexpr auto right_zeros = mask << shift;
  return arg & right_zeros;
}

int check() {
  return zero_right<4>(16);
}

Craig Estey · Answer 3 · 2016-04-28T21:51:47.180

I don't know if this is exactly what you want, but it compiles:

template<unsigned int shift>
unsigned short zero_right(unsigned short arg) {
  using type = unsigned short;

  //constexpr type mask = ~(type(0));
  type right_zeros = ~(type(0));
  right_zeros <<= shift;
  return arg & right_zeros;
}

int check() {
  return zero_right<4>(16);
}

UPDATE:

Seems like you simply hushed the compiler by making sure it has no idea what is going on with the types.

No

First you get right_zeros with value FFFF (from ~0). Normally, ~0 is FFFFFFFFFFFFFF... but because you're using u16, you get FFFF.

Then, shift by 4 produces FFFF0 [calculation is extended to 32 bits], but when stored back, only the rightmost 16 bits remain, so the value is FFF0

This is perfectly legal and defined behavior and you're taking advantage of the truncation. The compiler is not "being fooled". Actually, it works fine with or without truncation.

You could make right_zeros into u32 or u64 if you wished, but then you'd need to add right_zeros &= 0xFFFF

If there is an undefined behavior (the very essence of my question!) you simply made it undetectable.

There is no UB based on the totality of your code, no matter what the compiler says.

Actually, Tavian got it. Use an explicit cast:

constexpr type right_zeros = (type) (mask << shift); // now clean

This is telling the compiler, amongst other things, that you want the truncation to 16 bits.

If there were UB, then the compiler should still complain.

Seems like you simply hushed the compiler by making sure it has no idea what is going on with the types. If there is an undefined behavior (the very essence of my question!) you simply made it undetectable. — SergeyA, Apr 28 '16 at 21:19
"If there were UB, then the compiler should still complain." - don't count on it — M.M, Apr 28 '16 at 22:38

M.M · Accepted Answer · 2016-04-28T22:53:06.423

The message seems pretty plain:

error: implicit conversion from 'int' to 'const type' (aka 'const unsigned short') changes value from 1048560 to 65520 [-Werror,-Wconstant-conversion]

mask << shift has value 1048560 (arising from 65535 << 4), and you assign it to unsigned short, which is defined to adjust the value mod 65536, giving 65520.

This last conversion is well-defined. The error message is because you passed compiler flags -Werror,-Wconstant-conversion requesting to get an error message in this situation anyway. If you don't want this error then don't pass those flags.

Although this particular usage was well-defined, there could be undefined behaviour for some inputs (namely, shift being 16 or greater, if you are on a 32-bit int system). So you should fix the function.

To fix the function you need to be more careful in the unsigned short case, because of the supremely annoying rule about integer promotion of unsigned short to signed int.

Here's one solution a bit different from the other offerings.. avoid the shift issue entirely, works for any shift size:

template<unsigned int shift, typename T>
constexpr T zero_right(T arg)
{
    T mask = -1;
    for (int s = shift; s--; ) mask *= 2u;
    return mask & arg;
}

// Demo
auto f() { return zero_right<15>((unsigned short)65535); }  //  mov eax, 32768

That's interesting. I realize I can't shift for more bits then there are in the type. In my real application, this is not going to happen. Other than that, you are saying code is well-defined and will always do what I expect it to do? — SergeyA, Apr 28 '16 at 22:18
The way you have it now, relies on 2's complement, and the unsigned short case is implementation-defined for shifting by 15 if you have 32-bit ints — M.M, Apr 28 '16 at 22:25
@SergeyA, you can try shift more bits than width, but Intel clearly says they mask out the high bits of the *shift* operand. E.g., for uint32_t, the shift is `s % 32`, so `int32_t << 40` === `int32_t << 8`. **But**, beware of compilers - if gcc sees shift > 32 at (optimized) compile time it'd just zero out the result! — BitWhistler, Apr 29 '16 at 00:46

Bit-shifting left and discarding bits

4 Answers4

Linked