Bit masks and long-long

Question

I have a masking procedure (which creates an all-ones bit mask for the bottom half for a given size):

template<class T>
T bottom_half() {
    T halfway = ((sizeof(T) * 8) / 2);
    T mask = (1 << halfway) - 1;

    return mask;
}

which works fine if I call bottom_half<int>() or long or char. But for some reason when I run it with long long, halfway is correctly set to 32, but mask is 0. Why would that be?

The correct spelling of half the bits of an integer is `std::numeric_limits<T>::digits / 2`. The expression used above assumes a `char` has 8 bit which is required (although I'm not aware of any platform where it is not 8 bits). — Dietmar Kühl, Sep 17 '13 at 18:36
@DietmarKühl: I heard Windows CE has byte with 16 bits in it. — Nawaz, Sep 17 '13 at 18:39
@DietmarKühl: I think what you meant to say is that a `char` is *not* required to be 8 bits. (In fact, it's required to be *at least* 8 bits, and it's exactly 8 bits in most implementations.) — Keith Thompson, Sep 17 '13 at 19:00
@Nawaz: I'd be very surprised if Windows CE had `CHAR_BIT==16`. It probably uses 16-bit characters (for UCS-2 and/or UTF-16), but represents them using `wchar_t` or some equivalent. I expect that the predefined type `char` is 8 bits. — Keith Thompson, Sep 17 '13 at 19:01
@Nawaz: yes, `char` is _not_ required to have exactly 8 bits. It has at least 8 bits. The only platforms I have heard rumored to have `char`s of more than 8 bits are some Crays, which supposedly had 64-bit `char`s. — Dietmar Kühl, Sep 17 '13 at 19:08
@Nawaz - Windows CE still has 8 bit bytes/`char`s, but the string libraries do not support char by choice, to force the programmer to use `wchar_t`. See http://stackoverflow.com/a/2098298/364818 — Mark Lakata, Sep 17 '13 at 19:22
@DietmarKühl: The Crays I worked on had 8-bit `char` (implemented entirely in software) because they ran Unicos, a version of Unix, which requires 8-bit bytes. I don't know about Cray's earlier non-Unix OS. — Keith Thompson, Sep 17 '13 at 19:24
@MarkLakata: No, not really: according to 18.3.2.4 [numeric.limits.members] paragraph 8: "`static constexpr int digits`; Number of `radix` digits that can be represented without change." 3.9.1 [basic.fundamental] paragraph 7 states "... The representations of integral types shall define values by use of a pure binary numeration system ..." I'd think this amounts to the radix of the built-in integral types being 2. The number of decimal digits is given by `std::numeric_limits<T>::digits10`. — Dietmar Kühl, Sep 17 '13 at 19:50

score 8 · Accepted Answer · edited May 23 '17 at 10:31

The left shift is shifting 1, which is int by default and probably 32 bits on your machine. When you shift 1<<32, the result is undefined, which means it is not predictable anymore, as it could be anything.

On some processors, 1<<32 might shift the bit off the high end of the integer and result in 0. On other processors, the shift count is taken modulo the register size, so a shift by 32 is effectively a zero shift and the result is 1. In any case, it is undefined.

(See What's bad about shifting a 32-bit variable 32 bits? for a discussion on this).
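As a minimal sketch of how to get one fixed result instead of the processor-dependent ones described above, a shift can be guarded by comparing the count against the width of the type first. The helper name `checked_shl` is hypothetical (it is not part of the question or of any library), and returning 0 for an oversized count is an assumption made for this illustration.

```cpp
#include <limits>
#include <type_traits>

// Hypothetical helper, for illustration only: a left shift whose result is
// defined for every count. Counts that reach or exceed the width of U yield 0
// (an assumed policy) instead of invoking undefined behavior.
template <class U>
constexpr U checked_shl(U value, unsigned count) {
    static_assert(std::is_unsigned<U>::value, "U must be an unsigned type");
    return count < static_cast<unsigned>(std::numeric_limits<U>::digits)
               ? static_cast<U>(value << count)  // count is in range: plain shift
               : U(0);                           // count too large: fixed result
}
```

With this sketch, `checked_shl(1u, 3)` is 8, while a count equal to the full width of `unsigned` gives 0 on every platform rather than whatever the hardware shift instruction happens to do.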

Note also that sizeof returns a size in units of char, or "bytes" (these are defined to be the same in C; sizeof(char) == 1 always), but C does not guarantee that a byte is 8 bits. The standard macro CHAR_BIT gives the number of bits in a char.

Try this:

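#include <limits.h>  // CHAR_BIT

template<class T>
T bottom_half() {
    // Half the width of T in bits; CHAR_BIT replaces the hard-coded 8.
    T halfway = ((sizeof(T) * CHAR_BIT) / 2);
    // Cast 1 to T so the shift is performed at (at least) the width of T.
    T mask = ((T)1 << halfway) - 1;

    return mask;
}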
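To check the fix, the same computation can be written as a `constexpr` function so the masks are verified at compile time. This is only a sketch: the name `bottom_half_mask` is made up for this example, and the expected values assume 8-bit bytes, which is what the fixed-width types from `<cstdint>` imply on platforms that provide them.

```cpp
#include <climits>
#include <cstdint>

// Illustrative constexpr variant of the corrected template (hypothetical
// name). The literal 1 is converted to T before shifting, so a 64-bit T is
// shifted as a 64-bit value and not as an int.
template <class T>
constexpr T bottom_half_mask() {
    return static_cast<T>((static_cast<T>(1) << (sizeof(T) * CHAR_BIT / 2)) - 1);
}

// Compile-time checks, assuming CHAR_BIT == 8.
static_assert(bottom_half_mask<std::int64_t>() == 0xFFFFFFFFLL, "low 32 bits set");
static_assert(bottom_half_mask<std::uint32_t>() == 0xFFFFu, "low 16 bits set");
static_assert(bottom_half_mask<std::uint8_t>() == 0x0F, "low 4 bits set");
```

The 64-bit case is the one that produced 0 in the question; with the cast in place the mask has its low 32 bits set.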

answered Sep 17 '13 at 18:27

Mark Lakata

So you mean `(1 << 32) - 1` is `0`? – Nawaz Sep 17 '13 at 18:30
@dasblinkenlight: How exactly? – Nawaz Sep 17 '13 at 18:31
@Nawaz: if `int` has 32 bit or less, the expression `1 << 32` has undefined behavior. – Dietmar Kühl Sep 17 '13 at 18:33
@dasblinkenlight: yes. that is correct, but this answer doesn't talk about UB, which is why I asked this question. – Nawaz Sep 17 '13 at 18:33
@MarkLakata: Yes, your answer is misleading if not wrong. Please correct it if you understand what is wrong with it. – Nawaz Sep 17 '13 at 18:37

score 3 · Answer 2 · answered Sep 17 '13 at 18:31

The expression 1 << x has type int. Left-shifting a signed type such that the value exceeds the maximum representable value has undefined behavior. Use T(1) << x instead.
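The point about the type can be checked directly: the result type of a shift is the promoted type of the left operand, and the type of the shift count plays no part. The snippet below is a small sketch of that rule; the helper `one_shifted_by` is a hypothetical name used only here.

```cpp
#include <type_traits>

// The left operand alone decides the type of a shift expression.
static_assert(std::is_same<decltype(1 << 2LL), int>::value,
              "int << long long is still int");
static_assert(std::is_same<decltype(1LL << 2), long long>::value,
              "long long << int is long long");
static_assert(std::is_same<decltype(static_cast<char>(1) << 2), int>::value,
              "char is promoted to int before shifting");

// Hypothetical helper: builds the 1 as a T first, so the shift is done in T
// (or in int, if T is narrower than int).
template <class T>
constexpr T one_shifted_by(unsigned x) {
    return static_cast<T>(T(1) << x);
}

static_assert(one_shifted_by<long long>(32) == 4294967296LL,
              "well defined: the shift happens in long long");
```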

answered Sep 17 '13 at 18:31

Dietmar Kühl

+1. This is the correct answer so far, as it also talks about *"undefined behavior"* in the original code. – Nawaz Sep 17 '13 at 18:32

score 0 · Answer 3 · answered Sep 17 '13 at 18:28

Cast the 1 in your shift operation to the correct type. As written, it is a plain int, so with a 32-bit int the shift by 32 achieves nothing useful: you shift the bit out of existence (strictly speaking, the behavior is undefined).

answered Sep 17 '13 at 18:28

Jongware
