What is the most portable way to read and write the highest bit of an integer in C?

Question

This is a Bloomberg interview question. I didn’t give the best answer at the time. Can anyone please answer it?

GNU C is not very portable... – osvein Jan 22 '18 at 16:51

score 6 · Accepted Answer · edited May 23 '17 at 12:33

If the type is unsigned, it's easy:

(type)-1-(type)-1/2

For signed values, I know of no way. If you find one, it would answer several unanswered questions on SO:

C question: off_t (and other signed integer types) minimum and maximum values

Is there any way to compute the width of an integer type at compile-time?

Maybe others.
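For readers who want the unsigned expression spelled out as real code, here is a minimal sketch for `unsigned int`. The names `UINT_TOP_BIT`, `get_top_bit`, `set_top_bit`, `clear_top_bit`, and `top_bit_self_check` are made up for illustration and are not part of the answer; the self-check also compares against the shift-based form `~((unsigned)-1 >> 1)` mentioned in the comments.

```c
#include <assert.h>
#include <limits.h>

/* (unsigned)-1 is UINT_MAX, i.e. 2^N - 1. Halving it gives 2^(N-1) - 1,
   so the difference is 2^(N-1): only the highest value bit is set. */
#define UINT_TOP_BIT ((unsigned)-1 - (unsigned)-1 / 2)

/* Read the highest bit: returns 1 if it is set, 0 otherwise. */
int get_top_bit(unsigned x) { return (x & UINT_TOP_BIT) != 0; }

/* Write the highest bit: return a copy of x with the bit set or cleared. */
unsigned set_top_bit(unsigned x)   { return x | UINT_TOP_BIT; }
unsigned clear_top_bit(unsigned x) { return x & ~UINT_TOP_BIT; }

/* Hypothetical self-check showing the relationships the helpers rely on. */
void top_bit_self_check(void)
{
    assert(UINT_TOP_BIT == ~((unsigned)-1 >> 1)); /* shift-based form */
    assert(UINT_TOP_BIT == UINT_MAX / 2 + 1);
    assert(get_top_bit(set_top_bit(1u)) == 1);
    assert(clear_top_bit(UINT_MAX) == UINT_MAX / 2);
}
```

Replacing `unsigned` with another unsigned type at least as wide as `unsigned int` should work the same way, since the mask is derived from the type's own maximum value.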

answered Jan 26 '11 at 03:51

R.. GitHub STOP HELPING ICE

what do you mean (type) -1 ?? can you give me the real C code? confused . maybe I didn't get it. sorry – Josh Morrison Jan 26 '11 at 03:54
`(type)` is a typecast to type `type`. For example, you could do `(unsigned int)-1 - (unsigned int) -1 / 2`. – bdonlan Jan 26 '11 at 03:58
i tried, but didn't get right result. what's wrong with me? can you give me a short program snippet? how can you prove this is right? confused still.. – Josh Morrison Jan 26 '11 at 04:35
What result did you get and what did you expect? On a system with 32-bit int, `(unsigned)-1-(unsigned)-1/2` gives me `0x80000000` as expected. – R.. GitHub STOP HELPING ICE Jan 26 '11 at 05:22
another possibility would be `~((unsigned)-1 >> 1)` – Christoph Jan 26 '11 at 12:43
@Christoph: Indeed. I'm just in a habit of avoiding the bitwise operators and always using arithmetic ones. (Note however that `<<` and `>>` are arithmetic and not bitwise, per C's specification of them.) – R.. GitHub STOP HELPING ICE Jan 26 '11 at 18:40
It's interesting how much more readable the answer is with spaces ala `(type)-1 - (type)-1 / 2`... :-/. – Tony Delroy Jan 28 '11 at 04:36

bdonlan · Answer 2 · 2011-01-26T09:06:56.627

5

First, note that there's no portable way to access the top bit if we're talking about signed integers; there's simply no single portable representation defined in the standard, so the meaning of 'top bit' can in principle vary. Additionally, C does not allow direct access to the bitwise representation; you can access the int as a char buffer, but you have no idea where the 'top bit' is located.

If we're only concerned with the non-negative range of a signed integer, and assuming said range has a size that is a power of two (if not, then we need to care about the signed representation again):

#define INT_MAX_BIT (INT_MAX - (INT_MAX >> 1))
#define SET_MAX_BIT(x) ((x) | INT_MAX_BIT)
#define CLEAR_MAX_BIT(x) ((x) & ~INT_MAX_BIT)

A similar approach can be used with unsigned ints, where it can be used to get the true top bit.
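As a rough sketch of how this might look as functions, assuming `INT_MAX` has the form 2^n - 1: the helper names and `UINT_MAX_BIT` below are hypothetical. The clear operation masks with `INT_MAX >> 1` instead of applying `~` to a signed value, which is a small substitution to keep every intermediate result non-negative.

```c
#include <assert.h>
#include <limits.h>

/* Highest bit of the non-negative range of int (not the sign bit). */
#define INT_MAX_BIT  (INT_MAX - (INT_MAX >> 1))
/* Same idea for unsigned int: here the result is the true top bit. */
#define UINT_MAX_BIT (UINT_MAX - (UINT_MAX >> 1))

/* All three helpers are meant for non-negative x only, because bitwise
   operators on negative values depend on the signed representation. */
int int_get_max_bit(int x)
{
    assert(x >= 0);
    return (x & INT_MAX_BIT) != 0;
}

int int_set_max_bit(int x)
{
    assert(x >= 0);
    return x | INT_MAX_BIT;
}

int int_clear_max_bit(int x)
{
    assert(x >= 0);
    return x & (INT_MAX >> 1); /* every bit below INT_MAX_BIT */
}
```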

edited Jan 26 '11 at 09:06

answered Jan 26 '11 at 03:42

bdonlan

This only works for integer types that have limits specified in `limits.h`. For instance, it does not work for `off_t`. – R.. GitHub STOP HELPING ICE Jan 26 '11 at 03:52
@R, true, but it's hard to see an efficient, portable method that doesn't use a `_MAX` macro and doesn't invoke undefined (or implementation-defined) behavior... – bdonlan Jan 26 '11 at 03:56
i'm sorry,can you explain it for me why "#define INT_MAX_BIT (INT_MAX - (INT_MAX >> 1))" works? I didn't get it – Josh Morrison Jan 26 '11 at 04:00
Typically `INT_MAX` has all of its value bits set to 1 (assuming INT_MAX is 2^n - 1 for some n). Shifting right by one clears the top bit. Subtracting that from the original then leaves only the top bit set. – bdonlan Jan 26 '11 at 04:08
The standard actually puts quite tight restrictions on the representation of integer types - for example, the maximum values of both signed and unsigned types must be `2**N - 1`. See section 6.2.6.2. – caf Jan 26 '11 at 08:58
Ah, I see; and it actually is restricted to one of three signed integer representations. Although there's nothing that says the sign bit must be the "top" bit :) – bdonlan Jan 26 '11 at 09:06

Tony Delroy · Answer 3 · 2011-01-26T05:34:56.703

2

Here's a silly one, using:

Built-in Function: int __builtin_clz (unsigned int x)

Returns the number of leading 0-bits in x, starting at the most
significant bit position. If x is 0, the result is undefined.

First attempt:

int get_msb(int x) { return x ? __builtin_clz(x) == 0 : 0; }

Note: it's a quirk of C that functions specifying int or unsigned int parameters can be called with the other type without warning. But, this probably involves a conversion - the C++ Standard 4.7.2 says:

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2^n where n is the number of bits used to represent the unsigned type). [Note: In a two's complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). ]

Which implies that the bit pattern may be changed if it's not a two's complement representation, which would stop this "solution" working reliably too. :-(

Chris's comment below provides a solution (incorporated here as a function rather than preprocessor macro):

int get_msb(int x) { return x ? __builtin_clz(*(unsigned*)&x) == 0 : 0; }
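A variation on the same idea, sketched under the assumption of a GCC or Clang compiler (`__builtin_clz` is an extension, not standard C). Instead of the pointer cast, it copies the object representation with `memcpy`, which is a substituted technique and not the one from the comments. The function names are hypothetical, and which bit counts as "most significant" for a signed `int` still depends on the platform's representation.

```c
#include <assert.h>
#include <string.h>

/* GCC/Clang only. Returns 1 if the most significant bit of x is set. */
int get_msb_unsigned(unsigned x)
{
    /* __builtin_clz(0) is undefined, so zero is handled first. */
    return x != 0 && __builtin_clz(x) == 0;
}

/* Reinterprets the bytes of an int rather than converting its value. */
int get_msb_int(int x)
{
    unsigned u;
    /* int and unsigned int have the same size; stated here for the reader. */
    assert(sizeof u == sizeof x);
    memcpy(&u, &x, sizeof u);
    return get_msb_unsigned(u);
}
```

On a two's complement machine `get_msb_int` reports the sign bit; on other representations the result is whatever bit pattern the platform stores.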

edited Jan 26 '11 at 05:34

answered Jan 26 '11 at 04:18

Tony Delroy

The workaround would be `#define msb(x) __builtin_clz(*(unsigned *)&x)` but then you can't use it on literal numbers. The GCC workaround would be `#define msb(x) ({ typeof(x) _x = x; __builtin_clz(*(unsigned *)&_x); })` – Chris Lutz Jan 26 '11 at 05:12
@Chris: that's neat... in the function above the `int x` parameter would receive literals anyway, and you can then do the cast as you suggest. I'll update the code above. Thanks! – Tony Delroy Jan 26 '11 at 05:33
The cast is technically undefined behavior, which means it's not guaranteed to work. In practice, it will reinterpret the address of a signed value as an unsigned value (which is rightly undefined behavior since the standard doesn't specify a particular sign representation) much the same way as `union { int i; unsigned u; } u; u.i = x; return __builtin_clz(u.u);` would. I used a macro so that it would work on both signed and unsigned `int` types, invoking UB only for the signed version. But any way you do it will end up being necessarily platform dependent. – Chris Lutz Jan 26 '11 at 05:45
@Chris: yes, I knew that, but wouldn't be surprised if GCC itself doesn't work on any systems that quirky... :-). – Tony Delroy Jan 26 '11 at 05:51

ruslik · Answer 4 · 2011-01-26T03:54:14.563

1

What's wrong with this one?

#include <limits.h>  /* CHAR_BIT */

int get_msb(int n){
    return ((unsigned)n) >> (sizeof(unsigned) * CHAR_BIT - 1);
    /* or, alternatively: return n < 0; */
}

int set_msb(int n, int msb){
    if (msb)
         return ((unsigned)n) |  (1ULL << (sizeof(unsigned) * CHAR_BIT - 1));
    else return ((unsigned)n) & ~(1ULL << (sizeof(unsigned) * CHAR_BIT - 1));
}

It takes care of endianness, number of bits in a byte, and works also on 1's complement.

edited Jan 26 '11 at 03:54

answered Jan 26 '11 at 03:44

ruslik


This assumes twos-complement representation. The OP requested a portable way :) – bdonlan Jan 26 '11 at 03:45
The C standard does not limit the options to "twos or ones complement". Per 6.5.7.5, the result of a right-shift of a negative value is completely implementation defined; the result of a left-shift outside of the range of positive values is undefined and could, in principle, even crash. – bdonlan Jan 26 '11 at 03:50
Also not portable in that `sizeof(X)*CHAR_BIT` assumes no padding bits. – R.. GitHub STOP HELPING ICE Jan 26 '11 at 03:51
By using [modular] arithmetic instead of bit hacks. – R.. GitHub STOP HELPING ICE Jan 26 '11 at 05:23
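One possible reading of "arithmetic instead of bit hacks", sketched for `unsigned int` only: every operation below uses comparison, remainder, addition, or division, with no shifts and no `sizeof`. The names `UINT_TOP`, `get_msb_arith`, `set_msb_arith`, and `uint_value_bits` are hypothetical, and this is an interpretation of the comment, not code from the thread.

```c
#include <assert.h>
#include <limits.h>

/* 2^(N-1) for unsigned int, computed without shifts or sizeof. */
#define UINT_TOP (UINT_MAX - UINT_MAX / 2)

/* The top bit is set exactly when the value is at least 2^(N-1). */
int get_msb_arith(unsigned u) { return u >= UINT_TOP; }

unsigned set_msb_arith(unsigned u, int msb)
{
    unsigned low = u % UINT_TOP;       /* value with the top bit removed */
    assert(low < UINT_TOP);            /* so low + UINT_TOP cannot wrap */
    return msb ? low + UINT_TOP : low;
}

/* Counts value bits by repeated halving. Unlike
   sizeof(unsigned) * CHAR_BIT, this does not count padding bits. */
unsigned uint_value_bits(void)
{
    unsigned n = 0;
    for (unsigned m = UINT_MAX; m != 0; m /= 2)
        n++;
    return n;
}
```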

score 0 · Answer 5 · answered Jul 17 '15 at 08:23

#define HIGH_BIT(inttype) (((inttype)1) << (CHAR_BIT * sizeof(inttype) - 1))

example usage:

ptrdiff_t i = 4711;
i |=  HIGH_BIT(ptrdiff_t);  /* set high bit */
i &= ~HIGH_BIT(ptrdiff_t);  /* clear high bit */
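Since no explanation was given, here is a hedged reading: the macro shifts a 1 into the highest bit position, taking the width from `sizeof`. For a signed type such as `ptrdiff_t`, shifting a 1 into the sign bit is undefined behavior in C, and `CHAR_BIT * sizeof` counts padding bits if the type has any. The sketch below restricts the idea to unsigned types at least as wide as `unsigned int`; `UHIGH_BIT`, `set_high`, `clear_high`, `test_high`, and `high_bit_demo` are hypothetical names.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Variant of HIGH_BIT for unsigned types at least as wide as unsigned int.
   Assumes the type has no padding bits. */
#define UHIGH_BIT(utype) ((utype)((utype)1 << (CHAR_BIT * sizeof(utype) - 1)))

size_t set_high(size_t v)   { return v | UHIGH_BIT(size_t); }
size_t clear_high(size_t v) { return v & ~UHIGH_BIT(size_t); }
int    test_high(size_t v)  { return (v & UHIGH_BIT(size_t)) != 0; }

/* Mirrors the example usage above, with size_t in place of ptrdiff_t. */
void high_bit_demo(void)
{
    size_t i = 4711;
    i |= UHIGH_BIT(size_t);    /* set high bit */
    assert(test_high(i));
    i &= ~UHIGH_BIT(size_t);   /* clear high bit */
    assert(i == 4711);
}
```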

rapm

Please add some explanation. – Nilambar Sharma Jul 17 '15 at 08:30
