Algorithm to find the most significant bit

Question

A friend of mine was asked at an interview the following question: "Given a binary number, find the most significant bit". I immediately thought of the following solution but am not sure if it is correct.

Namely, divide the string into two parts and convert both parts into decimal. If the left-subarray is 0 in decimal then the do binary search in the right subarray, looking for 1.

That is my other question. Is the most significant bit, the left-most 1 in a binary number? Can you show me an example when a 0 is the most significant bit with an example and explanation.

EDIT:

There seems to be a bit of confusion in the answers below so I am updating the question to make it more precise. The interviewer said "you have a website that you receive data from until the most significant bit indicates to stop transmitting data" How would you go about telling the program to stop the data transfer"

Which side the MSB is on depends on whether your CPU architecture is Big-endian or Little-endian. — RBarryYoung, Jun 10 '13 at 15:53
Many platforms for a CountLeadingZero opcode that does this very fast. This is useful for division and log base 2 math. — Michael Dorgan, Jun 10 '13 at 15:53
Example: the number `0`. Also, you need to specify how negative numbers are represented. — rici, Jun 10 '13 at 15:54
thank you guys for the feedback and I don't mean to be disrespectful, but what you said really does not answer my question. Could I ask you to be more specific? Thanks — user2398832, Jun 10 '13 at 15:57
Why a string and decimal and all that? A binary search, and that's what it is basically, for the leftmost bit (please don't call it msb, that refers to the most significant position regardless of whether it is set or unset) is a good solution, but can be implemented much simpler. — harold, Jun 10 '13 at 15:59
@harold, the conversion to decimal is to check whether there is something in the left part of the original string. If the decimal number is larger than 0 then i perform binary search there, looking for the 1 — user2398832, Jun 10 '13 at 16:02
@user2398832 I get that, but you could just use bitmasks and never convert to anything, eg the first step is checking whether `x & 0xFFFF0000` is zero or not (for 32bit x, different mask for other sizes) — harold, Jun 10 '13 at 16:03
Most chipsets have [some sort of instruction](http://en.wikipedia.org/wiki/Find_first_set) to do this. — jason, Jun 10 '13 at 16:12
also see my answer to this question: http://stackoverflow.com/questions/14429661/determine-which-single-bit-in-the-byte-is-set/14429782#14429782 — John Dvorak, Jun 10 '13 at 16:17
If you only want to check if the top bit is set, just do `if(x & 1<<31)` (in case of 32-bit integers) — John Dvorak, Jun 10 '13 at 19:06
Way too complicated. In Java you test whether the most significant bit is set with `if(x < 0) …`. That’s all. — Holger, Nov 20 '13 at 13:15

Denis de Bernardy · Accepted Answer · 2013-06-10T16:18:53.177

13

You could also use bit shifting. Pseudo-code:

number = gets
bitpos = 0
while number != 0
  bitpos++             # increment the bit position
  number = number >> 1 # shift the whole thing to the right once
end
puts bitpos

if the number is zero, bitpos is zero.

edited Jun 10 '13 at 16:18

answered Jun 10 '13 at 15:55

Denis de Bernardy

75,850
13
131
154

You mean `while number != 0`? (I don't think pseudocode should follow C's idiosyncrasies.) – svick Jun 10 '13 at 16:09
@svick 0 = false and non-zero as true predates C i believe – im so confused Jun 10 '13 at 16:11
I don't understand this pseudocode.. Care to explain what lines 3, 5 and 6 represent? – user2398832 Jun 10 '13 at 16:11
1

hahaha guess it's not as universal as I thought then... @user2398832 `#3: while number != 0, #5: shift the number to the right by 1 bit, #6: output bitpos` – im so confused Jun 10 '13 at 16:12
@user2398832: more clear now? – Denis de Bernardy Jun 10 '13 at 16:20
Am reading about right shifting at the moment. Trying to figure out what it is and why you used it here. – user2398832 Jun 10 '13 at 16:21
@user2398832: right shifting turns e.g. `0b00010001` to `0b000010000`, then `0b000001000`, etc. At some point, you'll get `0b00000001` and finally `0b00000000`, which stops the while loop. – Denis de Bernardy Jun 10 '13 at 16:23
Thank you Dennis, makes sense now! This is a linear time complexity though. Why would my approach work though? – user2398832 Jun 10 '13 at 16:28
Edited the question. Please read the question. – user2398832 Jun 10 '13 at 16:32

score 13 · Answer 2 · edited Nov 28 '16 at 13:04

Finding the most significant bit in a word (i.e. calculating log2 with rounding down) by using only C-like language instructions can be done by using a rather well-known method based on De Bruijn sequences. For example, for a 32-bit value

unsigned ulog2(uint32_t v)
{ /* Evaluates [log2 v] */
  static const unsigned MUL_DE_BRUIJN_BIT[] = 
  {
     0,  9,  1, 10, 13, 21,  2, 29, 11, 14, 16, 18, 22, 25,  3, 30,
     8, 12, 20, 28, 15, 17, 24,  7, 19, 27, 23,  6, 26,  5,  4, 31
  };

  v |= v >> 1;
  v |= v >> 2;
  v |= v >> 4;
  v |= v >> 8;
  v |= v >> 16;

  return MUL_DE_BRUIJN_BIT[(v * 0x07C4ACDDu) >> 27];
}

However, in practice more simple methods (like unrolled binary search) usually work just as well or better.

score 2 · Answer 3 · answered Jun 10 '13 at 17:10

The edited question is really quite different, though not very clear. Who are "you"? The website or the programmer of the program that reads data from the website? If you're the website, you make the program stop by sending a value (but what, a byte, probably?) with its most-significant bit set. Just OR or ADD that bit in. If you're the programmer, you test the most-significant bit of the values you receive, and stop reading when it becomes set. For unsigned bytes, you could do the test like

bool stop = received_byte >= 128;
or
bool stop = received_byte & 128;

For signed bytes, you could use

bool stop = received_byte < 0;
or
bool stop = received_byte & 128;

If you're not reading bytes but, say, 32bit words, the 128 changes to (1 << 31).

twalberg · Answer 4 · 2013-06-10T16:28:05.643

This is one approach (not necessarily the most efficient, though, especially if your platform has a single-instruction solution to find-first-one or count-leading-zeros or something similar), assuming twos complement signed integers and a 32-bit integer width.

int mask = (int)(1U<<31); // signed integer with only bit 32 set
while (! n & mask)        // n is the int we're testing against
  mask >>= 1;             // take advantage of sign fill on right shift of negative number
mask = mask ^ (mask << 1) // isolate first bit that matched with n

If you want the bit position of that first one, simply add a integer counter that starts at 31 and gets decremented on each loop iteration.

One downside to this is if n == 0, it's an infinite loop, so test for zero beforehand.

Alessandro Jacopson · Answer 5 · 2019-12-06T07:32:10.250

If you are interested in a C/C++ solution you can have a look at the book "Matters Computational" by Jörg Arndt where you have these functions defined in section "1.6.1 Isolating the highest one and finding its index":

static inline ulong highest_one_idx(ulong x)
// Return index of highest bit set.
// Return 0 if no bit is set.
{
#if defined  BITS_USE_ASM
    return  asm_bsr(x);
#else  // BITS_USE_ASM

#if  BITS_PER_LONG == 64
#define MU0 0x5555555555555555UL  // MU0 == ((-1UL)/3UL) == ...01010101_2
#define MU1 0x3333333333333333UL  // MU1 == ((-1UL)/5UL)   == ...00110011_2
#define MU2 0x0f0f0f0f0f0f0f0fUL  // MU2 == ((-1UL)/17UL)  == ...00001111_2
#define MU3 0x00ff00ff00ff00ffUL  // MU3 == ((-1UL)/257UL)  == (8 ones)
#define MU4 0x0000ffff0000ffffUL  // MU4 == ((-1UL)/65537UL) == (16 ones)
#define MU5 0x00000000ffffffffUL  // MU5 == ((-1UL)/4294967297UL) == (32 ones)
#else
#define MU0 0x55555555UL  // MU0 == ((-1UL)/3UL) == ...01010101_2
#define MU1 0x33333333UL  // MU1 == ((-1UL)/5UL)   == ...00110011_2
#define MU2 0x0f0f0f0fUL  // MU2 == ((-1UL)/17UL)  == ...00001111_2
#define MU3 0x00ff00ffUL  // MU3 == ((-1UL)/257UL)  == (8 ones)
#define MU4 0x0000ffffUL  // MU4 == ((-1UL)/65537UL) == (16 ones)
#endif

    ulong r = (ulong)ld_neq(x, x & MU0)
        + ((ulong)ld_neq(x, x & MU1) << 1)
        + ((ulong)ld_neq(x, x & MU2) << 2)
        + ((ulong)ld_neq(x, x & MU3) << 3)
        + ((ulong)ld_neq(x, x & MU4) << 4);
#if  BITS_PER_LONG > 32
    r += ((ulong)ld_neq(x, x & MU5) << 5);
#endif
    return r;

#undef MU0
#undef MU1
#undef MU2
#undef MU3
#undef MU4
#undef MU5
#endif
}

where asm_bsr is implemented depending on your processor architecture

// i386
static inline ulong asm_bsr(ulong x)
// Bit Scan Reverse: return index of highest one.
{
    asm ("bsrl %0, %0" : "=r" (x) : "0" (x));
    return x;
}

or

// AMD64
static inline ulong asm_bsr(ulong x)
// Bit Scan Reverse
{
    asm ("bsrq %0, %0" : "=r" (x) : "0" (x));
    return x;
}

Go here for the code: http://jjj.de/bitwizardry/bitwizardrypage.html

EDIT:

This is the definition in the source for function ld_neq:

static inline bool ld_neq(ulong x, ulong y)
// Return whether floor(log2(x))!=floor(log2(y))
{ return ( (x^y) > (x&y) ); }

score -1 · Answer 6 · answered Jun 10 '13 at 16:01

-1

I don't know it this is too much tricky :)

I would convert the binary number to dec and then I would return the logaritm base 2 of the number directly (converted from float to int).

The solution is the (returned number + 1) bit starting from the right.

To your answer as far as I know its the left-most 1

answered Jun 10 '13 at 16:01

javier_domenech

5,995
6
37
59

Your algorithm is subject to rounding errors. That's no good. If you don't do any rounding at all, it's entirely wrong as the position is always integer. – John Dvorak Jun 10 '13 at 16:02
@JanDvorak its correct if you truncate instead of rounding, which a direct cast would do.... – Bob Vale Jun 10 '13 at 16:04
Yes, I was thinking of truncate... lapsus. But still remains an error if the number is 0 – javier_domenech Jun 10 '13 at 16:06
@BobVale still not. Rounding errors will cause you to sometimes underestimate the result by one – John Dvorak Jun 10 '13 at 16:06
well Dvorak, do you have the solution? You seem to be going around downvoting people's answer without offering a solution of your own :) – user2398832 Jun 10 '13 at 16:06
@user2398832 I upvoted Denis' answer – John Dvorak Jun 10 '13 at 16:06
no lapsus... I never said 'round' :) – javier_domenech Jun 10 '13 at 16:07
@JanDvorak Specifics, where would it under estimate? – Bob Vale Jun 10 '13 at 16:07
I said "round". If the log-2 implementation decides (it well could) that log_2(8) is 2.9999999993, then this algorithm will claim the top bit of 8 is at the second position (bit 1) whereas it's on the third (bit 2) – John Dvorak Jun 10 '13 at 16:09
Why would you want to convert to base 10 before calculating the logarithm? – svick Jun 10 '13 at 16:12
@BobVale similarly, any power of two is subject to errors. Not to mention that you should handle 0 gracefully. `log_2(0)` is `-inf`. You'd have to check what `int(-inf)` is or special-case zero. – John Dvorak Jun 10 '13 at 16:13
@JanDvorak ok, that makes sense now – Bob Vale Jun 10 '13 at 16:16
No need to be so strict... it's just one more point of view. Remember that humans aren't binaries – javier_domenech Jun 10 '13 at 16:17
@vivoconunxino umm... you have to be strict when telling the computer how to do something. Unless you can find an exact (up to less than 1 ulp) implementation of `log_2` (hint: you won't), this algorithm can produce incorrect results. – John Dvorak Jun 10 '13 at 16:19
Ok Jan, wroted down. Just wanted to remark, this is not an implementation, there is no code anywhere, just an idea – javier_domenech Jun 10 '13 at 16:21
Edited the question. Please read the question. – user2398832 Jun 10 '13 at 16:33

Stochastically · Answer 7 · 2013-06-10T16:31:41.067

-2

I tihnk this is kind of a trick question. The most significant bit is always going to be a 1 :-). If interviewers like lateral thinking, that answer should be a winner!

edited Jun 10 '13 at 16:31

answered Jun 10 '13 at 15:59

Stochastically

7,616
5
30
58

3

Even if the interpretation was correct (I believe it isn't), the answer is wrong. – John Dvorak Jun 10 '13 at 16:00
does that mean that my solution is correct? It has O(lgn) complexity since I'm doing a binary search. – user2398832 Jun 10 '13 at 16:01
If you interpret the question in the same way as "what is the most significant digit", then the answer must be 1! If that's right, then as I said, it's a trick question and no binary search is necessary :-). @JanDvorak, your comment is cryptic since you haven't given any answer yourself? – Stochastically Jun 10 '13 at 16:07
Even if it's a trick question, the answer isn't always `1`. In case of `0`, it's `0`. – John Dvorak Jun 10 '13 at 16:10
@Stochastically, so what do you suggest? Doing a linear search from left to right to find the first occurence of 1? Cuz if it is, then binary search is better in asymptotic complexity. – user2398832 Jun 10 '13 at 16:10
@JanDvorak that's like saying the most significant digit of "0987" is 0 when in fact it's a 9 because the 0 doesn't count. – Stochastically Jun 10 '13 at 16:12
3

@Stochastically no it isn't, it's like saying that the most significant digit in zero can't be a 1 because *there is no 1 in zero* – harold Jun 10 '13 at 16:13
2

@harold *zero* is obviously a special case. For non-zero numbers, the most significant *digit* of a binary number is going to be a 1. But if the interpretation of the question is the *position* of the most significant digit, then something like the answer of Denis is what's needed. – Stochastically Jun 10 '13 at 16:21
3

Of course, no argument there. I don't think either the OP or the interviewer intended for this to be the answer, but I thought it was funny and didn't downvote it. – harold Jun 10 '13 at 16:22

Algorithm to find the most significant bit

EDIT:

7 Answers7