4

I need a function with a header like this:

bool is_prefix(int a, int b, int* c) {
    // ...
}

If a is, read as a binary number string, a prefix of b, then set *c to be the rest of b (i.e. "what b has more than a") and return true. Otherwise, return false. Assume that binary strings always start with "1".

Of course - it is easy to do by comparing bit by bit (leftshift b until b==a). But is there a solution which is more efficient, without iterating over the bits?

Example: a=100 (4), b=1001 (9). Now set *c to 1.

Johannes
  • 2,901
  • 5
  • 30
  • 50
  • Can you give a numeric example? Should the overlap be on the high or the low end of b? – Pixelchemist Jun 18 '13 at 16:07
  • Does the first bit of the prefix have to begin with `1`? Because unless the bits in `a` are in reverse order to those in `b`, I don't see how you'd be able to limit the bits in `a` to just the prefix as `a` and `b` are going to be the same size as integer values. – JAB Jun 18 '13 at 16:08
  • Shift the low-order bits of b off to the right until a and b are equal. – Hot Licks Jun 18 '13 at 16:08
  • @HotLicks that's basically comparing bit by bit, except it makes `a` not being a prefix a worst-case scenario as `b` would have to be shifted and compared `sizeof(int)*8` times before you could be sure `a` wasn't the prefix. – JAB Jun 18 '13 at 16:08
  • @JAB - No, it's comparing a max of N times for an N-bit-wide value. – Hot Licks Jun 18 '13 at 16:10
  • How does your easy solution work? – n. m. could be an AI Jun 18 '13 at 16:10
  • @Pixelchemist I added an example. – Johannes Jun 18 '13 at 16:12
  • @JAB yes, consider a and b as beginning with 1. – Johannes Jun 18 '13 at 16:13
  • @JAB - You can, of course, "short-circuit" things a bit by testing for zero leading bytes in the arguments and by giving up when b has become smaller than a. But the basic algorithm is sound, and reasonably fast. – Hot Licks Jun 18 '13 at 16:16
  • @n.m. Basically, get log2 of both numbers, then position readers on the first bits of both, then move both readers forward bit by bit. – Johannes Jun 18 '13 at 16:16
  • Note that taking the log2 or some such will be slower than iterating over bits in most cases. – Hot Licks Jun 18 '13 at 16:18
  • @HotLicks A more consistent case would be shifting up such that the first bits of both `a` and `b` are 1 (could be determined using a basic numeric calculation if I'm not mistaken, but then again maybe I am, but that in itself could possibly be obviated by using a reference array/etc.), then checking if `a` is able to mask the number of upper bits of `b` equal to `sizeof(int) * 8 - ashift`), and if so take that masked value of `b` and shift it down back to normal and store in `c`. – JAB Jun 18 '13 at 16:20
  • 3
    That's a hard solution. An easy solution would be to find such a number N that a>>N==b. Iff such N exists, b is a prefix of a. You can use binary search if linear search over bits-per-your-integral-type is too slow. – n. m. could be an AI Jun 18 '13 at 16:23
  • @n.m. Binary search seems to be a useful idea! – Johannes Jun 18 '13 at 16:25
  • @JAB - There are a half-dozen different strategies for short-circuiting, but they all risk being more expensive than a straight-forward solution. – Hot Licks Jun 18 '13 at 16:26
  • It may or may not give you a performance increase (my bet is on "not" but that's just a guess). – n. m. could be an AI Jun 18 '13 at 16:32
  • 1
    Looping over a maximum of 64 (or 32) bits is very unlikely to be any kind of performance issue. You should aim for portability and readability instead. – oseiskar Jun 18 '13 at 16:52
  • @oseiskar Depends on how often the function will be used, really. If it's called repeatedly, using a basic optimization to reduce the calculation time is probably not a bad idea (assuming the compiler wouldn't be doing it in the first place, which could be a possibility). Going by http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogLookup, it seems using a lookup table would be more efficient than down-shifting `b` when `a` is small and `b` is large, though Johannes' example utilizes a small `b`. – JAB Jun 18 '13 at 17:09
  • (And a case with `b` and `a` both large but being only a few bits different in length would similarly have potential to be quicker than the lookup table.) – JAB Jun 18 '13 at 17:10
  • Someone once said something about "premature optimization". – Hot Licks Jun 20 '13 at 00:23

2 Answers2

3

You can use your favorite "fast" method to find the highest set bit. Let's call the function msb().

bool is_prefix (int a, int b, int *c) {
    if (a == 0 || b == 0 || c == 0) return false;
    int d = msb(b) - msb(a);
    if (d < 0) return false;
    if ((b >> d) == a) {
        *c = b ^ (a << d);
        return true;
    }
    return false;
}

Shift b so its high order bit aligns with a, and compare that with a. If they are equal, then a is a "prefix" of b.

This algorithm's performance depends on the performance of msb(). If it is constant, then this algorithm is constant. If msb() is expensive, then the "easy approach" may be the fastest approach.

Community
  • 1
  • 1
jxh
  • 69,070
  • 8
  • 110
  • 193
0

I'm not too sure, but would something like the following work:

bool
is_prefix( unsigned a, unsigned b, unsigned* c )
{
    unsigned mask = -1;
    while ( mask != 0 && a != (b & mask) ) {
        a <<= 1;
        mask <<= 1;
    }
    c = b & ~mask;
    return mask != 0;
}

(Just off the top of my head, so there could be errors.)

James Kanze
  • 150,581
  • 18
  • 184
  • 329