5

I'm looking for an algorithm to solve, or at least a proper name for the following problem:


I have a set B of bitstrings. The algorithm should find a minimal (defined as "having fewest bits set") bitstring S such that:

For all b in B, there exists a shift N (in ) such that (S << N) & b == b.


If it helps, each b fits in a machine word, and |B| is on the order of a couple hundred.


I think we can assume (without loss of generality) that the LSB of S and each b is 1.

This looks to me like some kind of multiple sequence-alignment problem.

If we can find each Ni for each bi in B (i = 1 .. |B|), it looks like S is just the bitwise-or across all (bi >> Ni).

My intuition is, the first step is to remove every b from B for which there exists another bitstring c in B and some shift M such that b & (c << M) == b. What's next?

AlliedEnvy
  • 376
  • 1
  • 5
  • 1
    The shift amount being in ℤ is interesting, does that mean that a negative left shift acts as a right shift? – harold Apr 14 '16 at 16:06
  • I think this is equal to ["Shortest common supersequence problem"](https://en.wikipedia.org/wiki/Shortest_common_supersequence_problem) for a set of strings. It is NP-Hard in general, but for your particular case it should not be too hard to solve it. – Evgeny Kluev Apr 14 '16 at 16:09
  • 1
    @harold Yes, negative left shifts act as right shifts. – AlliedEnvy Apr 14 '16 at 16:16
  • @EvgenyKluev Thanks, this does look related to that problem. The main difference is that a 1 in the supersequence can match a 0 in the set of sequences to be matched. I suppose the supersequence could consist of Xs and 0s, where X means "don't care" and 0 means "must have a 0". – AlliedEnvy Apr 14 '16 at 16:23
  • Another important difference to the "shortest common supersequence problem" which tries to minimize the length, is that AlliedEnvy tries to minimize the number of bits that are set to 1. – maraca Apr 15 '16 at 22:24

1 Answers1

0

My specific instance of B was small enough to be amenable to brute force, with a few tricks to prune the search.

With the following functions already defined,

  • snoob, returns the next-highest number with the same number of bits set (as defined in Hacker's Delight fig. 2-1 (or originally, HAKMEM item 175))
  • popcount, returns the number of 1-bits in its argument
  • clz, returns the number of consecutive zeroes in the most-significant end of its argument

Pseudocode for my solution is as follows:

min_ones = max popcount(b) for b in B
max_ones = popcount(~0)

for i = 0 .. |B|-1:
    while !(B[i] & 1): B[i] >>= 1

found_ones = false
for ones = min_ones .. max_ones:
    if found_ones: break
    for S = (1 << ones)-1; clz(S) > 0; S = snoob(S):
        if !(S & 1): continue
        for b in B:
            found = false
            for N = 0 .. clz(b) - clz(S):
                if (S >> N) & b == b:
                    found = true
                    break
            if !found: break
         if found:
             print(S)
             found_ones = true

The first loop shifts right each b so its LSB is 1; this allows us to only use right shifts for N later.

The loop over S starts with the smallest number with ones bits set; the loop stop condition isn't quite correct, but it works well enough for my purposes.

The loop over N starts with the LSB of S and b aligned, and then goes to the most-significant one-bits of S and b aligned.


For now, I'm leaving the question as unsolved to see if a proper non-brute-force solution comes around, or until someone says the problem is NP-hard.

AlliedEnvy
  • 376
  • 1
  • 5