4

I would like to solve a minimum set cover problem of the following sort. All the lists contain only 1s and 0s.

I say that a list A covers a list B if you can make B from A by inserting exactly x symbols.

Consider all 2^n lists of 1s and 0s of length n and set x = n/3. I would to compute a minimal set of lists of length 2n/3 that covers them all.

Here is a naive approach I have started on. For every possible list of length 2n/3 I create a set of all lists I can create from it using this function (written by DSM).

from itertools import product, combinations

def all_fill(source, num):
    output_len = len(source) + num
    for where in combinations(range(output_len), len(source)):
        # start with every possibility
        poss = [[0,1]] * output_len
        # impose the source list
        for w, s in zip(where, source):
            poss[w] = [s]
        # yield every remaining possibility
        for tup in product(*poss):
            yield tup

I then create the set of sets as follows using n = 6 as an example.

n = 6
shortn = 2*n/3
x = n/3
coversets=set()
for seq in product([0,1], repeat = shortn):
    coversets.add(frozenset(all_fill(seq,x)))

I would like to find a minimal set of sets from coversets whose union is allset = set(product([0,1], repeat=n)).

In this case, set(all_fill([1,1,1,1],2)), set(all_fill([0,0,0,0],2)), set(all_fill([1,1,0,0],2)), set(all_fill([0,0,1,1],2)) will do.

My aim is to solve the problem for n = 12. I am happy to use external libraries if that will help and I expect the time to be exponential in n in the worst case.

Simd
  • 19,447
  • 42
  • 136
  • 271
  • I'm not actually sure what you're asking. Could you explain it more graphically or something? – Veedrac Sep 24 '13 at 13:12
  • @Veedrac How about a smaller example. Set n = 3 and x = 1 so there are 8 possible lists of 1s and 0s. I want to find the smallest set of lists of length 2 that cover all 8 of these. So take [0,0], [1,1]. We can make any list of length 3 by just adding one 1 or 0 to either of those two lists. Is that any clearer? My approach is to convert it into a minimal set cover problem. – Simd Sep 24 '13 at 13:16
  • I suggest trying (or at least starting with) pairs of strings x and y such that y = reversebits(x) = not(x). I *think* the symmetry of the problem means that if some string x is part of a minimal covering set but does not cover every string by itself, then it's possible to make a minimal covering set by adding reversebits(x) and not(x) (as well as possibly other strings). If true then choosing string pairs constrained in this way means you'll need half as many strings as you otherwise would. – j_random_hacker Sep 24 '13 at 16:05
  • And here is an easy upper bound: if you have a set of k strings S[i], 1 <= i <= k, that covers all lists of length n, then you can build a size 4k solution for lists of length n+3 using `{ 0.S[i].0, 1.S[i].1, 0.S[i].1, 1.S[i].0 }` for all 1 <= i <= k, where `.` represents concatenation. Obviously this is loose, since from your examples we have f(3) = 2 but f(6) = 4 < 8. – j_random_hacker Sep 24 '13 at 16:17
  • @j_random_hacker Thanks. I really want an exact algorithm here. I have good upper and lower bounds but I would like to compute optimal solutions for small cases too. – Simd Sep 24 '13 at 16:52
  • Fair enough. The straightforward way is the usual brute-force recursion: first decide the first 2n/3-length string (there are 2^(2n/3) possible choices); for each, fix that choice and then decide the next 2n/3-length string (which can be one of 2^(2n/3)-1 possible choices), etc. After adding each string, test whether all 2^n strings are generated (it may be faster to keep a list of the strings that cannot be generated with the strings chosen so far, and whittle this down). This can be much improved by always choosing strings in *increasing* lexicographical order, though it's still exponential. – j_random_hacker Sep 24 '13 at 21:34
  • Another speedup: both the string consisting of 2n/3 `1`s and the string consisting of 2n/3 `0`s must always be included in any cover, because otherwise there's no way to generate the string consisting of n `1`s, or of n `0`s. – j_random_hacker Sep 24 '13 at 21:42

1 Answers1

3

I’ve written a small program to write an integer program to be solved by CPLEX or another MIP solver. Below it is a solution for n=12.


from collections import defaultdict
from itertools import product, combinations

def all_fill(source, num):
    output_len = (len(source) + num)
    for where in combinations(range(output_len), len(source)):
        poss = ([[0, 1]] * output_len)
        for (w, s) in zip(where, source):
            poss[w] = [s]
        for tup in product(*poss):
            (yield tup)

def variable_name(seq):
    return ('x' + ''.join((str(s) for s in seq)))
n = 12
shortn = ((2 * n) // 3)
x = (n // 3)
all_seqs = list(product([0, 1], repeat=shortn))
hit_sets = defaultdict(set)
for seq in all_seqs:
    for fill in all_fill(seq, x):
        hit_sets[fill].add(seq)
print('Minimize')
print(' + '.join((variable_name(seq) for seq in all_seqs)))
print('Subject To')
for (fill, seqs) in hit_sets.items():
    print(' + '.join((variable_name(seq) for seq in seqs)), '>=', 1)
print('Binary')
for seq in all_seqs:
    print(variable_name(seq))
print('End')

MIP - Integer optimal solution:  Objective =  1.0000000000e+01
Solution time =    7.66 sec.  Iterations = 47411  Nodes = 337

CPLEX> Incumbent solution
Variable Name           Solution Value
x00000000                     1.000000
x00000111                     1.000000
x00011110                     1.000000
x00111011                     1.000000
x10110001                     1.000000
x11000100                     1.000000
x11001110                     1.000000
x11100001                     1.000000
x11111000                     1.000000
x11111111                     1.000000
All other variables matching '*' are 0.
CPLEX> 
David Eisenstat
  • 64,237
  • 7
  • 60
  • 120