1

I have two algorithms that solve this problem: Generate all sequences of bits within Hamming distance t. Now I want to compare them theoretically (I do have time measurements, if needed).

The iterative algorithm has a complexity of:

O((n choose t) * n)

where n is the length of the bit-string and t is the desired Hamming distance.

The recursive algorithm, they best we have so far is:

O(2^n)

but how to compare these two Time Complexities, without introducing t inside the second Time Complexity? For that reason, I am trying to do that, can you help?

The recursive algorithm:

// str is the bitstring, i the current length, and changesLeft the
// desired Hamming distance (see linked question for more)
void magic(char* str, int i, int changesLeft) {
        if (changesLeft == 0) {
                // assume that this is constant
                printf("%s\n", str);
                return;
        }
        if (i < 0) return;
        // flip current bit
        str[i] = str[i] == '0' ? '1' : '0';
        magic(str, i-1, changesLeft-1);
        // or don't flip it (flip it again to undo)
        str[i] = str[i] == '0' ? '1' : '0';
        magic(str, i-1, changesLeft);
}
Community
  • 1
  • 1
gsamaras
  • 71,951
  • 46
  • 188
  • 305
  • n choose t is n!/(t!.(n-t)!) This has a minimum of 1 (with t==0), and a maximum of n!/(n/2)!² (with t==n/2). – Martin Bonner supports Monica Dec 12 '16 at 17:03
  • No. t==n/2 is where the iterative complexity is maximum! – Martin Bonner supports Monica Dec 12 '16 at 17:07
  • interesting none of the answering algorithms use `next_permutation` on another vector of bits. – jaggedSpire Dec 12 '16 at 17:08
  • Finally, observe that the recursive complexity is better than 2^n for small t. If t==0, it will return imediately, and if t=1 the "flipped" branch won't recurse further. In addition, if you check `changesLeft > i` and return, you can return early if t is close to n. – Martin Bonner supports Monica Dec 12 '16 at 17:09
  • Because I don't have an answer - just a set of observations. – Martin Bonner supports Monica Dec 12 '16 at 17:20
  • oh, they don't use next_permutation because vector is a special snowflake with weird iterators. – jaggedSpire Dec 12 '16 at 17:28
  • Is the second one really independent of `t`? Is this even possible? (asking, knowing nothing about the subject matter). – Will Ness Dec 12 '16 at 17:42
  • I don't think so @WillNess, it's just lack of analysis, that's why I posted here for help, to somehow introduce `t` in Time Complexity, so that I can later compare the two algorithms! – gsamaras Dec 12 '16 at 17:45
  • ah, thank you. You can compare the slices for several fixed `t`s, even empirically, i.e. time measurements + [Empirical Orders of Growth](http://en.wikipedia.org/wiki/Analysis_of_algorithms#Empirical_orders_of_growth) in `n`; though if it's really exponential then you need to tweak the formulas, i.e. to calculate not the power coefficient, but the exponent coefficient, and draw the plot of that coefficient vs `n` for each `t`, to "see" which algo is the better behaving one. To compare, O(n^2) is ~n^2.0 (duh) and O(n log n) is usually ~n^1.15..1.05, depending on `n`. – Will Ness Dec 12 '16 at 17:54
  • Wouldn't it be simpler to insert `t` into the time complexity of the recursive algorithm? I thank you for taking the time to help @WillNess! – gsamaras Dec 12 '16 at 17:57
  • Ah, I misunderstood. You want all patterns within the Hamming distance t, not with an exact Hamming distance of t. – samgak Dec 12 '16 at 21:07
  • @samgak that's okay. You can see the exact output of the algorithm on the linked question. :) Do you have any idea on how to introduce `t` into the Time Complexity of the recursive version? – gsamaras Dec 12 '16 at 21:10
  • I tried running the code, and it outputs patterns with the exact Hamming distance t, not within Hamming distance t. For example if you give it the string `"0000"` and t = 2, it won't print out `0001`, but it will print out `0011`. – samgak Dec 12 '16 at 21:39
  • It has to be called in a loop for `i = 1; i <= t` @samgak to give all the distances. – gsamaras Dec 12 '16 at 21:41
  • Ok. So does the time complexity include that outer loop? You wouldn't need the loop if you changed your first test to `if ((changesLeft == 0) || (i < 0))` (except that it will print out the original string with a hamming distance of zero) – samgak Dec 12 '16 at 21:45
  • 1
    No @samgak (sorry for the confusion, really). Fix `t`, let's say to `n/2` and compute the Time Complexity of the function, this would be enough, I guess. – gsamaras Dec 12 '16 at 21:50

2 Answers2

1

At the most general level of time complexity, we have a "worst case" of t = n/2. Now, fix t and gradually increment n. Let's take a starting point of n=8, t=4

C(8 4) = 8*7*6*5*4*3*2*1 / (4*3*2*1 * 4*3*2*1)
    = 8*7*6*5 / 24
n <= n+1 ... n choose t is now

C(9 4) = ...
    = 9*8*7*6 / 24
    = 9/5 of the previous value.

Now, the progression is a little easier to watch.

C( 8 4) = 8*7*6*5 / 24
C( 9 4) =  9/5 * C( 8 4)
C(10 4) = 10/6 * C( 9 4)
C(11 4) = 11/7 * C(10 4)
...
C( n 4) = n/(n-4) * C(n-1 4)

Now, as lemmas for the student:

  • Find the base complexity, n! / ( (n-1)! ^ 2)
  • Find the combinatorial complexity of product (n / (n-c)) for constant c
Prune
  • 76,765
  • 14
  • 60
  • 81
1

The recursive algorithm is O((n choose t) * n) too, by an analysis that charges to each printed combination the cost of the entire call stack at the time that it is printed. We can do this because every invocation of magic (except the two O(1) leaf calls where i < 0, which we could easily do away with) prints something.

This bound is best possible if you assign printing its true cost. Otherwise, I'm pretty sure that both analyses can be tightened to O(n choose t) excluding printing for t > 0, with details in Knuth 4A.

David Eisenstat
  • 64,237
  • 7
  • 60
  • 120