How can I generate de Bruijn sequences iteratively?

Question

I am looking for a way to generate a de Bruijn sequence iteratively instead of with recursion. My goal is to generate it character by character.

I found some example code in Python for generating de Bruijn sequences and translated it into Rust. I am not yet able to comprehend this technique well enough to create my own method.

Translated into Rust:

fn gen(sequence: &mut Vec<usize>, a: &mut [usize], t: usize, p: usize, k: usize, n: usize) {
    if t > n {
        if n % p == 0 {
            for x in 1..(p + 1) {
                sequence.push(a[x])
            }
        }
    } else {
        a[t] = a[t - p];
        gen(sequence, a, t + 1, p, k, n);
        for x in (a[t - p] + 1)..k {
            a[t] = x;
            gen(sequence, a, t + 1, t, k, n);
        }
    }
}

fn de_bruijn<T: Clone>(alphabet: &[T], n: usize) -> Vec<T> {
    let k = alphabet.len();
    let mut a = vec![0; n + 1];
    let vecsize = k.checked_pow(n as u32).unwrap();
    let mut sequence = Vec::with_capacity(vecsize);
    gen(&mut sequence, &mut a, 1, 1, k, n);
    sequence.into_iter().map(|x| alphabet[x].clone()).collect()
}

However this is not able to generate iteratively - it goes through a whole mess of recursion and iteration which is impossible to untangle into a single state.

Why do you need to generate them iteratively? It seems like using recursion and Rust's generators/coroutines would be an easy solution. — Richard, Jul 03 '19 at 17:06
@Richard I'm hoping for something that works on stable as i'm not sure whether generators will remain the same until release. But if I cant find anything that does seem like a good idea. — Plasma_000, Jul 03 '19 at 17:11
@Richard [Lazy sequence generation in Rust](https://stackoverflow.com/a/30279122/155423) — Shepmaster, Jul 03 '19 at 17:37

score 0 · Answer 1 · edited Jul 05 '19 at 15:10

I am not familiar with Rust, so I programmed and tested it in Python. Since the poster translated the version in the question from a Python program, I hope it will not be a big issue.

# the following function treats list a as
# k-adic number with n digtis
# and increments this number returning
# the index of the leftmost digit changed
def increment_a7(a, k, n):
    digit= n-1
    a[digit]+= 1
    while a[digit] >= k and digit> 0:
        #a[digit]= 0
        a[digit]= a[0]+1
        a[digit-1]+= 1
        digit-= 1
    return digit

# the following function adds a to the sequence
# and takes into account, that the beginning of a
# could overlap with the end of sequence
# in that case, it just removes the overlapping digits
# from a before adding the remaining digits to sequence
def append_to_sequence(sequence, a, n):
    # here we can assume safely, that a
    # does not overlap completely with sequence[-n:]
    i= -1
    for i in range(n-1, -1, -1):
        found= True
        # check if the last i digits in sequence
        # overlap with the first i digits in a
        for j in range(i):
            if a[j] != sequence[-i+j]:
                # no, they don't overlap
                found= False
                break
        if found:
            # yes they overlap, so no need to 
            # continue the check with a smaller i
            break
    # now we can just append everything from
    # digit i (digit 0 - i-1 are swallowed)
    sequence.extend(a[i:])    
    return n-i

# during the operation we have to keep track of
# the k-adic numbers a, that already occured in
# the sequence. We store them in a set called used
# everytime we add something to the sequence
# we have to update it and add one entry for each
# digit inserted
def update_used(sequence, used, n, num_inserted):
    l= len(sequence)
    for i in range(num_inserted):
        used.add(tuple(sequence[-n-i:l-i]))

# the main work is done in the following function
# it creates and returns the generated sequence
def gen4(k, n):
    a= [0]*n
    sequence= a[:]
    used= set()
    # create a fake sequence to add the segments obtained by the cyclic nature
    fake= ([k-1] * (n-1))
    for i in range(n-1):
        fake.append(0)
        update_used(fake, used, n, 1)
    update_used(sequence, used, n, 1)
    valid= True
    while valid:
        # a is still a valid k-adic number
        # this means the generation process
        # has not ended
        # so construct a new number from the n-1
        # last digits of sequence
        # followed by a zero
        a= sequence[-n+1:]
        a.append(0)
        while valid and tuple(a) in used:
            # the constructed k-adict number a
            # was already used, so increment it
            # and try again
            increment_a(a, k, n)
            valid= a[0]<k
        if valid:
            # great, the number is still valid
            # and is not jet part of the sequence
            # so add it after removing the overlapping
            # digits and update the set with the segments
            # we already used
            num_inserted= append_to_sequence(sequence, a, n)
            update_used(sequence, used, n, num_inserted)
    return sequence

I tested the code above by generating some sequences with the original version of gen and this one using the same parameters. For all sets of parameters I tested, the result was the same in both versions.

Please note that this code is less efficient than the original version, especially if the sequence gets long. I guess the costs of the set operations has a non-linear influence on the runtime.

If you like, you can improve it further such as by using a more efficient way to store the used segments. Instead of operating on the k-adic representation (the a-list), you could use a multidimensional array instead.

Thank you - this works, but I was hoping for a solution that wasn't bruteforce (this kind of solution quickly explodes in complexity) — Plasma_000, Jul 05 '19 at 02:25

score 0 · Answer 2 · edited Jul 05 '19 at 15:08

Consider this approach:

Choose the first (lexicographically) representative from every necklace class

Here is Python code for generation of representatives for (binary) necklaces containing d ones (it is possible to repeat for all d values). Sawada article link
Sort representatives in lexicographic order
Make periodic reduction for every representative (if possible): if string is periodic s = p^m like 010101, choose 01

To find the period, it is possible to use string doubling or z-algorithm (I expect it's faster for compiled languages)
Concatenate reductions

Example for n=3,k=2:
Sorted representatives: 000, 001, 011, 111
Reductions: 0, 001, 011, 1
Result: 00010111

The same essential method (with C code) is described in Jörg Arndt's book "Matters Computational", chapter 18

A similar way is mentioned in the wiki

An alternative construction involves concatenating together, in lexicographic order, all the Lyndon words whose length divides n

You might look for effective way to generate appropriate Lyndon words

@Lance Now I'm too far this problem. I can make Python code to generate one sequence as described, but don't remember (or don't know) how to generate all sequences. — MBo, Apr 21 '22 at 10:04

How can I generate de Bruijn sequences iteratively?

2 Answers2