Creating a list of possibilities from splitting strings at a certain character - Karnaugh Map

Question

I am trying to create "truth tables" for use in a Karnaugh map solver. The Karnaugh Map solver takes in the settings of the variables ABCD as a string of 4 numbers representing what they are set to. For example, ABCD could be anything from "0000" to "1111." The solver takes in the minterms (settings of ABCD which equal 1 in the truth table), but it outputs "*000" or other examples with * to represent that the Kmap has removed the first variable in the resulting equation. I am using this for an iterative process, so I need to be able to convert lists of these asterix-filled strings to a list of fully expanded strings :

[*000, 0*00] -> [0000, 1000, 0100]

The way I am currently doing it (with while loops) works but is very slow for the application I am using it for. It takes at least 10 minutes to get through, because I need to do this process around 1000 times (one for each piece of the dataset I am using) It goes quickly until around 430, when it slows significantly because of how long it takes to expand these complicated sequences. I can also confirm that it is not the Karnaugh map implementation I am using, because that solves the same long sequences instantly.

I'm not sure of a more pythonic way to do this which also accounts for things like "*11*", which have multiple spots where the expression needs to be expanded:

[*11*] -> [011*, 111*] -> [0110, 0111, 1110, 1111]

Any ideas which are efficient and pythonic?

My Code:

        ast = True
        while ast:
            new_string_result = copy.deepcopy(string_result)
            for i in range(len(string_result)):
                for c, char in enumerate(string_result[i]):
                    if char == "*":
                        # replace with 0 and 1
                        new_string_result.append(string_result[i][:c] + "0" + string_result[i][c+1:])
                        new_string_result.append(string_result[i][:c] + "1" + string_result[i][c+1:])

                if "*" in string_result[i]:    
                    # remove original (only once, even if multiple *)
                    # print("Removing ", string_result[i])
                    new_string_result.remove(string_result[i])
                    
            # print("Kmap result during fix iter: ", new_string_result)
            
            ast_found = False
            for i in range(len(new_string_result)):
                if "*" in new_string_result[i]:
                    ast_found = True
            
            # print(ast_found)
            ast = ast_found
            string_result = new_string_result

Please post your actual code, and define what you mean by "very slow". — Scott Hunter, Jun 07 '23 at 16:17
I think you are missing a `break` after `new_string_result.append(string_result[i][:c] + "1" + string_result[i][c+1:])` Otherwise, expanding `**` gives me 8 results. — Nick ODell, Jun 07 '23 at 16:53
@user19077881, I just ran my program with the input ['*110'] and it seems to have worked fine, returning ['0110' , '1110']. I just set string_result = ['*110'] prior to the while loop and printed the result after the while loop is complete. — Alenna Spiro, Jun 07 '23 at 17:47

score 1 · Accepted Answer · answered Jun 07 '23 at 18:50

I found a couple of ideas that are 2x-3x faster.

Reference code

This is the code from your original question, with a break statement added to remove duplicates from the output. (If you don't do this, then expanding **** gives 384 results.)

def kmap_orig(string_result):
    ast = True
    while ast:
        new_string_result = copy.deepcopy(string_result)
        for i in range(len(string_result)):
            for c, char in enumerate(string_result[i]):
                if char == "*":
                    # replace with 0 and 1
                    new_string_result.append(string_result[i][:c] + "0" + string_result[i][c+1:])
                    new_string_result.append(string_result[i][:c] + "1" + string_result[i][c+1:])
                    break

            if "*" in string_result[i]:    
                # remove original (only once, even if multiple *)
                # print("Removing ", string_result[i])
                new_string_result.remove(string_result[i])

        # print("Kmap result during fix iter: ", new_string_result)

        ast_found = False
        for i in range(len(new_string_result)):
            if "*" in new_string_result[i]:
                ast_found = True

        # print(ast_found)
        ast = ast_found
        string_result = new_string_result
    return string_result

Some comments on this code:

list.remove() is slow. In order to remove an element from a list, it must search through each element of the list, and check if they are equal. Once it finds the match, it has to move every element of the list after that point down one space. Avoid this if possible.
You're iterating over the list with for i in range(len(string_result)):, but the variable i is not used for anything besides string_result[i]. This is usually slower than for elem in string_result:, and more complex.

Test data

Here's the data that I used to test each of these. It is ten thousand random strings of *, 0, or 1, each ten characters long.

data = [''.join(random.choices(['0', '1', '*'], k=10)) for i in range(10000)]
correct_output = [set(kmap_orig([string])) for string in data]

(Note: when checking the correctness of all of the solutions below, I ignored duplicates and order of solutions.)

Iterative solution

This one is about 3x faster, mostly from the removal of the inner for loop.

def kmap_iterative(string_result):
    changed = True
    while changed:
        changed = False
        new_string_result = []
        for string in string_result:
            if "*" in string:
                c = string.find("*")
                assert c != -1
                # replace with 0 and 1
                prefix = string[:c]
                suffix = string[c+1:]
                new_string_result.append(prefix + "0" + suffix)
                new_string_result.append(prefix + "1" + suffix)
                changed = True
            else:
                # no * is present, so add unchanged
                new_string_result.append(string)
        string_result = new_string_result
    return string_result

Other optimizations done:

Rather than find string[:c] twice, do it once and save to a variable.
The code spent a lot of time finding the value of ast_found. It is faster to keep track of whether any modifications were done to the list, and exit if no modifications were made. The downside is that it loops one more time than necessary.
Since the inner loop is essentially just looking for the first instance of * in the string, replace it with a function from the standard library which does the same thing.
Avoid copy.deepcopy() and list.remove() by starting with an empty list, and adding elements from string_result which don't have any asterisks.

Recursive solution

This is about 20% slower than the iterative solution, but it is the shortest and (in my opinion) simplest solution.

def expand_string_recursive(string):
    if "*" not in string:
        yield string
    else:
        c = string.find("*")
        assert c != -1
        # replace with 0 and 1
        prefix = string[:c]
        suffix = string[c+1:]
        yield from expand_string_recursive(prefix + "0" + suffix)
        yield from expand_string_recursive(prefix + "1" + suffix)

def kmap_recursive(string_result):
    ret = []
    for string in string_result:
        ret.extend(expand_string_recursive(string))
    return ret

Itertools solution

The idea behind this solution is that what you're asking for is a Cartesian product of [0, 1], done n times, where n is the number of asterisks. There's something in the standard library which can find this. This is also 20% slower than the iterative solution.

def expand_string_itertools(string):
    blank_positions = [i for i, letter in enumerate(string) if letter == "*"]
    blanks = len(blank_positions)
    # Convert to list to allow mutability
    string = list(string)
    for product in itertools.product(["0", "1"], repeat=blanks):
        for position, replacement in zip(blank_positions, product):
            string[position] = replacement
        # Convert back to string
        yield ''.join(string)

def kmap_itertools(string_result):
    ret = []
    for string in string_result:
        ret.extend(expand_string_itertools(string))
    return ret

Benchmark results

kmap_orig
706 ms ± 5.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
kmap_iterative
201 ms ± 1.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
kmap_recursive
251 ms ± 2.33 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
kmap_itertools
243 ms ± 4.74 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

This makes so much sense. I also attempted the recursive method, and found it was slower so I hadn't mentioned it. Thank you so much! — Alenna Spiro, Jun 07 '23 at 19:18