Why does arbitrary selection of next cell affect performance in backtracking Sudoku solver?

Question

I've made a simple Sudoku class (in Python). Its main attributes are the following:

def __init__(self, it: iter = None):
    self.grid = [[None for _ in range(9)] for _ in range(9)]
    self.usedInRow = [[False for _ in range(10)] for _ in range(9)]
    self.usedInCol = [[False for _ in range(10)] for _ in range(9)]
    self.usedInBox = [[False for _ in range(10)] for _ in range(9)]

def writeCell(self, pos: tuple, val: int) -> bool: [...]

def eraseCell(self, pos: tuple): [...]

self.grid keeps track of the actual numbers in the Sudoku. self.usedInRow, self.usedInCol, and self.usedInBox are boolean arrays that keep track of which numbers have been used in each row/column/square. writeCell is a method that tries to write a number in an empty cell and returns whether it succeeded (according to the rules of Sudoku). Finally, eraseCell replaces a cell's content with None.

Inside the class I've written a recursive-backtracking solver method. When implementing the solver, I tried different approaches for selecting which cell to fill in next. The first thing I tried was simply starting from (0, 0) and incrementing the position left-to-right and top-to-bottom (with columns as the 'small' unit and rows as the 'big' unit).

But that's not general enough (if for some reason the solver started solving from a position other than (0, 0) every empty cell before that would be skipped). So I added a member to keep track of empty cells:

def __init__(self, it: iter = None):
    [...]
    self.emptyCells = [(i, j) for i in range(9) for j in range(9)]

I modified the writeCell and eraseCell methods to mantain the correct record. At this point the solver looks more or less like this:

def solve(self) -> bool:
    [...]

    def _solve() -> bool:
        if not self.emptyCells: return True
        pos = self.emptyCells[0]

        for n in range(1, 10):
            if self.writeCell(pos, n):
                if _solve(): return True
                self.eraseCell(pos)
        else: return False

    success = _solve()
    [...]
    return success

At this point I thought that maybe it would be better if the self.emptyCells member was a set instead of a list: I didn't want duplicates, didn't need indexed access (I thought that it wouldn't matter the order in which cells were filled in), and wanted fast deletion. So I modified the self.emptyCells attribute

self.emptyCells = set((i, j) for i in range(9) for j in range(9))

and the methods that referred to it (for example I used .discard() instead of .remove() in writeCell). The solver method now looked like this:

def solve(self) -> bool:
    [...]

    def _solve() -> bool:
        if not self.emptyCells: return True
        pos = self.emptyCells.pop()  # Get arbitrary position

        for n in range(1, 10):
            if self.writeCell(pos, n):
                if _solve(): return True
                self.eraseCell(pos)
        else: 
            self.emptyCells.add(pos)  # Backtrack
            return False

    success = _solve()
    [...]
    return success

To my surprise, the second approach (using a set) took about one minute to solve an empty Sudoku, whilst the first approach (using a list) took about two seconds. Why is that? Does the arbitrarily chosen position conferred by set.pop() affect the efficiency of the algorithm, compared to filling cells in order?

Full code

Using a list

import queue
import threading
import colorama

colorama.init()  # For ANSI escape codes


class Sudoku:
    @staticmethod
    def isSudoku(S: iter):
        return len(S) == 9 \
               and all(len(row) == 9 for row in S) \
               and all(num in range(1, 10) for row in S for num in row)
    
    def __init__(self, it: iter = None):
        self.grid = [[None for _ in range(9)] for _ in range(9)]
        self.usedInRow = [[False for _ in range(10)] for _ in range(9)]
        self.usedInCol = [[False for _ in range(10)] for _ in range(9)]
        self.usedInBox = [[False for _ in range(10)] for _ in range(9)]
        self.emptyCells = [(i, j) for i in range(9) for j in range(9)]
    
        if it is not None:
            for i in range(9):
                for j, num in zip(range(9), it):
                    if num is not None:
                        if not self.writeCell((i, j), num): raise ValueError("Invalid Sudoku")
    
    def __iter__(self):
        return iter(self.grid)
    
    def __len__(self):
        return 9
    
    def __str__(self):
        def numsToStr(row):
            for num in row: yield num if num is not None else ' '
    
        def rowsBlock(i):
            yield ''
            for row in self.grid[3 * i:3 * (i + 1)]:
                yield '|{} {} {}|{} {} {}|{} {} {}|'.format(*numsToStr(row))
            yield ''
    
        def blocks():
            yield ''
            for i in range(3): yield '\n'.join(rowsBlock(i))
            yield ''
    
        return ('-' * 19).join(blocks())
    
    def __getitem__(self, key: tuple):
        if not len(key) == 2: raise TypeError("Two indices needed")
        r, c = key
        return self.grid[r][c]
    
    def __setitem__(self, key: tuple, value: int):
        if not len(key) == 2: raise TypeError("Two indices needed")
        self.eraseCell(key)
        self.writeCell(key, value)
    
    def _canWrite(self, r: int, c: int, b: int, val: int) -> bool:
        if self.usedInRow[r][val] or self.usedInCol[c][val] or self.usedInBox[b][val] \
            or self.grid[r][c] is not None: return False
        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = True
        return True
    
    def writeCell(self, pos: tuple, val: int) -> bool:
        if val is None: return True  # Ignore writing none
        r, c = pos
        b = 3*(r//3) + c//3
    
        if not self._canWrite(r, c, b, val): return False
        # noinspection PyTypeChecker
        self.grid[r][c] = val
        self.emptyCells.remove(pos)
        return True
    
    def eraseCell(self, pos: tuple):
        r, c = pos
        b = 3*(r//3) + c//3
    
        val = self.grid[r][c]  # Get old value for reference
        if val is None: return  # If there wasn't anything in the first place
        self.grid[r][c] = None  # Actually erase
        self.emptyCells.append(pos)
        # noinspection PyTypeChecker
        self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False
    
    def solve(self) -> bool:
        printQueue = queue.PriorityQueue()
    
        def printer():
            while True:
                priority, item = printQueue.get()
                if item is StopAsyncIteration:
                    break
                print(item)
                print('\x1b[13A', end='')
                printQueue.task_done()
    
        printThread = threading.Thread(target=printer, name="PrintThread")
        printThread.start()
    
        def _solve() -> bool:
            if not self.emptyCells: return True
            pos = self.emptyCells[0]
    
            for n in range(1, 10):
                if self.writeCell(pos, n):
                    if _solve(): return True
                    printQueue.put((1, self))
                    self.eraseCell(pos)
            else:
                return False
    
        success = _solve()
        printQueue.put((0, StopAsyncIteration))
        printThread.join()
        return success

Using a set

The only things that change are the following.

_solve() internal method:

    def _solve():
        if not self.emptyCells: return True
        pos = self.emptyCells.pop()

        for n in range(1, 10):
            if self.writeCell(pos, n):
                if _solve(): return True
                printQueue.put((1, self))
                self.eraseCell(pos)
        else:
            self.emptyCells.add(pos)
            return False

eraseCell and writeCell:

def writeCell(self, pos: tuple, val: int):
    if val is None: return True  # Ignore writing none
    r, c = pos
    b = 3*(r//3) + c//3

    if not self._canWrite(r, c, b, val): return False
    # noinspection PyTypeChecker
    self.grid[r][c] = val
    self.emptyCells.discard(pos)
    return True

def eraseCell(self, pos: tuple):
    r, c = pos
    b = 3*(r//3) + c//3

    val = self.grid[r][c]  # Get old value for reference
    if val is None: return  # If there wasn't anything in the first place
    self.grid[r][c] = None  # Actually erase
    self.emptyCells.add(pos)
    # noinspection PyTypeChecker
    self.usedInRow[r][val] = self.usedInCol[c][val] = self.usedInBox[b][val] = False

Note

The reason I don't put back pos in the list version—and also why I get pos as self.emptyCells[0] instead of self.emptyCells.pop(0)—is because writeCell and eraseCell already take care of that: if writeCell succeeds in writing a (previously empty) cell, then it will call self.emptyCells.remove(pos)—and here's another reason why I'm not popping the list: otherwise I would be removeing a non-existent element in the list, and would have to catch the ValueError.

After that, if backtracking occurs (i.e. the _solve() recursive call returns False), eraseCell will put pos back in the list. Thus, if the for loop "fails"—i.e., the else branch is entered—, the last eraseCell will already have put pos back.

On the other hand, when using a set, there is no way of accessing an element without removing it from the set (to my knowledge). Thus, I'm forced to pop from it and put it back in manually. That's why discard is used instead of remove in the set-version of writeCell, since pos may already have been popped by solve_.

An alternative maybe would be to save pos = self.emptyCells.pop() and add it straight back (self.emptyCells.add(pos)), and then proceed with solve_. Then I could change discard to remove.

Usage example

sudoku = Sudoku()
sudoku.solve()
print(sudoku)

Profiling results

I profiled both versions but didn't find any obvious culprit—that is, the proportions of cumulative time taken by each method were very similar, although the actual absolute time they took was just scaled up in the set version with respect to the list version.

Update — `pos` counts

I added an array to keep track of how many times each pos was selected to be explored, and I printed it at the end of the solve. Here are the results.

List version:

[[   1    1    1    1    1    1    1    1    1]
 [   1    1    1    1  145  671 7169 9869 7606]
 [ 774 1115  645    7 2066 1240  451  463  554]
 [ 511  466  504  424  554  422  358  502  566]
 [ 545  224  507  539  578  404  487  357  574]
 [ 421  421  452  436  465  406  326  621  483]
 [ 436  519  284  462  545  534  410  563  730]
 [ 433  561  311  397  365  254  308  708  384]
 [ 445  559  558  413  513  572  564  511  686]]

Set version (as far as I can tell, results are consistent):

[[ 19895      1  91239  66620      1      1  76794      1      1]
 [     1      1  77617  43061  95435      1      1  49617  77772]
 [ 54441      1      1  96081  79660      1      1      1  89575]
 [101233      1      1  85645  55172      1      1      1  54613]
 [     1      1  71512  61978   5531      1  76794  82692  74311]
 [ 96268      1  57136  92052      1      1  86104  91911      1]
 [     1  98119  95706  10680      1  97232  67210      1  39094]
 [ 53447      1      1      1  89846      1      1  88072      1]
 [ 74468  19103      1  39934  75698      1      1  90778  53260]]

Nice code! Have you tried profiling, e.g. `cProfile`, to see where the performance boost takes place? — Dascienz, Jan 04 '19 at 17:25
@Dascienz Thanks! I've tried, but found no significant results. See the newly added last paragraph. — Anakhand, Jan 04 '19 at 19:11
Perhaps this answer would help you understand what's going on then? https://stackoverflow.com/questions/8929284/what-makes-sets-faster-than-lists — Dascienz, Jan 04 '19 at 19:20
@Dascienz Good resource! But the problem is exactly that. It's not the `set` that's faster than the list. It's the other way around, and not by little! Which is surprising... I guess it doesn't have to do with the data structure itself but rather the way `set.pop()` retrieves items (arbitrary instead of consistent). — Anakhand, Jan 04 '19 at 19:23
Have you tried printing `pos` values during execution to see what happens during `_solve`? I only ask because I noticed in your `list` implementation you don't add your `pos` value back into `self.emptyCells` on your `else` condition, but you do so in the `set` method? Could be I just don't fully understand your algorithm after my cursory glance though! — Dascienz, Jan 04 '19 at 19:31
@Dascienz I failed to fully explain that. Basically, the reason I don't put back `pos` in the `list` version—and also why I get `pos` as `self.emptyCells[0]` instead of `self.emptyCells.pop(0)`—is because `writeCell` and `eraseCell` already take care of that: if `writeCell` succeeds in writing a (previously empty) cell, then it will call `self.emptyCells.remove(pos)`—and here's another reason why I'm not `pop`ping the list: otherwise I would be `remove`ing a nonexistant element in the list, and would have to catch the `ValueError`. — Anakhand, Jan 05 '19 at 12:20
@Dascienz After that, if backtracking occurs (i.e. the `_solve()` recursive call returns `False`), `eraseCell` will put `pos` back in the list. Thus, if the `for` loop "fails"—i.e., the `else` branch is entered—, the last `eraseCell` will already have put `pos` back. On the other hand when using a `set`, there is no way of accessing an element without removing it from the set (to my knowledge). Thus, I'm forced to `pop` from it and put it in back manually. That's why `discard` is used instead of `remove` in the `set`-version of `writeCell`, since `pos` may already have been popped by `solve_` — Anakhand, Jan 05 '19 at 12:26
Did you count how may times the list vs the set function runs on the same sudoku? I suspect that the randomized version keeps adding and popping the same element several times. And therefore trying several paths more than once. — swenzel, Jan 05 '19 at 14:51
@swenzel raises a very valid point. You may want to evaluate which elements are being popped and added back in to see if the set implementation is looping on redundant elements. — Dascienz, Jan 07 '19 at 22:45
@swenzel It is probably that; see the update. Although it is interesting that the `set` version has more fixed points—i.e. cells that are explored only once during the solve—than the list version, whose fixed points are, ase expected, at the beginning of the grid, in sequence. — Anakhand, Jan 12 '19 at 13:28