I've been writing programs to solve various number puzzles, but I'm constantly designing unreasonably complex search algorithms that I can't optimize.

For example, in one puzzle, you are given a 3x3 grid of numbers 1 through 9 below:

123
456
789

You're allowed to cycle the numbers in any row or column in any direction. Below is an example of shifting the top row of numbers to the right. The numbers will loop if they are at the edge of the grid.

123 -> 312
456    456
789    789

You must move the numbers in this manner until you create a magic square in which the sum of the numbers in each column, row, and diagonal is 15.
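
For concreteness, a single move can be sketched like this (a minimal Python sketch, assuming the grid is stored as a flat tuple in row-major order; the function names are just illustrative):

```python
def rotate_row(grid, r, right=True):
    """Cyclically shift row r of a 3x3 grid (a flat 9-tuple) by one cell."""
    g = list(grid)
    i = 3 * r
    row = g[i:i + 3]
    # shifting right moves the last element to the front (it wraps around)
    g[i:i + 3] = [row[-1]] + row[:-1] if right else row[1:] + [row[0]]
    return tuple(g)

def rotate_col(grid, c, down=True):
    """Cyclically shift column c of a 3x3 grid (a flat 9-tuple) by one cell."""
    g = list(grid)
    col = [g[c], g[c + 3], g[c + 6]]
    g[c], g[c + 3], g[c + 6] = [col[-1]] + col[:-1] if down else col[1:] + [col[0]]
    return tuple(g)
```

For example, `rotate_row((1,2,3,4,5,6,7,8,9), 0)` produces `(3,1,2,4,5,6,7,8,9)`, matching the 123 -> 312 example above.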

I've written a DFS brute-force algorithm to test all possible sequences of moves, but the number of possible move sequences grows exponentially (approx. 12 ^ [current turn]), rendering it useless.

It seems a BFS would be optimal to find the correct moves, but that would require me to store hundreds if not thousands of copies of the grid in order to backtrack!


I run into these kinds of problems all the time: DFS uses too much time, and BFS too much memory. I need help optimizing algorithms like these so they run faster and more efficiently. Maybe recognizing patterns and relations among the numbers, or giving the algorithm logic to work towards a goal, would help? (I don't know what that would entail.)

EDIT:

My fixed algorithm works like a charm. Learning how to number my permutations was essential. Thank you all!

xikkub
  • The transformations you can apply are related to permutation groups in abstract algebra, much in the same way that a Rubik's cube is. I don't know enough math to help move from this observation to anything more complex, but I'd bet that a more mathematically inclined SO-er could help explore the implications. – templatetypedef Jan 06 '12 at 07:06
  • For this particular problem, there are at most 9! (362880) different states. You can convert a position into an integer; there is no need to store the position explicitly. A BFS should be able to run pretty much instantly, and it's possible to use no more than a few megabytes. – rettvest Jan 06 '12 at 10:15
  • @stubbscroll I was considering something of that sort, though I'm not sure how I would convert a 3x3 grid to a unique integer within the 362880 limit without using prime numbers. How would you do that? This solution is promising. – xikkub Jan 06 '12 at 10:27
  • @CollinJSimpson When the grid numbers are viewed in 1D, the state is essentially a permutation of 9 digits. Using the "factorial number system" (see wikipedia article) one can convert from permutation to integer (between 0 and n!-1) and vice versa with rank and unrank routines. I'm pretty sure these routines are in Knuth, in either volume 1 or 4A. – rettvest Jan 06 '12 at 13:21
  • There is a paper with fast and short implementations of rank/unrank routines: http://webhome.cs.uvic.ca/~ruskey/Publications/RankPerm/MyrvoldRuskey.pdf – rettvest Jan 06 '12 at 13:46
  • Is there a guarantee that all of the 9! states are reachable? As an analogy, consider Rubik's cube - some states are not reachable. – Aaron McDaid Jan 06 '12 at 15:28
  • I don't know if all states are reachable, but the point of numbering permutations is to store a compressed version of each state. I'm still trying to understand how to do this, though. Is this a good example? [link](http://stackoverflow.com/questions/1506078/fast-permutation-number-permutation-mapping-algorithms) – xikkub Jan 06 '12 at 20:32

5 Answers


I'd suggest looking up memoization (caching the results of a function call based on the inputs so that the function is not recomputed for identical subsequent calls). Having understood memoization, I'd look up dynamic programming (still saving the results of the function, but also reordering computations to eliminate unnecessary calls). Some explanations of dynamic programming use a recursive definition of fibonacci, then fibonacci + memoization, and finish with computation reordering.
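
A minimal illustration of that progression, using the usual Fibonacci example (plain recursion, then memoization, then a reordered bottom-up computation):

```python
from functools import lru_cache

# Plain recursion: exponential, recomputes the same subproblems over and over.
def fib_naive(n):
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

# Memoization: identical calls are answered from a cache instead of recomputed.
@lru_cache(maxsize=None)
def fib_memo(n):
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

# Dynamic programming: reorder the computation bottom-up, so each value is
# computed exactly once and no recursion is needed at all.
def fib_dp(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```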

For DFS and BFS problems in general, the technique known as Branch and Bound might be of interest. The bounding part can give you substantial gains in some problems. Trimming a subtree one generation higher than with a less sophisticated bound eliminates many new branches in your search tree (alternate wording: since trees grow exponentially with depth, pruning your search early is important).

For your particular problem, I believe that optimizations are possible.

First, let's consider the DFS. I believe that all permutations of your board are reachable from any configuration of the board. As a consequence, DFS can be implemented without backtracking (though I'm guessing you knew that). Depth-only search? (EDIT: per Daniel Fischer, this is wrong. Only half of the states are reachable, though that doesn't affect the no-backtracking statement, since backtracking won't help you reach unreachable states.)

But, you might find that you don't want to move through many permutations simply to find that you haven't yet solved the problem. Backtracking might actually help. Or...

Consider looking at your end goal. Magic squares have some particular properties that you might exploit to choose your operations more carefully. For example, since the rows and columns must sum to 15, you know 9, 8, and 7 cannot share a row or a column with each other. Neither can 9 and 6. 6 can go with 8 and 1 or 7 and 2. 6 cannot share a column/row with 5 and 4 even though they sum to 15 because of the pigeon-hole principle (each row/column contains either 9, 8, or 7). In fact, you might find that your problem has a unique solution, modulo some sort of cyclic permutation in all-rows, all-columns, reflection, and transposition. The diagonal requirement further constrains the valid solutions.
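
That line-by-line reasoning can be checked mechanically, e.g. by enumerating every 3-element subset of 1..9 that sums to 15 (the only candidates for a row, column, or diagonal):

```python
from itertools import combinations

# All 3-element subsets of 1..9 summing to 15: the only candidate lines.
lines = [c for c in combinations(range(1, 10), 3) if sum(c) == 15]

# No valid line contains two of {7, 8, 9}, as argued above.
assert all(len(set(c) & {7, 8, 9}) <= 1 for c in lines)

# (4, 5, 6) sums to 15 but contains none of 7, 8, 9; since each of the
# three rows must hold exactly one of them, the pigeon-hole argument
# rules it out as a row even though it appears among the 8 candidates.
assert (4, 5, 6) in lines and len(lines) == 8
```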

Aside: the logic used in the previous paragraph is not unlike constraint-based-programming. It's not really an optimization technique (though it might be considered an optimization on implementation time if not run time), but might be of interest to you as well (also note that magic squares and sudoku are frequently used to illustrate constraint-based programming).

Now you have a goal:

  1. Describe a solution state.
  2. Reach one of the known solution states with the fewest moves.

This is a fundamentally different approach than searching the various permutations until the problem is solved. I'd try to find a dynamic programming solution. For a slightly easier dynamic programming problem that moves from a start state to a goal state with incremental operations, take a look at the Levenshtein edit distance problem.
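
For reference, the Levenshtein distance itself has a compact dynamic-programming solution (a generic sketch, not tailored to the puzzle):

```python
def levenshtein(a, b):
    """Minimum number of single-character edits turning a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]
```

e.g. `levenshtein("kitten", "sitting")` is 3.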

ccoakley
  • @CollinJSimpson: Reading your comments on the other answers, I realize that you are far more knowledgeable than I assumed when I started typing my answer. My bad. – ccoakley Jan 06 '12 at 07:16
  • I assure you that I don't have the knowledge you think I do. I have limited experience with the various optimization techniques addressed here. The information in your post is extraordinarily helpful, and I thank you for the generous assistance. I believe I now have more than enough information to implement suitable optimizations. Thank you! – xikkub Jan 06 '12 at 07:24
  • I have a few questions. In order to avoid duplicate grid exploration, would I cache the results of previous function calls by storing entire grid states in a lookup table? Would it be a better idea to store hashed grids instead of copies? I don't believe it's possible for me to cache/associate my outputs with specific inputs as mentioned, as that requires a lookup table (parameters aren't static). I don't fully understand B&B, but how applicable is it to this particular puzzle? Also, because cells are being constantly jumbled around, would a Levenshtein approach be feasible? – xikkub Jan 06 '12 at 10:06
  • Well, what do you mean by a "hashed grid"? In your example, 123456789 is a 4-byte int that uniquely defines the grid simply by pulling the numbers in left-to-right reading order. If you want row/column + direction, you might encode your first operation as 12345678900, though that sucks because it is no longer guaranteed to fit in 4 bytes. – ccoakley Jan 06 '12 at 15:12
  • Branch and bound is more applicable to problems where solutions are built up incrementally than problems like your example. If you have a number puzzle where you place a set of numbers into a grid, you might use properties of the remaining numbers to bound your search. It's useful for NP-complete optimization problems, where your worst case is going to be exponential runtime anyway. – ccoakley Jan 06 '12 at 15:20
  • If you can define a distance from your goal via means of simple operations, a Levenshtein-like approach is probably feasible. I have a hard time visualizing permutation operations, though. You might be better off looking at some of the rubik's cube algorithms (which in general are more complicated, but at least include the idea of cycling rows). I know that rubik's cube can be solved with dynamic programming, but note that most efficient solutions are a bit more custom. – ccoakley Jan 06 '12 at 15:32
  • And one last bit: Don't expect optimizing your algorithms to be easy. It's a pain in the butt and frequently frustrating. My last puzzle program was a sudoku solver. The DFS was unacceptably slow, even when the moves were scored and placed in a priority queue. I developed a much more efficient algorithm. The algorithm was quite complex, and I screwed up the implementation. Instead of a correct-but-slow algorithm, I now have a very convoluted infinite loop. On the plus side, I now have a great time-sink when I want to procrastinate. – ccoakley Jan 06 '12 at 16:16
  • If I understand correctly, storing grid states as integers by pulling numbers in explicit left-to-right order demands a larger range of values in the integer datatype (max 987,654,321). By using a factorial number system, we can reduce the required range to 362,880 values. Because permutation numbers generated in this fashion are contiguous, a very small and efficient array can be allocated to store the grids. Using left-to-right order means that your values are disjoint, forcing you to choose between a bloated array or an inefficient hashtable, neither of which is optimal. – xikkub Jan 06 '12 at 22:19
  • @CollinJSimpson: Sorry, the purpose was just to make you think about your encoding for your data, not to suggest that my way was the best way. Again, you seem to have put a lot more thought into this than I assumed. – ccoakley Jan 06 '12 at 22:49

A few remarks in addition to ccoakley's nice answer and stubbscroll's comment, concerning the specific example and a few general principles.

Regarding stubbscroll's remark that this particular problem has only 9! = 362880 different states:
One (fairly easy) way to encode permutations as numbers is indexing the permutations by lexicographic ordering. For example

0 1 2 3  -->  0
0 1 3 2  -->  1
0 2 1 3  -->  2
...
1 0 2 3  -->  6
1 0 3 2  -->  7
...
3 1 2 0  --> 21
3 2 0 1  --> 22
3 2 1 0  --> 23

The trick is writing the index in factorial base,

n = a_1 * 1! + a_2 * 2! + a_3 * 3! + a_4 * 4! + ...

where 0 <= a_k <= k. If you have s symbols, the indices range from 0 to s!-1, so you have s-1 coefficients in the factorial-base expansion of n, (a_1,a_2,...,a_(s-1)). The permutation with index n is then found as follows

 for i = 1 to s-1
    the i-th symbol becomes the (a_(s-i)+1)-th smallest unused symbol
 the last symbol is the left over one

Since that's not particularly clear, an example. Say we look for the permutation with index 4231 of {1,2,3,4,5,6,7,8}. First we expand 4231 in factorial base

4231 = 1 + 2*2115 :  a_1 = 1
2115 = 0 + 3* 705 :  a_2 = 0
 705 = 1 + 4* 176 :  a_3 = 1
 176 = 1 + 5*  35 :  a_4 = 1
  35 = 5 + 6*   5 :  a_5 = 5
   5 = 5 + 7*   0 :  a_6 = 5

all further coefficients (here just a_7) are 0. It's easier to follow if we write the a_i in reverse order, (a_7,a_6,...,a_1), so

 coefficients      symbols       choice
0,5,5,1,1,0,1  1,2,3,4,5,6,7,8     1
 5,5,1,1,0,1    2,3,4,5,6,7,8      7
  5,1,1,0,1      2,3,4,5,6,8       8
   1,1,0,1        2,3,4,5,6        3
    1,0,1          2,4,5,6         4
     0,1            2,5,6          2
      1              5,6           6
      -               5            5

Result: 17834265.

Find the index of 246351:

symbols     count     perm    index(head)
1,2,3,4,5,6   6   2,4,6,3,5,1    1         a_5 = 1
 1,3,4,5,6    5    4,6,3,5,1     2         a_4 = 2
  1,3,5,6     4     6,3,5,1      3         a_3 = 3
   1,3,5      3      3,5,1       1         a_2 = 1
    1,5       2       5,1        1         a_1 = 1

The index is 1*5! + 2*4! + 3*3! + 1*2! + 1*1! = 189.

So now we have a fairly simple way of converting between permutations and their indices. The conversion isn't super fast (O(s^2)), but you get easy and fast comparison and lookup (have I seen the state before?). Whether it's a gain remains to be decided in each case.
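
These rank/unrank routines can be sketched in a few lines (a straightforward O(s^2) Python version; the Myrvold-Ruskey paper linked in the comments gives faster ones):

```python
from math import factorial

def rank(perm):
    """Lexicographic index of perm among all permutations of its symbols."""
    symbols = sorted(perm)
    r = 0
    for i, p in enumerate(perm):
        k = symbols.index(p)  # how many smaller symbols are still unused
        r += k * factorial(len(perm) - 1 - i)
        symbols.pop(k)
    return r

def unrank(r, symbols):
    """Permutation with lexicographic index r of the given symbols."""
    symbols = sorted(symbols)
    perm = []
    for i in range(len(symbols), 0, -1):
        f = factorial(i - 1)
        perm.append(symbols.pop(r // f))
        r %= f
    return perm
```

This reproduces the worked examples: `unrank(4231, range(1, 9))` gives `[1, 7, 8, 3, 4, 2, 6, 5]` and `rank([2, 4, 6, 3, 5, 1])` gives 189.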

Now, for the particular case at hand, we have some further restrictions reducing the search space.

  • Each move is a cyclic permutation of three elements, thus an even permutation.

Hence all combinations of such moves are also even permutations, meaning half of the possible states are unreachable. We are left with (at most) 9!/2 = 181440 reachable states. Indexing even permutations by lexicographic ordering is only slightly more complicated. The crucial point is that a permutation is even if and only if the sum of the coefficients a_k in the factorial-base expansion of its index is even.
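
This criterion is easy to verify by brute force for small cases (a sketch with illustrative helper names; `factorial_digits` returns the coefficients a_1, ..., a_(n-1) of an index):

```python
from itertools import permutations

def factorial_digits(r, n):
    """Coefficients a_1..a_(n-1) of r in factorial base: r = sum of a_k * k!."""
    digits = []
    for k in range(2, n + 1):
        r, a = divmod(r, k)
        digits.append(a)
    return digits

def is_even(perm):
    """A permutation is even iff its inversion count is even."""
    n = len(perm)
    inv = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
    return inv % 2 == 0

# itertools.permutations of sorted input yields lexicographic order,
# so the loop index is exactly the lexicographic rank.
for r, p in enumerate(permutations(range(4))):
    assert is_even(p) == (sum(factorial_digits(r, 4)) % 2 == 0)
```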

Reduce the search space using constraints and symmetries. If you're employing a search strategy using a structure with all possible states, this will reduce the memory requirements by a corresponding factor. If your search strategy only touches reachable states, the constraints don't reduce the number of steps, but they can still speed up the search due to the lower memory footprint. The use of symmetries can reduce the number of steps by identifying equivalent states.

In the example problem, we have the further nice situation that the 5 is already in the correct place, and an optimal solution never moves it. So we need only consider even permutations of 8 symbols, reducing the search space to 8!/2 = 20160 possible states. (Though that is not obvious.)

In general, however, it is difficult to prove that an optimal solution never leaves a particular subset of the possible states, so you can rarely impose such a restriction on your search directly.
But it is often the case that you can find a good solution of the problem using such a restriction and then use the good solution to prune early in your search for the optimal solution in the unrestricted search space.

A variant one can often use is finding an approximation by a greedy strategy and using this as a bound to prune early in an exhaustive search.
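
Putting the pieces together, a BFS for the example puzzle can be sketched as follows (a plain dictionary of visited states with parent pointers; to save memory one could key it by the permutation index from above instead of the raw tuple; helper names are illustrative):

```python
from collections import deque

def moves(grid):
    """Yield the 12 successor grids: each row and column cycled either way."""
    g = list(grid)
    for r in range(3):
        i = 3 * r
        row = g[i:i + 3]
        for shifted in ([row[2], row[0], row[1]], [row[1], row[2], row[0]]):
            h = list(g)
            h[i:i + 3] = shifted
            yield tuple(h)
    for c in range(3):
        col = [g[c], g[c + 3], g[c + 6]]
        for shifted in ([col[2], col[0], col[1]], [col[1], col[2], col[0]]):
            h = list(g)
            h[c], h[c + 3], h[c + 6] = shifted
            yield tuple(h)

def is_magic(g):
    """Every row, column, and diagonal of the flat 9-tuple sums to 15."""
    lines = [g[0:3], g[3:6], g[6:9], g[0::3], g[1::3], g[2::3],
             (g[0], g[4], g[8]), (g[2], g[4], g[6])]
    return all(sum(line) == 15 for line in lines)

def solve(start):
    """BFS from start; returns the grids along a shortest path to a magic square."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        g = queue.popleft()
        if is_magic(g):
            path = []
            while g is not None:
                path.append(g)
                g = parent[g]
            return path[::-1]
        for h in moves(g):
            if h not in parent:
                parent[h] = g
                queue.append(h)
    return None  # no magic square reachable
```

Since a magic square is reachable from the example start in only a few rotations, the BFS terminates quickly, well before touching all 9!/2 reachable states.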

Daniel Fischer
  • I just implemented this feature which works great, though your explanation was way over my head. I'm using a DFS which finds a solution in 17426 moves, though I'll convert it to a BFS now that I have state caching. Is there any method of moving from number state to state entirely without the use of a 3x3 grid? I currently require a physical grid to apply moves to, but applying row/column rotations to the permutation numbers mathematically seems more efficient. I might also try generating all possible magic square permutation numbers beforehand to avoid checking solutions algorithmically. – xikkub Jan 06 '12 at 21:53
  • I can't think of an easy way to transition between states without decoding them to the permutation. If you're making a lot of transitions, a transition table `num_states×num_transitions` could be worthwhile, but I think here it's faster to de- and encode for each move. Generating the magic squares beforehand might be a good idea, you can then try a meet-in-the-middle approach. – Daniel Fischer Jan 06 '12 at 23:33
  • Good idea. The DFS with caching worked almost instantly, so I shan't need many more improvements. – xikkub Jan 07 '12 at 02:11
  • Shoot. I didn't think about moves being even permutations and therefore half of the states being unreachable. On the plus side, since he's only shooting for reasonable speedup and not the optimal solution, it means he really only needs to consider a single solution and its reflection as end goals. Screw rotations. – ccoakley Jan 07 '12 at 02:55
  • Does this mean that I will never encounter an odd permutation number? If so, does this enable me to divide each permutation number by 2 to maintain number contiguity? – xikkub Jan 07 '12 at 03:02
  • All permutations can be written as a product (or composition) of transpositions (a transposition is a permutation that swaps two elements and leaves all others fixed). That product representation is not unique, but the parity of the number of factors is. A permutation is odd if it can be written as a product of an odd number of transpositions and it is even if it can be written as the product of an even number of transpositions. A cycle is an odd permutation if it rotates an even number of elements and even if it rotates an odd number of elements (yeah, confusing). – Daniel Fischer Jan 07 '12 at 03:14
  • @Collin The index of a permutation wrt lexicographic ordering doesn't tell you whether it's odd or even. However, of the two permutations with index 2k and 2k+1, one is always odd and one even, so you can divide the index by 2 to store them. To determine the permutation from the divided index, you then have to find the coefficient a_1 in the factorial base expansion, as mentioned above, to get an even permutation, the sum of the coefficients a_k must be even. – Daniel Fischer Jan 07 '12 at 03:22
  • I've just read up on even/odd permutations. Do you mean to say that the two permutations with indexes 2k and 2k+1 are identical in terms of how they are stored? If I have evaluated the permutation with index 2k, does that mean I can ignore the one with index 2k+1 in future evaluations? – xikkub Jan 07 '12 at 04:05
  • No, here odd permutations are unreachable, so we needn't store them at all. – Daniel Fischer Jan 07 '12 at 04:16
  • The concept is still somewhat fuzzy to me. I allocate an array of size 9! to keep track of grid states. However, I only require half that size if odd permutations are to be excluded. Is it possible for me to reduce the size of my array to 9!/2 and maintain functionality? – xikkub Jan 07 '12 at 04:30
  • Yes, that's the point. Of course, 9! isn't so large, the few 100 KB wasted would hardly hurt and simplify indexing etc. – Daniel Fischer Jan 07 '12 at 04:34
  • I'll skip the indexing optimization at your recommendation. Regarding your greedy strategy, wouldn't localized analysis be minimally effective due to grid state variability? I would expect factoring in the move weights of all cells to balance each other out and yield no useful decisions. – xikkub Jan 07 '12 at 04:43

If the question is how to use row and column rotations to generate a 3x3 magic square, you should probably start with known methods for generating a 3x3 magic square. You can also simply eliminate certain classes of rotations, such as those that rotate the center row or column.

In fact, there is a solution that will only require 4 rotations.

In cases where a DFS or BFS results in an exponential search space, you usually get a big win by exploiting the structure of the problem. In the magic square case, it's knowing that you can't rotate the middle row or column to get a valid answer.

MSN

Try using A*. You can use a heuristic of the Manhattan distance from where the numbers are to where they should be. I'm assuming you already have a magic square in mind as the goal state.

emschorsch
  • I have no specific magic square as the goal state, which might affect an A* algorithm. Wouldn't Manhattan distances be difficult to utilize on such a small grid and with such movement restrictions (including numbers looping around borders)? Tiles cannot be moved individually, and cells in favorable positions must often be moved to accommodate other cells because of this. – xikkub Jan 06 '12 at 06:42

There are no general answers to this overall question. There are specific answers to specific cases -- but there are also specific cases where it can be mathematically proved that there is no answer that's significantly better than what the brute-force algorithm requires.

There are also many cases where the question of finding the best algorithm is an active research problem, and very smart people are working on it with little success.

You might find reading about "NP completeness" interesting, as it's just a small corner of this problem, but a well-studied one.

Brooks Moses
  • I understand the possible NP completeness of my question, but my algorithm in particular can obviously be optimized, regardless of the puzzle's computational complexity. Understanding NP completeness might be more effective for significantly larger grids. I'm glad to hear such a problem isn't trivial, though! – xikkub Jan 06 '12 at 06:27
  • In a depth-first search of this puzzle, it's possible to encounter repeated loops (for example, endlessly cycling the top row to no effect). I know these cases can be pruned, but the only way I know how is to make copies of previous grid states to avoid processing the same grid state multiple times, which requires loads of memory. Alternatively, might it be reasonable to design a breadth-first search to keep track of the explored moves and test them in order from the initial grid setup every time? It's computationally-intensive but it avoids memory storage issues. – xikkub Jan 06 '12 at 07:00