7

What would be a smart data structure to use to represent a Sudoku puzzle? I.e. a 9X9 square where each "cell" contains either a number or a blank.

Special considerations include:

  • Ability to compare across row, column, and in 3X3 "group
  • Ease of implementation (specifically in Python)
  • Efficiency (not paramount)

I suppose in a pinch, a 2D array might work but that seems to be a less than elegant solution. I just would like to know if there's a better data structure.

Rafe Kettler
  • 75,757
  • 21
  • 156
  • 151

4 Answers4

12

Actually, I built such a beast, both a solver and a generator, and I used a 2D array. It worked fine.

You just had to understand the indexes and where they were and that wasn't too difficult to master.

The relative relationships between cells in a row doesn't change depending on the column, same goes for cells in a column, or even cells in a mini-square.

Sometimes, a less "elegant" solution is just fine. Indeed, sometimes, it's preferable :-)


For what it's worth, you may be interested in the algorithms that I used for the solver/generator.

First I wrote the solver part which would first set all cells as being able to be any value then apply all the rules in sequence to see if a individual cell could be solved or otherwise limited, things like:

  • if the cell was a specific value in the clues, set it to that value.
  • if there's only one cell left in a row (or column or mini-square), you can set it to the remaining value.
  • if a cell is marked as being possibly N and N exists in its row/column/mini-square elsewhere, remove that possibility.
  • if there are two cells in the row/column/mini-square and they have the same two possibilities (and no other possibilities), all other cells in that row/column/mini-square should have that possibility removed.

And so on, adding each rule that I use in solving the real puzzles.

For the generator, I started with:

123 456 789
456 789 123
789 123 456

234 567 891
567 891 234
891 234 567

345 678 912
678 912 345
912 345 678

and then, in a loop of varying size (at least 500), proceeded to swap rows and columns in such a way that it would never produce an invalid puzzle. In other words, swap rows or columns with the group they're in (for example, rows 1, 2 and 3 are a group, so are columns 4, 5 and 6).

This shuffled up the cells well enough to produce a decent puzzle.

Then, I started choosing random cells and setting them as unknown. Once a cell was set as unknown, I would pass the whole puzzle into the solver. If it was solvable, I would continue, otherwise I would re-instate the cell and carry on.

This prevented me getting a puzzle that was logically unsolvable.

Once a large number of random cell removals had been done, I would try to remove all the remaining cells in order using the same method. What was left then was the minimum amount of information required to solve the puzzle.

And, so it wasn't a pain to Sudoku beginners, I would allow them to specify a lower difficulty level which would put a certain number of the unnecessary cells back in.

Not a bad scheme, there may be better ones but that one worked fine for me.

Now, if I could only figure out this Kakuro stuff, I could die happy :-)

paxdiablo
  • 854,327
  • 234
  • 1,573
  • 1,953
  • 1
    To the exent you need special checks (say, in a 3x3 group), you can code one against the 2D array just fine. – Ira Baxter Nov 01 '10 at 01:37
9

Read Peter Norvig's essay Solving Every Sudoku Puzzle. You're unlikely to find a more elegant solution and you'll probably learn some new things about data structures, Python, and performance analysis in the process.

Ned Deily
  • 83,389
  • 16
  • 128
  • 151
  • Wow ... just plain wow. I'm not particularly a fan of Sudoku, but I think I need to meditate on Norvig's write-up. I like the fact that he did this to try and cure his wife of her addiction. :-) – Peter Rowell Nov 01 '10 at 05:35
2

Others have reasonably suggested simply using a 2D array.

I note that a 2D array in most language implementations (anything in which that is implemented as "array of array of X" suffers from additional access time overhead (one access to the top level array, a second to the subarray).

I suggest you implement the data structure abstractly as a 2D array (perhaps even continuing to use 2 indexes), but implement the array as single block of 81 cells, indexed classically by i*9+j. This gives you conceptual clarity, and somewhat more efficient implementation, by avoiding that second memory access.

You should be able to hide the 1D array access behind setters and getters that take 2D indexes. If your language has the capability (dunno if this is true for Python), such small methods can be inlined for additional speed.

Ira Baxter
  • 93,541
  • 22
  • 172
  • 341
0

Python doesn't have much in the way of data structures. Your best bet is probably just a regular 2D array or to build your own using classes.

You can read more about python data types here.

Joshkunz
  • 5,575
  • 6
  • 28
  • 24