0

So I am trying to figure out why filling sudoku board in order of how I generate fields is many times faster then filling it which shuffled order, and I can't come to any reasonable conclusions.

Speed of filling and checking should be about the same no matter if fields are shuffled or not, only noticeable difference is the fact that filling boxes in a shuffled order causes the main loop to easily go for 60.000.000+ iterations not sure how many on average as I am not patient enough to check. And filling them in order of how they were generated gets finished in usually < 1.000 iterations rarely going over it and basically never going over 5.000.

I do understand that code doesn't follow C# standards too strictly but it's a project I am supposed to make and it's not my main language there are probably some possible optimizations but it's not what I am interested in all i want to know why shuffling the order of fill the boxes causes the process to take this much longer.

Edit: forgot to attach order of creating/filling boxes order of filling/creating boxes:

img

Edit2: more concise rephrased question:

The question is why generating an valid fully filled sudoku board in order from image creates a lot less invalid boards from which there is required backtracking, as compared to generating a board in a random order of choosing fields?

example of random order

Here is the code used for the project https://github.com/Piterm21/sudoku

And here are all the parts used in generation.

filledBox class:

public class filledBox
{
    public int textBoxIndex;
    public int value;
    public int nextIndexToTry;
    public int[] values;
    public int lineIndex;
    public int columnIndex;
    public int groupIndex;
}

Here is a main loop:

filledBox[] boxes = this.shuffleBoxesCreateCheckingLists();

long iter = 0;
int nextIndexToFill = 0;
while (nextIndexToFill != 81) {
    for (; boxes[nextIndexToFill].nextIndexToTry < 9; boxes[nextIndexToFill].nextIndexToTry++) {
        // check if is valid
        if (!createsError(boxes[nextIndexToFill])) {
            boxes[nextIndexToFill].value = boxes[nextIndexToFill].values[boxes[nextIndexToFill].nextIndexToTry];
            nextIndexToFill++;
            break;
        }

        if (boxes[nextIndexToFill].nextIndexToTry == 8) {
            boxes[nextIndexToFill].nextIndexToTry = 0;
            boxes[nextIndexToFill].value = 0;
            nextIndexToFill--;
            boxes[nextIndexToFill].nextIndexToTry++;

            while (boxes[nextIndexToFill].nextIndexToTry == 9) {
                boxes[nextIndexToFill].nextIndexToTry = 0;
                boxes[nextIndexToFill].value = 0;
                nextIndexToFill--;
                boxes[nextIndexToFill].nextIndexToTry++;
            }

            break;
        }
    }

    iter++;
}

System.Diagnostics.Debug.WriteLine(iter);

Generation of boxes with setting of lists used for checking for errors:

List<filledBox>[] generationLines;
List<filledBox>[] generationColumns;
List<filledBox>[] generationGroups;

public filledBox[] shuffleBoxesCreateCheckingLists()
{
    filledBox[] boxes = new filledBox[81];

    for (int i = 0; i < 81; i++) {
        boxes[i] = new filledBox();
        boxes[i].values = new int[9]{ 1, 2, 3, 4, 5, 6, 7, 8, 9 };
        boxes[i].values = shuffle(boxes[i].values, true);
    }

    generationLines = new List<filledBox>[9];
    generationColumns = new List<filledBox>[9];
    generationGroups = new List<filledBox>[9];

    for (int i = 0; i < 9; i++) {
        generationLines[i] = new List<filledBox>();
        generationColumns[i] = new List<filledBox>();
        generationGroups[i] = new List<filledBox>();
    }

    int boxIndex = 0;
    for (int y = 0; y < 3; y++) {
        for (int x = 0; x < 3; x++) {
            for (int innerY = 0; innerY < 3; innerY++) {
                for (int innerX = 0; innerX < 3; innerX++) {
                    int subPanelIndex = x + y * 3;
                    int lineIndex = innerY + y * 3;
                    int columnIndex = innerX + x * 3;

                    boxes[boxIndex].textBoxIndex = boxIndex;
                    boxes[boxIndex].groupIndex = subPanelIndex;
                    boxes[boxIndex].columnIndex = columnIndex;
                    boxes[boxIndex].lineIndex = lineIndex;
                    boxes[boxIndex].nextIndexToTry = 0;
                    boxes[boxIndex].value = 0;

                    generationLines[lineIndex].Add(boxes[boxIndex]);
                    generationColumns[columnIndex].Add(boxes[boxIndex]);
                    generationGroups[subPanelIndex].Add(boxes[boxIndex]);
                    boxIndex++;
                }
            }
        }
    }

#if !fast
    boxes = shuffle(boxes);
#endif

    return boxes;
}

Shuffling code:

public T[] shuffle<T> (T[] array)
{
    int i = array.Length;

    while (i > 1) {
        i--;

        int j = rnd.Next(0, i - 1);
        T temp = array[i];
        array[i] = array[j];
        array[j] = temp;
    }

    return array;
}

And code responsible for checking for errors:

    public bool hasError (List<filledBox> list, filledBox box)
    {
        bool result = false;

        foreach (filledBox filledBoxToCheck in list) {
            if (filledBoxToCheck.value == box.values[box.nextIndexToTry]) {
                if (box.lineIndex != filledBoxToCheck.lineIndex ||
                    box.columnIndex != filledBoxToCheck.columnIndex) {
                    result = true;
                    break;
                }
            }
        }

        return result;
    }

    public bool createsError (filledBox box)
    {
        bool result = false;

        if (hasError(generationGroups[box.groupIndex], box)) {
            result = true;
        } else if (hasError(generationLines[box.lineIndex], box)) {
            result = true;
        } else if (hasError(generationColumns[box.columnIndex], box)) {
            result = true;
        }

        return result;
    }
piterm21
  • 11
  • 1
  • what is the question? What do you mean by filling board? (solving perhaps?) and with what method ... – Spektre Jan 16 '18 at 16:10
  • Question is why does solving the board in order presented in attached image is much faster then solving it in random order? Yes by filling I mean creating valid fully solved sudoku board. And as to the method I believe it's the most common method of creating solved sudoku board by trying next not checked value for given field and backtracking if there is none left. I added the image as I forgot it the first time. – piterm21 Jan 16 '18 at 16:16
  • because Sequentioal memory access is mostly faster than random access on common HW in general. Also see [Why is copying a shuffled list much slower?](https://stackoverflow.com/a/42108043/2521214) not the QA I was looking for to link but still looks relevant This is the one [Why is it faster to process a sorted array than an unsorted array?](https://stackoverflow.com/q/11227809/2521214) – Spektre Jan 16 '18 at 16:55
  • It's not an issue with memory access as I don't measure speed in real time I measure speed by comparing amount of iterations of main loop needed to generate fully filled valid board. Only actions preformed in a main loop are checking validity of new value setting the value and backtracking if needed. Maybe I have phrased the question wrong let's try again. The question is why generating an valid fully filled sudoku board in order from image creates a lot less invalid boards from which there is required backtracking, as compared to generating a board in a random order of choosing fields? – piterm21 Jan 16 '18 at 17:24

1 Answers1

0

The reason the shuffled order of visiting boxes is worse is that, for the first few boxes, you've increased the odds of filling them in independently. I'm calling a 3x3 region a box. You could visit each little square randomly, but I'll confine my analysis to fillng in one 3x3 box at a time. It should illustrate the problem.

Lets assume, whether you used the in-order or random method, you filled in the top left box first.

VISITING BOXES IN ORDER: Now consider filling in the box next to it, that is, the top middle box. Just look at filling in the top row of that box, there are 6x5x4 = 120 ways to fill it in.

VISITING BOXES RANDOMLY: Now consider instead choosing any box to fill in next. If you happen to choose a box that is not in the top row or left column, filling it in is unaffected by the way the first box was filled. The top row of that box can be filled in 9x8x7 = 504 ways.

Taking this further, if you fill in the boxes sequentially, say, across the top, the third box (the top right one), begins with only 3x2x1 = 6 ways to fill in the top row of it. Whereas, if you select the next box randomly, you might, at worst, select a box that is not in the same row or column of either of the first 2 boxes, which means that the top row of that box has yet again 9x8x7 = 504 ways of being filled in.

If you tried randomly selecting the order for each little square to be filled in, the first few might all be in different rows and columns. At worst, none of the first 9 squares will align meaning there will be 9^9 = 387,420,489 options. But filling them in across the top row, the choices are guaranteed to be 9! = 362,880.

This illustrates one of the strategies for how to approach backtracking. Ideally, you first solve parts of the problem that most tightly constrain the rest of the problem. For some problems sequential is better, and for some random is better.

davidg
  • 21
  • 2