0

I’m trying to teach myself minimax in javscript from the code here: https://github.com/beaucarnes/fcc-project-tutorials/blob/master/tictactoe/7/script.js

And video: https://youtu.be/P2TcQ3h0ipQ?t=2334

This is the function:

function minimax(newBoard, player) {
    var availSpots = emptySquares();

    if (checkWin(newBoard, huPlayer)) {
        return {score: -10};
    } else if (checkWin(newBoard, aiPlayer)) {
        return {score: 10};
    } else if (availSpots.length === 0) {
        return {score: 0};
    }
    var moves = [];
    for (var i = 0; i < availSpots.length; i++) {
        var move = {};
        move.index = newBoard[availSpots[i]];
        newBoard[availSpots[i]] = player;

        if (player == aiPlayer) {
            var result = minimax(newBoard, huPlayer);
            move.score = result.score;
        } else {
            var result = minimax(newBoard, aiPlayer);
            move.score = result.score;
        }

        newBoard[availSpots[i]] = move.index;
        moves.push(move);
    }

    var bestMove;
    if(player === aiPlayer) {
        var bestScore = -10000;
        for(var i = 0; i < moves.length; i++) {
            if (moves[i].score > bestScore) {
                bestScore = moves[i].score;
                bestMove = i;
            }
        }
    } else {
        var bestScore = 10000;
        for(var i = 0; i < moves.length; i++) {
            if (moves[i].score < bestScore) {
                bestScore = moves[i].score;
                bestMove = i;
            }
        }
    }

    return moves[bestMove];
}

I think I understand most of it, but there are a few gaps preventing me from getting my mind around it completely.

As I understand it, minimax(newBoard, player) starts by getting the available spots from which to play a move, and establishing a way of ranking the end results.

Then it creates an array moves into which objects called move will go. The for loop gets a move object for each available spot.

Each move object gets a property called index through move.index = newBoard[availSpots[i]];

  • Does newBoard[availSpots[i]] simply represent the indexes of the available sorts. Is it right to say that in a board that has available spots [4, 5, 6], newBoard[availSpots[0]] is 4 and thus the first move object will have an index property of 4?

    The new part of code is newBoard[availSpots[i]] = player -- does that mean the player's icon is marked in newBoard[4]?

After that, there is an if else statement that adds a .score property to the move object.

  • But then I see newBoard[availSpots[i]] = move.index, which reverses what we did earlier -- why is this?

Then the latest move is pushed into the moves array, and from the array, we loop through the scores to find the best move of moves.

I'm having a hard time seeing how this all works. I tried putting in a console.log and my repl.it fails...

  • is it because dozens of permutations are being tried by the compiler and it would be ugly to have them all logged? How many moves does the computer have to try?

And finally:

  • Since this is a turn-based game, where in the code is the computer "playing" the other side in order to get a terminal value?

I've gone through a ton of minimax resources online so I'm hoping someone can help -- they all seem to gloss over this. I've looked at:

https://www.freecodecamp.org/news/how-to-make-your-tic-tac-toe-game-unbeatable-by-using-the-minimax-algorithm-9d690bad4b37/

https://www.geeksforgeeks.org/minimax-algorithm-in-game-theory-set-1-introduction/

https://youtu.be/trKjYdBASyQ

https://youtu.be/ovr2sTYhb1I

https://learnersbucket.com/tutorials/js-projects/tic-tac-toe-game-in-javascript-with-bot/

https://steveafrost.com/articles/discovering-the-minimax-algorithm/

Thanks!

Scott Sauyet
  • 49,207
  • 4
  • 49
  • 103
acchang
  • 59
  • 7

2 Answers2

1

Let's start with your last question.

Since this is a turn-based game, where in the code is the computer "playing" the other side in order to get a terminal value?

It's not, not in this code. This function has one purpose, to choose the best move using the minimax algorithm. It does so by manipulating a board object, setting values and resetting them, finding the score of the position. Other code must handle the IO. So it's important to keep in mind that this code is just part of an overall tic-tac-toe system.

This understanding might help clear up some of the other questions.

Is it right to say that in a board that has available spots [4, 5, 6], newBoard[availSpots[0]] is 4 and thus the first move object will have an index property of 4?

While you can think of it that way, there's nothing in this code to describe how the board is represented. So, while it might be squares 1 - 9, it doesn't have to be. they could be a1, a2, a3, b1, b2, b3, c1, c2, c3. Or there could be another representation altogether. What we do know is that the the board has some properties we can reference with [] (numbers, strings, possibly even Symbols) and that availSpots is an array of these values. Clearly it represents the ones that are available.

The new part of code is newBoard[availSpots[i]] = player -- does that mean the player's icon is marked in newBoard[4]?

It means that the internal data structure of the board now identifies that player with the given cell. Again, this code is only for choosing the best move. But it knows nothing about icons or the visible representation of the board. And note that this state of the board is transitory; the manipulation of the board is meant only to help calculate the best move. Other code would actually apply the move it calculates to the board in progress.

After that, there is an if else statement that adds a .score property to the move object.

But then I see newBoard[availSpots[i]] = move.index, which reverses what we did earlier -- why is this?

It is testing the board, available move by available move, to find the best one. It does this by making a move, calculating the resulting score and then resetting that move so it can try a different one. In calculating the move, we might recursively call back to minmax, which will then try its moves sequentially, and this could extend as many as nine layers deep, one for each cell on the board.

Thus if the current board looks like

X O O
4 5 6
X O X

We would get the following analysis:

         max    min    max
X O O
4 5 6
X O X
  |     X O O
  +-->  X 5 6  (score 10)
  |     X O X
  |
  |     X O O
  +-->  4 X 6  (score 10)
  |     X O X
  |
  |     X O O
  +-->  4 5 X
        X O X
          |    X O O
          +--> O 5 X
          |    X O X
          |      |    X O O
          |      +--> O X O  (score 10)
          |           X O X
          |     
          +--> X O O
               4 O X  (score -10)
               X O X 

We try available move 4 for X, find that its score would be +10, then reset 4 to its default, so that we can try 5 for X, scored also as 10. Again we reset 5 to its default. Then we try 6 for X. To score that, we have to go deeper, first trying 4 for O. This requires us to go deeper still and we set 5 to X. That has value +10. We reset that, reset the 4 and try again with 5 for O, which has a score of -10. Following minimax, we can find that the value of X O O / 4 5 X / X O X is -10, and we'd already seen that both X O O / X 5 6 / X O X and X O O / 4 X 6 / X O X have score of +10, so we'd choose one of those. (By this algorithm, the first one, but a more interesting algorithm might choose randomly among the equally good moves.)

I tried putting in a console.log and my repl.it fails...

is it because dozens of permutations are being tried by the compiler and it would be ugly to have them all logged? How many moves does the computer have to try?

We'd have to see what you did to test this, but no, this game is simple enough that you should never be running out of any resources in these calculations. There are fewer than 9! -- which is 362880 -- total games. So I'm guessing you weren't logging correctly.

Scott Sauyet
  • 49,207
  • 4
  • 49
  • 103
  • Wow, thanks for the detailed response! I know my question was pretty involved, so I'm grateful for any response at all, but especially I'm so for some thing as thoughtful as what you wrote. You helped me clear up a few things, on the turn and the way the algorithm works. But I'm still having a little trouble envisioning the mechanics of it all, which I put in another answer this post. Can you spare some thoughts on that as well? – acchang Jan 29 '21 at 23:11
0

In this part of the code:

    var moves = [];
    for (var i = 0; i < availSpots.length; i++) {
        var move = {};
        move.index = newBoard[availSpots[i]];
        newBoard[availSpots[i]] = player;

        if (player == aiPlayer) {
            var result = minimax(newBoard, huPlayer);
            move.score = result.score;
        } else {
            var result = minimax(newBoard, aiPlayer);
            move.score = result.score;
        }

        newBoard[availSpots[i]] = move.index;
        moves.push(move);
    }

To follow from the great map you drew out:

We get an array called moves and then starting from the first entry in availSpots which is [4,5,6], we create an object called move

move.index = newBoard[availSpots[i]]; means the index property of move is 4.

So I can envision this: var move = {index:4}

And then we have: newBoard[availSpots[i]] = player

Which means newBoard = X, O, O, X, 5, 6, X, O, X

Then following the if statement, we get var move = {index:4, score:10};

And finally, newBoard[availSpots[i]] = move.index means going back to original state:

newBoard = X, O, O, 4, 5, 6, X, O, X

And upon pushing moves.push(move):

We get moves = [move {index:4 score:10}};]

As i = 1 in the for loop, we wind up with moves = [move {index:4 score:10}}, move {index:5 score:10}};]

But my understanding falls apart here. If i = 2, var move = {index:6}

Then, newBoard = X, O, O, 4, 5, X, X, O, X

However, var move = {index:6} does not yield a terminal value, so what is the code doing from this point forward?

The code tells us:

if (player == aiPlayer) {
            var result = minimax(newBoard, huPlayer);
            move.score = result.score;
        } else {
            var result = minimax(newBoard, aiPlayer);
            move.score = result.score;
        }

But there is no score at this point as there is no terminal value. I know the else statement must be executed because newBoard is not going to original state newBoard = X, O, O, 4, 5, 6, X, O, X yet, this will not be executed yet: newBoard[availSpots[i]] = move.index

So minimax is called on newBoard = X, O, O, 4, 5, X, X, O, X, and there is a new availSpots, which is [4,5]. This time player is O because elsewhere in the code, player changes after the board has been checked, so newBoard = X, O, O, O, 5, X, X, O, X.

There is no terminal value for newBoard = X, O, O, O, 5, X, X, O, X either, so the process begins again at the if statement.

  • Is it fair to say only when there is a score, can the line after minimax be executed, move.score = result.score and the code can continue to push the move to the array?

There's a lot of unsaid interleaving going on here which makes it hard for me to get my mind around things.

At the initial point, when the board had 3 available spaces, not only do two spots (4 and 5) would have a terminal value, but to get the terminal value for spot 6, the code would have to calculate the best choice in an array of 2 available spaces.

The longer it takes to reach a terminal value, the greater the number of paths. And before you get to the optimal move for you, the code has to calculate the optimal move for all the other choices leading up to it.

  • After the first move, I guess the code has to go through 8! or 40320 games and I guess it would go faster the closer to the end we get? Where would be a good place to put a console.log to see what's going on under the hood?

  • There's a lot that goes unsaid in the code. For example, that if minimax() fails to return a score, it keeps doing so for the other side, and that there are multiple sets of the moves array being generated for every available space before the terminal value is found. How does someone think all of this through, when console.log-ing would be overwhelming?

I built my tic-tac-toe board on my own, so I was hoping by learning the thought process behind this minimax(), I could make some tweaks and applying it to my own, but it's harder than it looks.

  • Finally, in the calculation of var bestMove here the latest moves[i].score replaces the previous best score, is it safe to assume the code would play the first optimal move that it encounters, and never one in the middle, because those moves might equal, but never fulfill if (moves[i].score > bestScore) {bestScore = moves[i].score}?
acchang
  • 59
  • 7
  • I don't know if it will help, but I find useful the analogy between recursion and mathematical induction. I described it in [another answer](https://stackoverflow.com/a/65602041/1243641). Also, a debugger -- especially with conditional breakpoints -- would probably be much easier than `console.log`. – Scott Sauyet Jan 30 '21 at 01:11
  • You should also be able to debug this function with a sample board that already has many moves applied. My example in a separate answer only tests a total of seven boards, so if you can build that board data-structure with the given moves already supplied, you can probably use multiple logging calls without any issues. – Scott Sauyet Jan 30 '21 at 01:14
  • Thanks! I'm still struggling with recursion but I think I worked out another way into this minimax function. I can evaluate simply, choosing between two options at the end of the game, but when the answer takes several levels, I fall apart. Is there a debugger you would recommend? I only know how to use `console.log` – acchang Feb 01 '21 at 20:10
  • As far as a sample board goes, indeed, good idea! Prior to working on the minimax, I had built the tic-tac-toe game with an option to play against another player or completely random opponent, so I was able to input a board until the last 3 squares as a human-to-human game and then `console.log` around the last three moves to get a better idea of what's going on. It still gets very slow with the initial choice of nine boxes though! – acchang Feb 01 '21 at 20:12
  • I'm surprised it's overly slow. But I haven't tested or really thought it through deeply, Of course, you can always randomly make the first move (it's better user experience anyway), and that should cut the time significantly. – Scott Sauyet Feb 01 '21 at 20:20