Othello/Reversi minimax algorithm glitch

Question

I've been working recently on a MiniMax algorithm for a standard 8x8 Othello (Reversi) game on Android in Java. Every log seems to be showing correct values for every node, and yet somehow the algorithm just doesn't choose the optimum move (I compared the behavior with someone else's application, and at some point they differ).

There is quite a bit of code to swallow, I'm afraid, but I believe you can do this!

First of all, my Algorithm class (from that class I inherit the MiniMax algorithm) :

public abstract class Algorithm 
{
/** game instance to which the strategy is applied */
Game game;

public Algorithm (Game game)
{
    this.game = game;
}

/**Finds player's all possible moves for the current state   * 
 * @param gb - a game board for which the moves are found
 * @param player - current player
 * @return - an array list of possible fields to move
 */
public ArrayList<Field> findAllPossibleMoves(GameBoard gb, boolean player)
{
    ArrayList <Field> moves = new ArrayList <Field> ();
    for (int y= 0; y<gb.getHeight();y++) for (int x= 0; x<gb.getWidth();x++)
    {
        Field f = new Field (x,y);
        if (game.isAvailable(gb, f, player)) moves.add(f);          
    }
    return moves;
}

/** Searches for the right move 
 * @param gb - game board to do the search on
 * @param depth maximum depth of search tree
 * @param player - current player
 * @return - optimum move (Field)
 */
public abstract Field findBestMove(GameBoard gb, int depth, boolean player);

/**Evaluates the player's current state  * 
 * @param gb - game board to be evaluated
 * @param player - player for which we evaluate
 * @return - the current rating (int)
 */ 
public int evaluate(GameBoard gb) 
{
    int rating = 0;     
    int[][] eval_table = {
            {99,  -8,  8,  6,  6,  8,  -8, 99},
            {-8, -24, -4, -3, -3, -4, -24, -8},
            { 8,  -4,  7,  4,  4,  7,  -4,  8},
            { 6,  -3,  4,  0,  0,  4,  -3,  6},
            { 6,  -3,  4,  0,  0,  4,  -3,  6},
            { 8,  -4,  7,  4,  4,  7,  -4,  8},
            {-8, -24, -4, -3, -3, -4, -24, -8},
            {99,  -8,  8,  6,  6,  8,  -8, 99}
    };
    for (int x=0; x<8; x++) for (int y=0;y<8;y++)
    {
        if(gb.board[x][y] == GameBoard.WHITE)
        {
            //rating+=eval_table[x][y];
            rating++;
        }
    }
    return rating;
}
}

The findBestMove function is of course overriden in the child MiniMax class:

@Override
public Field findBestMove(GameBoard gb, int depth, boolean player) 
{   
    /** maximum depth of search reached, we stop */
    if(depth >= max_depth) return null;

    //player = (depth+1)%2 + 1;

    /** getting a list of moves to chose from */
    ArrayList <Field> moves = findAllPossibleMoves(gb, player); 

    Field best_move = null;

    /** iterating over all possible moves, to find the best one */      
    for (int i=0; i<moves.size(); i++)
    {
        /** board to simulate moves */
        GameBoard temp_board = new GameBoard(gb);
        /** getting the current move */
        Field move = moves.get(i);      
        /** simulating the move for the current node */
        game.move(move, temp_board, player);
        Log.i("board", "Depth:"+depth+" Player:"+player+" Move:"+i+" Rating:"+evaluate(temp_board));
        Log.i("board", ""+moves.get(i).getX()+","+moves.get(i).getY());         
        temp_board.printBoard();
        /** getting to the next inferior node */            
        Field best_deep_move = findBestMove (temp_board, depth + 1, !player);           

        /** if the maximum depth is reached, we have a null, so we evaluate */
        if (best_deep_move == null) 
        {
            move.setRating(evaluate (temp_board));
        }
        /** if we are not the deepest possible, we get the rating from the lower node */
        else 
        {
            move.setRating(best_deep_move.getRating());         
            Log.i("eval", ""+best_deep_move.getRating());
        }           
        if(best_move == null) 
        {
            best_move = move;           
        }

        else
        {   
            Log.i("update", "Current move rating:"+move.getRating());
            Log.i("update", "New move rating:"+best_move.getRating());
            if (depth%2==0)
            {
                Log.i("update", "MAX player");
                /** for us, we look for the maximum */
                if (best_move.getRating() < move.getRating()) 
                    {


                    best_move = move;

                    }

            }
            else
            {
                Log.i("update", "MIN player");
                /** for the opponent, we look for the minimum */
                if (best_move.getRating() > move.getRating())
                { 


                    best_move = move;

                }
            }
            Log.i("update", "Updated move rating"+best_move.getRating());
        }
    }

    return best_move;
}

The algorithm is instantiated in my current Game instance:

public class Game {

public GameBoard game_board;
public Algorithm alg;

boolean active_player;
//int other_player;
int white_count;
int black_count;
int depth;

public Game()
{
    game_board = new GameBoard(8, 8);
    active_player = true;
    //other_player = GameBoard.WHITE;
    white_count = 0;
    black_count = 0;


    // AI strategy
    alg = new MiniMaxAlgorithm(this);
}


public boolean getActive_player() {
    return active_player;

}


public void setActive_player(boolean active_player) {
    this.active_player = active_player;
}

//public boolean getOther_player() {
    //return other_player;
//}


//public void setOther_player(boolean other_player) {
    //this.other_player = other_player;
//  }



public int getWhite_count() {
    return white_count;
}


public void setWhite_count(int white_count) {
    this.white_count = white_count;
}


public int getBlack_count() {
    return black_count;
}


public void setBlack_count(int black_count) {
    this.black_count = black_count;
}

// array of 8 possible directions
public final int[][] directions = new int [][]{
        {-1, 0},
        {-1, 1},
        {0, 1},
        {1, 1},
        {1, 0},
        {1, -1},
        {0, -1},
        {-1, -1}            
};


public int playerToPiece (boolean player)
{
    return player == true? GameBoard.BLACK : GameBoard.WHITE;
}

/** Function checking if a given field is available to move
 * 
 * @param f - field to be checked
 * @return - whether or not the field is available (boolean)
 */

public boolean isAvailable (GameBoard gb, Field f, boolean player)
{
    if(gb.board[f.x][f.y] != GameBoard.EMPTY) return false;
    /** iterating over all 8 possible directions */
    for (int k=0; k<directions.length; k++)
    {
        int dx = directions[k][0];
        int dy = directions[k][1];          
        /** scanning through all fields in line with the current direction */           
        for (int i = f.x + dx, j = f.y + dy; 
                i>=0 && i<gb.getWidth() && j>=0 && j<gb.getHeight() ; i+=dx, j+=dy)
        {
            if(i == f.x + dx && j == f.y + dy && gb.board[i][j] == playerToPiece(player)) break;
            if(gb.board[i][j] == GameBoard.EMPTY) break;
            else if(gb.board[i][j] == playerToPiece(player)) return true;
        }
    }
    return false;
}

/** Function performing a move
 * 
 * @param f - destination field of the move
 * @param board - board to do the move on
 * @param player - player that performs the move
 */

public void move (Field f, GameBoard board, boolean player)
{

    if (f == null) return;
    if (isAvailable(board, f, player))
    {
        board.board[f.x][f.y] = playerToPiece(player); 
        /** iterating over all 8 possible directions */
        for (int k=0;k < directions.length; k++)
        {
            int dx = directions[k][0];
            int dy = directions[k][1];
            /** scanning through all fields in line with the current direction */               
            for (int i = f.x + dx, j = f.y + dy; 
                    i>=0 && j>=0 && i<board.getWidth() && j<board.getHeight() ; i+=dx, j+=dy)
            {
                /** if there is an empty field, there can be no line, so we move on to the next dir */
                if(board.board[i][j] == GameBoard.EMPTY) break;
                /** if the field is the same as active, we fill all the fields up to that one with 
                 * that color  */
                else if(board.board[i][j] == playerToPiece(player))
                {
                    for (int a = f.x, b = f.y; a*dx<=i*dx && b*dy<=j*dy; a+=dx, b+=dy)
                    {
                        board.board[a][b] = playerToPiece(player);
                    }
                    break;
                }
            }
        }

        white_count = black_count = 0;
        for (int i = 0; i < board.getWidth(); i++)
        {
            for (int j=0; j < board.getHeight(); j++)
            {
                if (board.board[i][j] == GameBoard.WHITE) white_count++;
                else if (board.board[i][j] == GameBoard.BLACK) black_count++;
            }
        }
    }
}


/** Function changing the active player
 * 
 */
public void changePlayer ()
{
    active_player = !active_player;
    //active_player = active_player == GameBoard.BLACK ? GameBoard.WHITE : GameBoard.BLACK;
    //other_player = other_player == GameBoard.BLACK ? GameBoard.WHITE : GameBoard.BLACK;
}

}

I really DID make a research effort, but I must admit it is not easy track down and error with a recurrent function just by googling the problem. That makes me pretty stuck with further optimizations (alpha-beta, etc.), and I really, really need to speed up now.

Can you give more info about how the incorrect output looks like, where / when exactly it happens and what the correct output should be? — IVlad, Dec 26 '13 at 09:45
For instance, the AI has done a move to the field (3,2) (4th column and third row), having 4 pieces of his own (see the picture below): http://oi41.tinypic.com/286sfg7.jpg While he could have moved to (6,4) (7th column, 5th row), taking over two black pieces to the left, and having a total of 5 pieces. I thought it might just be the fact, that he thinks his move is better long-term, but as I said, in the correct implementation it moves like I proposed. And I've observed that the algoritm tends to choose the first move it finds... — Michał Szydłowski, Dec 26 '13 at 09:58
Maybe they're the same long term and it just picks the first it finds? — IVlad, Dec 26 '13 at 10:25
To expand on my previous comment, each possible move should be assigned a score. Maybe the scores of the two moves you mentioned are the same, it just so happens that you pick this one. Maybe this move gets a higher score for some reason. You should check the scores of each move and try to see how exactly they were computed. It's hard to tell anything just by looking at the code, but there are plenty of paths you can take to debug this. There might also not be any **correct** implementation, since two implementations can use different ranking functions, and it's hard to tell which is better. — IVlad, Dec 26 '13 at 10:43
The evaluation is the same actually. But I investigated deeper the logs,and it turns out my alg works fine, it really did choose the branch I would have chosen myself, filling up the minimax tree. Don't know now how the other implementation is correct, but it doesn't matter ;) Thanks for the advice anyway, it triggered some ideas definitely. Thumb up! — Michał Szydłowski, Dec 26 '13 at 11:20

Othello/Reversi minimax algorithm glitch

0 Answers0