Genetic programming for simple games, feasible for non-experts?

Question

I have been seriously trying to create a genetic program that will evolve to play tic-tac-toe in an acceptable way. I'm using the genome to generate a function that will then take the board as input and output the result... But it's not working.

Can this program be written in less than 500 lines of code (including blank lines and documentation)? Perhaps my problem is that I'm generating AIs that are too simple.

My research

A Genetic Algorithm for Tic-Tac-Toe (very different from my approach).
http://www.tropicalcoder.com/GeneticAlgorithm.htm (too abstract).
In quite a good number of websites there are references to 'neural networks'. Are they really required?

Important disclaimers

This is NOT homework of any kind, just a personal project for the sake of learning something cool.
This is NOT a 'give me the codz plz', I am looking for high level suggestions. I explicitly don't want ready-made solutions as the answers.

Please give me some help and insight into this 'genetic programming' concept applied to this specific simple problem.

@OnABauer: I think that I am using genetic programming because quoting Wikipedia

In artificial intelligence, genetic programming (GP) is an evolutionary algorithm-based methodology inspired by biological evolution to find computer programs that perform a user-defined task.

And I am trying to generate a program (in this case function) that will perform the task of playing tic-tac-toe, you can see it because the output of the most important genetic_process function is a genome that will then be converted to a function, thus if I understood correctly this is genetic programming because the output is a function.

Program introspection and possible bugs/problems

The code runs with no errors nor crashes. The problem is that in the end what I get is an incompetent AI, that will attempt to make illegal move and be punished with losing each and every time. It is no better than random.

Possibly it is because my AI function is so simple: just making calculations on the stored values of the squares with no conditionals.

High level description

What is your chromosome meant to represent?

My cromosome rapresents a list of functions that will then be used to reduce over the array of the board stored as trinary. OK let me make an example: * Cromosome is: amMMMdsa (length of cromosome must be 8). 1. The first step is converting this to functions following the lookup at the top called LETTERS_TO_FUNCTIONS, this give the functions: [op.add,op.mul,op.mod,op.mod,op.mod,op.floordiv,op.sub,op.add]

The second step is converting the board to a trinary rapresentation. So let's say the board is "OX XOX " we will get [2, 3, 1, 1, 3, 2, 3, 1, 1]

The third step is reducing the trinary raprentation using the functions obtained above. That is best explained by the function down below:

def reduce_by_all_functions(numbers_list,functions_list):
    """
    Applies all the functions in a list to a list of numbers.

    >>> reduce_by_all_functions([3,4,6],[op.add,op.sub])
    1

    >>> reduce_by_all_functions([6,2,4],[op.mul,op.floordiv])
    3
    """

    if len(functions_list) != len(numbers_list) - 1:
        raise ValueError("The functions must be exactly one less than the numbers")
    result = numbers_list[0]
    for index,n in enumerate(numbers_list[1:]):
        result = functions_list[index](result,n)
    return result

Thus yielding the result of: 0 that means that the ai decided to go in the first square

What is your fitness function?

Luckly this is easy to answer.

def ai_fitness(genome,accuracy):
    """
    Returns how good an ai is by letting it play against a random ai many times.
    The higher the value, the best the ai
    """
    ai = from_genome_to_ai(genome)
    return decide_best_ai(ai,random_ai,accuracy)

How does your mutation work?

The son ereditates 80% of the genes from the father and 20% of genes from the mother. There is no kind of random mutation besides that.

And how is that reduce_by_all_functions() being used? I see that it takes a board and a chromosome and returns a number. How is that number used, what is it meant to represent, and... why is it being returned modulo 9?

reduce_by_all_functions() is used to actually apply the functions previously obtained by the cromosome. The number is the square the ai wants to take. It is modulo 9 because it must be between 0 and 8 because the board is 9 spaces.

My code so far:

import doctest
import random
import operator as op

SPACE = ' '
MARK_OF_PLAYER_1 = "X"
MARK_OF_PLAYER_2 = "O"
EMPTY_MARK = SPACE

BOARD_NUMBERS = """
The moves are numbered as follows:

         0 | 1 | 2
         ---------
         3 | 4 | 5
         ---------
         6 | 7 | 8
"""

WINNING_TRIPLETS = [ (0,1,2), (3,4,5), (6,7,8),
                     (0,3,6), (1,4,7), (2,5,8),
                     (0,4,8), (2,4,6) ]

LETTERS_TO_FUNCTIONS = {
    'a': op.add,
    'm': op.mul,
    'M': op.mod,
    's': op.sub,
    'd': op.floordiv
}

def encode_board_as_trinary(board):
    """
    Given a board, replaces the symbols with the numbers
    1,2,3 in order to make further processing easier.

    >>> encode_board_as_trinary("OX  XOX  ")
    [2, 3, 1, 1, 3, 2, 3, 1, 1]

    >>> encode_board_as_trinary("   OOOXXX")
    [1, 1, 1, 2, 2, 2, 3, 3, 3]
    """
    board = ''.join(board)
    board = board.replace(MARK_OF_PLAYER_1,'3')
    board = board.replace(MARK_OF_PLAYER_2,'2')
    board = board.replace(EMPTY_MARK,'1')
    return list((int(square) for square in board))

def create_random_genome(length):
    """
    Creates a random genome (that is a sequences of genes, from which
    the ai will be generated. It consists of randoom letters taken
    from the keys of LETTERS_TO_FUNCTIONS

    >>> random.seed("EXAMPLE")

    # Test is not possible because even with the same
    # seed it gives different results each run...
    """
    letters = [letter for letter in LETTERS_TO_FUNCTIONS]
    return [random.choice(letters) for _ in range(length)]

def reduce_by_all_functions(numbers_list,functions_list):
    """
    Applies all the functions in a list to a list of numbers.

    >>> reduce_by_all_functions([3,4,6],[op.add,op.sub])
    1

    >>> reduce_by_all_functions([6,2,4],[op.mul,op.floordiv])
    3
    """

    if len(functions_list) != len(numbers_list) - 1:
        raise ValueError("The functions must be exactly one less than the numbers")
    result = numbers_list[0]
    for index,n in enumerate(numbers_list[1:]):
        result = functions_list[index](result,n)
    return result

def from_genome_to_ai(genome):
    """
    Creates an AI following the rules written in the genome (the same as DNA does).
    Each letter corresponds to a function as written in LETTERS_TO_FUNCTIONS.
    The resulting ai will reduce the board using the functions obtained.

    >>> ai = from_genome_to_ai("amMaMMss")

    >>> ai("XOX   OXO")
    4
    """
    functions = [LETTERS_TO_FUNCTIONS[gene] for gene in genome]
    def ai(board):
        return reduce_by_all_functions(encode_board_as_trinary(board),functions) % 9
    return ai

def take_first_empty_ai(board):
    """
    Very simple example ai for tic-tac-toe
    that takes the first empty square.

    >>> take_first_empty_ai(' OX O XXO')
    0

    >>> take_first_empty_ai('XOX O XXO')
    3
    """
    return board.index(SPACE)

def get_legal_moves(board):
    """
    Given a tic-tac-toe board returns the indexes of all
    the squares in which it is possible to play, i.e.
    the empty squares.

    >>> list(get_legal_moves('XOX O XXO'))
    [3, 5]

    >>> list(get_legal_moves('X   O XXO'))
    [1, 2, 3, 5]
    """
    for index,square in enumerate(board):
        if square == SPACE:
            yield index

def random_ai(board):
    """
    The simplest possible tic-tac-toe 'ai', just
    randomly choses a random move.

    >>> random.seed("EXAMPLE")

    >>> random_ai('X   O XXO')
    3
    """
    legal_moves = list(get_legal_moves(board))
    return random.choice(legal_moves)


def printable_board(board):
    """
    User Interface function:
    returns an easy to understand representation
    of the board.

    """
    return """
               {} | {} | {}
               ---------
               {} | {} | {}
               ---------
               {} | {} | {}""".format(*board)


def human_interface(board):
    """
    Allows the user to play tic-tac-toe.
    Shows him the board, the board numbers and then asks
    him to select a move.
    """
    print("The board is:")
    print(printable_board(board))
    print(BOARD_NUMBERS)
    return(int(input("Your move is: ")))

def end_result(board):
    """
    Evaluates a board returning:

    0.5 if it is a tie
    1 if MARK_OF_PLAYER_1 won # default to 'X'
    0 if MARK_OF_PLAYER_2 won # default to 'O'

    else if nothing of the above applies return None

    >>> end_result('XXX   OXO')
    1

    >>> end_result(' O X X  O')
    None

    >>> end_result('OXOXXOXOO')
    0.5

    """

    if SPACE not in board:
        return 0.5
    for triplet in WINNING_TRIPLETS:
        if all(board[square] == 'X'  for square in triplet):
            return 1
        elif all(board[square] == 'O'  for square in triplet):
            return 0

def game_ended(board):
    """
    Small syntactic sugar function to if the game is ended
    i.e. no tie nor win occured
    """
    return end_result(board) is not None

def play_ai_tic_tac_toe(ai_1,ai_2):
    """
    Plays a game between two different ai-s, returning the result.
    It should be noted that this function can be used also to let the user
    play against an ai, just call it like: play_ai_tic_tac_toe(random_ai,human_interface)

    >>> play_ai_tic_tac_toe(take_first_empty_ai,take_first_empty_ai)
    1
    """

    board = [SPACE for _ in range(9)]
    PLAYER_1_WIN = 1
    PLAYER_1_LOSS = 0
    while True:
        for ai,check in ( (ai_1,MARK_OF_PLAYER_1), (ai_2,MARK_OF_PLAYER_2) ):
            move = ai(board)

            # If move is invalid you lose
            if board[move] != EMPTY_MARK:
                if check == MARK_OF_PLAYER_1:
                    return PLAYER_1_LOSS
                else:
                    return PLAYER_1_WIN

            board[move] = check
            if game_ended(board):
                return end_result(board)

def loop_play_ai_tic_tac_toe(ai_1,ai_2,games_number):
    """
    Plays games number games between ai_1 and ai_2
    """
    return sum(( play_ai_tic_tac_toe(ai_1,ai_2)) for _ in range(games_number))

def decide_best_ai(ai_1,ai_2,accuracy):
    """
    Returns the number of times the first ai is better than the second:
    ex. if the ouput is 1.4, the first ai is 1.4 times better than the second.

    >>> decide_best_ai(take_first_empty_ai,random_ai,100) > 0.80
    True
    """
    return sum((loop_play_ai_tic_tac_toe(ai_1,ai_2,accuracy//2),
               loop_play_ai_tic_tac_toe(ai_2,ai_1,accuracy//2))) / (accuracy // 2)

def ai_fitness(genome,accuracy):
    """
    Returns how good an ai is by lettting it play against a random ai many times.
    The higher the value, the best the ai
    """
    ai = from_genome_to_ai(genome)
    return decide_best_ai(ai,random_ai,accuracy)

def sort_by_fitness(genomes,accuracy):
    """
    Syntactic sugar for sorting a list of genomes based on the fitness.
    High accuracy will yield a more accurate ordering but at the cost of more
    computation time.
    """
    def fit(genome):
        return ai_fitness(genome,accuracy)

    return list(sorted(genomes, key=fit, reverse=True))
    # probable bug-fix because high fitness means better individual

def make_child(a,b):
    """
    Returns a mix of cromosome a and cromosome b.
    There is a bias towards cromosome a because I think that
    a too weird soon is going to be bad.
    """
    result = []
    for index,char_a in enumerate(a):
        char_b = b[index]
        if random.random() > 0.8:
            result.append(char_a)
        else:
            result.append(char_b)
    return result

def genetic_process(population_size,generation_number,accuracy,elite_number):
    """
    A full genetic process yielding a good tic-tac-toe ai. # not yet

    @ Parameters:
         @ population_size: the number of ai-s that you allow to be alive
              at once
         @ generation_number: the number of generations of the gentetic
         @ accuracy: how well the ai-s are ordered,
              low accuracy means that a good ai may be considered bad or
              viceversa. High accuarcy is computationally costly
         @ elite_number: the number of best programmes that get to reproduce
              at each generation.

    @ Return:
          @ A genome for a tic-tac-toe ai
    """
    pool = [create_random_genome(9-1) for _ in range(population_size)]
    for generation in range(generation_number):
        best_individuals = sort_by_fitness(pool,accuracy)[:elite_number]
        the_best = best_individuals[0]
        for good_individual in best_individuals:
            pool.append(make_child(the_best,good_individual))
        pool = sort_by_fitness(pool,accuracy)[:population_size]
    return the_best

def _test():
    """
    Tests all the script by running the

    >>> 2 + 2 # code formatted like this
    4
    """
    doctest.testmod()

def main():
    """
    A simple demo to let the user play against a genetic opponent.
    """
    print("The genetic ai is being created... please wait.")
    genetic_ai = from_genome_to_ai(genetic_process(50,4,40,25))
    play_ai_tic_tac_toe(genetic_ai,human_interface)

if __name__ == '__main__':
    main()

Just a minor note. You mention genetic programming a few times when I think you mean genetic algorithms. The two are quite different concepts and it doesn't look like you use genetic programming at all. — OnABauer, Jan 22 '15 at 08:05
How is your program not working? What results are you getting? — OnABauer, Jan 22 '15 at 08:08
The most glaring problem is that your machine for operations (from_genome_to_ai) seems incapable of producing anything useful for any purpose. There are other things that severely limit readability: too few docstrings and the ordering of the functions which seems nearly random. — msw, Jan 22 '15 at 14:47
@msw I added some docstrings, it would be nice of you to tell me how to order the functions better and how to improve `from_genome_to_ai `. — Caridorc, Jan 22 '15 at 20:35
Can you please describe in prose what your algorithm is trying to do? In more detail than, "Trying to play tic tac toe." For instance, what is your chromosome meant to represent, what is your fitness function, how does your mutation work, etc? I will be happy to try and help, because I have a good working knowledge of GAs and AI, but reading through a code dump is... inefficient. — Novak, Jan 25 '15 at 21:05
@Novak I added in the question the details you asked, I hope they are clear, if you need more informations fell free to ask. — Caridorc, Jan 25 '15 at 21:23
And how is that reduce_by_all_functions() being used? I see that it takes a board and a chromosome and returns a number. How is that number used, what is it meant to represent, and... why is it being returned modulo 9? — Novak, Jan 25 '15 at 22:06
@Novak I tried to answer, perhaps the from_genome_to_ai functions is the broken part of the programme? — Caridorc, Jan 25 '15 at 22:30

score 8 · Accepted Answer · edited May 23 '17 at 11:43

First and foremost, I am obligated to say that Tic Tac Toe is really too simple a problem to reasonably attack with a genetic program. You simply don't need the power of a GP to win Tic Tac Toe; you can solve it with a brute force lookup table, or a simple game tree.

That said, if I understand correctly, your basic notion is this:

1) Create chromosomes of length 8, where each gene is an arithmetic operation, and the 8-gene chromosome acts on each board as a board evaluation function. That is, a chromosome takes in a board representation, and spits out a number representing the goodness of that board.

It's not perfectly clear that this is what you're doing, because your board representations are each 9 integers (1, 2, 3 only) but your examples are given in terms of the "winning triples" which are 2 integers (0 through 8).

2) Start the AI up and, on the AI's turn it should get a list of all legal moves, evaluate the board per its chromosome for each legal move and... take that number, modulo 9, and use that as the next move? Surely there's some code in there to handle the case where that move is illegal....

3) Let a bunch of these chromosome representations either play a standard implementation, or play each other, and determine the fitness based on the number of wins.

4) Once a whole generation of chromosomes has been evaluated, create a new generation. It's not clear to me how you are selecting the parents from the pool, but once the parents are selected, a child is produced by just taking individual genes from the parents by an 80-20 rule.

Your overall high level strategy is sound, but there are a lot of conceptual and implementation flaws in the execution. First, let's talk about fully observable games and simple ways to make AIs for them. If the game is very simple (such as Tic Tac Toe) you can simply make a brute force minimax game tree, such as this. TTT is simple enough that even your cell phone can go all the way to the bottom of the tree very quickly. You can even solve it by brute force with a look up table: Just make a list of all board positions and the response to each one.

When the games get larger-- think checkers, chess, go-- that is no longer true, and one of the ways around this is to develop what's called a board evaluation function. It is a function which takes a board position and returns a number, usually with higher being better for one player and lower being better for the other. One then executes a search to certain acceptable depth and aims for the highest (say) board evaluation function.

This begs the question: How do we come up with the board evaluation function? Originally, one asked experts at the game to develop these function for you. There is a great paper by Chellapilla and Fogel which is similar to what you want to do for checkers-- they use neural networks to determine the board evaluation functions, and, critically, these neural networks are encoded as genomes and evolved. They are then used in search depth 4 trees. The end results are very competitive against human players.

You should read that paper.

What you are trying to do, I think, is very similar, except instead of coding a neural network as a chromosome, you're trying to code up a very restricted algebraic expression, always of the form:

((((((((arg1 op1 arg2) op2 arg3) op3 arg4) op4 arg5) op5 arg6) op6 arg7) op7 arg8) op8 arg)

... and then you're using it mod 9 to pick a move.

Now let's talk about genetic algorithms, genetic programs, and the creation of new children. The whole idea in evolutionary techniques is to combine the best attributes of two hopefully-good solutions in the hopes that they will be even better, without getting stuck in a local maximum.

Generally, this is done by touranment selection, crossover, and mutation. Tournament selection means selecting the parents proportionally to their fitness. Crossover means dividing the chromosomes into two usually contiguous regions and taking one region from one parent and the other region from the other parent. (Why contiguous? Because Holland's Schema Theorem) Mutation means occasionally changing a gene, as a means of maintaining population diversity.

Now let's look at what you're doing:

1) Your board evaluation function-- the function that your chromosome turns into, which acts on the board positions-- is highly restricted and very arbitrary. There's not much rhyme or reason to assigning 1, 2, and 3 as those numbers, but that might possibly be okay. The bigger flaw is that your functions are a terribly restricted part of the overall space of functions. They are always the same length, and the parse tree always looks the same.

There's no reason to expect anything useful to be in this restrictive space. What's needed is to come up with a scheme which allows for a much more general set of parse trees, including crossover and mutation schemes. You should look up some papers or books by John Koza for ideas on this topic.

Note that Chellapilla and Fogel have fixed forms of functions as well, but their chromosomes are substantially larger than their board representations. A checkers board has 32 playable spaces, and each space can have 5 states. But their neural network had some 85 nodes, and the chromosome comprised the connection weights of those nodes-- hundreds, if not thousands, of values.

2) Then there's this whole modulo 9 thing. I don't understand why you're doing that. Don't do that. You're just scrambling whatever information might be in your chromosomes.

3) Your function to make new children is bad. Even as a genetic algorithm, you should be dividing the chromosomes in two (at random points) and taking part of one parent from one side, and the other part from the other parent on the other side. For genetic programming, which is what you're doing, there are analogous strategies for doing crossovers on parse trees. See Koza.

You must include mutation, or you will almost certainly get sub-optimal results.

4a) If you evaluate the fitness by playing against a competent AI, then realize that your chromosomes will never, ever win. They will lose, or they will draw. A competent AI will never lose. Moreover, it is likely that your AIs will lose all the time and initial generations may all come out as equally (catastrophically) poor players. It's not impossible to get yourelf out of that hole, but it will be hard.

4b) On the other hand, if, like Chellapilla and Fogel, you play the AIs against them selves, then you'd better make certain that the AIs can play either X or O. Otherwise you're never going to make any progress at all.

5) Finally, even if all these concerns are addressed, I'm not convinced this will get great results. Note that the checkers example searches to a depth of 4, which is not a big horizon in a game of checkers that might last 20 or 30 moves.

TTT can only ever last 9 moves.

If you don't do a search tree and just go for the highest board evaluation function, you might get something that works. You might not. I'm not sure. If you search to depth 4, you might as well skip to a full search to level 9 and do this conventionally.

Excellent answer, I really enjoyed reading it, thanks. – Caridorc Jan 26 '15 at 15:37 — Caridorc, Jan 26 '15 at 15:37

Genetic programming for simple games, feasible for non-experts?

My research

Important disclaimers

Program introspection and possible bugs/problems

High level description

1 Answers1