2057

I have recently stumbled upon the game 2048. You merge similar tiles by moving them in any of the four directions to make "bigger" tiles. After each move, a new tile appears at a random empty position with a value of either 2 or 4. The game terminates when all the cells are filled and no move can merge tiles, or when you create a tile with a value of 2048.

Now, I need to follow a well-defined strategy to reach the goal, so I thought of writing a program for it.

My current algorithm:

while (!game_over) {
    for each possible move:
        count_no_of_merges_for_2-tiles and 4-tiles
    choose the move with a large number of merges
}

What I am doing is, at any point, trying to merge the tiles with values 2 and 4, that is, to keep as few 2 and 4 tiles on the board as possible. When I play this way, all the other tiles get merged automatically and the strategy seems good. A minimal sketch of this greedy rule is shown below.
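
Here is roughly what I mean, as a Python sketch (legal_moves and count_small_merges are hypothetical helpers standing in for the game engine; the latter would count the merges of 2- and 4-tiles a move causes):

def greedy_move(board):
    # Greedy rule: pick the legal move that merges the most 2- and 4-tiles.
    return max(legal_moves(board),
               key=lambda move: count_small_merges(board, move))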

But when I actually use this algorithm, I only get around 4,000 points before the game terminates. The maximum score, AFAIK, is slightly more than 20,000 points, which is far larger than my current score. Is there a better algorithm than the one above?

nitish712
    This might help! http://ov3y.github.io/2048-AI/ – cegprakash Mar 12 '14 at 06:12
  • 5
    @nitish712 by the way, your algorithm is greedy since you have `choose the move with large number of merges` which quickly lead to local optima – Khaled.K Mar 12 '14 at 12:45
  • 24
    @500-InternalServerError: If *I* were to implement an AI with alpha-beta game tree pruning, it would be assuming that the new blocks are adversarially placed. It's a worst-case assumption, but might be useful. – Charles Mar 14 '14 at 20:52
  • 9
    A fun distraction when you don't have time to aim for a high score: Try to get the lowest score possible. In theory it's alternating 2s and 4s. – Mark Hurd Mar 19 '14 at 00:31
  • @AnonymousPi it's actually theoretically impossible to have a game with not a single merge, even if you are trying to do so, as the generation ratio of two to four is 9:1. – Vogel612 Mar 26 '14 at 15:23
  • 7
    Discussion on this question's legitimacy can be found on meta: http://meta.stackexchange.com/questions/227266/is-the-2048-thread-really-on-topic – Jeroen Vannevel Mar 30 '14 at 20:37
  • 2
    If someone is looking for a Reinforcement Learning (unsupervised) implementation using Keras & Tensorflow then my code is here. https://github.com/codetiger/MachineLearning-2048 – codetiger Jul 23 '17 at 14:32
  • http://jdlm.info/articles/2018/03/18/markov-decision-process-2048.html ... if you are interested in understanding the optimal play with MDP this blog explains it succinctly. – SupposeXYZ May 14 '18 at 13:24
  • The question is to find an optimal strategy. All the answers so far deal with AI approximations. The obvious yet ignored way to get an optimal strategy is to apply mathematics to the problem. If all the patterns and all the pattern components can be expressed in an algebraic (group) way, then theorems can be made concerning the paths that lead from start to win, and how their segments combine. Unlike chess or Go, which are too difficult to analyze fully, 2048 has a mathematical simplicity that might make this analysis possible. Tetris too? And parts of Rubik's Cube? – David Spector Jun 04 '20 at 13:58
  • I almost want to downvote this just to have the # of votes be 2048 :) – ldog Jun 05 '23 at 19:26

14 Answers

1359

I developed a 2048 AI using expectimax optimization, instead of the minimax search used by @ovolve's algorithm. The AI simply performs maximization over all possible moves, followed by expectation over all possible tile spawns (weighted by the probability of the tiles, i.e. 10% for a 4 and 90% for a 2). As far as I'm aware, it is not possible to prune expectimax optimization (except to remove branches that are exceedingly unlikely), and so the algorithm used is a carefully optimized brute force search.
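
As a sketch of this alternation (in Python, not the actual C++; legal_moves, apply_move, empty_cells, place_tile, and heuristic are hypothetical stand-ins for the real engine):

def best_move(board, depth):
    # Maximization step: pick the move whose chance node scores highest.
    # (Raises ValueError if there are no legal moves, i.e. game over.)
    return max(legal_moves(board),
               key=lambda m: chance_value(apply_move(board, m), depth))

def chance_value(board, depth):
    # Expectation step: average over every possible spawn, weighted
    # 90% for a 2 and 10% for a 4; each empty cell is equally likely.
    cells = empty_cells(board)
    if depth == 0 or not cells:
        return heuristic(board)
    total = 0.0
    for cell in cells:
        total += 0.9 * move_value(place_tile(board, cell, 2), depth - 1)
        total += 0.1 * move_value(place_tile(board, cell, 4), depth - 1)
    return total / len(cells)

def move_value(board, depth):
    moves = legal_moves(board)
    if not moves:
        return heuristic(board)   # dead position
    return max(chance_value(apply_move(board, m), depth) for m in moves)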

Performance

The AI in its default configuration (max search depth of 8) takes anywhere from 10ms to 200ms to execute a move, depending on the complexity of the board position. In testing, the AI achieves an average move rate of 5-10 moves per second over the course of an entire game. If the search depth is limited to 6 moves, the AI can easily execute 20+ moves per second, which makes for some interesting watching.

To assess the score performance of the AI, I ran the AI 100 times (connected to the browser game via remote control). For each tile, here are the proportions of games in which that tile was achieved at least once:

2048: 100%
4096: 100%
8192: 100%
16384: 94%
32768: 36%

The minimum score over all runs was 124024; the maximum score achieved was 794076. The median score is 387222. The AI never failed to obtain the 2048 tile (so it never lost the game even once in 100 games); in fact, it achieved the 8192 tile at least once in every run!

Here's the screenshot of the best run:

32768 tile, score 794076

This game took 27830 moves over 96 minutes, or an average of 4.8 moves per second.

Implementation

My approach encodes the entire board (16 entries) as a single 64-bit integer (where tiles are the nybbles, i.e. 4-bit chunks). On a 64-bit machine, this enables the entire board to be passed around in a single machine register.

Bit shift operations are used to extract individual rows and columns. A single row or column is a 16-bit quantity, so a table of size 65536 can encode transformations which operate on a single row or column. For example, moves are implemented as 4 lookups into a precomputed "move effect table" which describes how each move affects a single row or column (for example, the "move right" table contains the entry "1122 -> 0023" describing how the row [2,2,4,4] becomes the row [0,0,4,8] when moved to the right).
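
To illustrate (a Python sketch of the idea, not the original C++): rows are extracted by shifting, and a full 65536-entry move table can be precomputed from a per-row merge function.

def get_row(board, i):
    return (board >> (16 * i)) & 0xFFFF          # row i as a 16-bit value

def unpack(row16):                               # leftmost tile in tiles[0]
    return [(row16 >> s) & 0xF for s in (12, 8, 4, 0)]

def pack(tiles):
    r = 0
    for t in tiles:
        r = (r << 4) | t
    return r

def slide_right(tiles):
    # Move and merge one row of exponents to the right, e.g.
    # [1,1,2,2] (= [2,2,4,4]) -> [0,0,2,3] (= [0,0,4,8]).
    vals = [t for t in tiles if t]
    out = []
    while vals:
        t = vals.pop()                           # rightmost remaining tile
        if vals and vals[-1] == t:
            vals.pop()
            out.append(min(t + 1, 15))           # 2^t + 2^t = 2^(t+1); nybble cap
        else:
            out.append(t)
    return [0] * (4 - len(out)) + out[::-1]

# One table lookup per row then replaces all shifting and merging at search time.
MOVE_RIGHT = [pack(slide_right(unpack(r))) for r in range(1 << 16)]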

Scoring is also done using table lookup. The tables contain heuristic scores computed on all possible rows/columns, and the resultant score for a board is simply the sum of the table values across each row and column.

This board representation, along with the table lookup approach for movement and scoring, allows the AI to search a huge number of game states in a short period of time (over 10,000,000 game states per second on one core of my mid-2011 laptop).

The expectimax search itself is coded as a recursive search which alternates between "expectation" steps (testing all possible tile spawn locations and values, and weighting their optimized scores by the probability of each possibility), and "maximization" steps (testing all possible moves and selecting the one with the best score). The tree search terminates when it sees a previously-seen position (using a transposition table), when it reaches a predefined depth limit, or when it reaches a board state that is highly unlikely (e.g. it was reached by getting 6 "4" tiles in a row from the starting position). The typical search depth is 4-8 moves.
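
The termination rules might look like this (a sketch layered on the helpers above; the cutoff threshold is illustrative, not the tuned value):

CPROB_MIN = 0.0001   # illustrative; six 4-spawns in a row is ~1e-6, far below this

def chance_value(board, depth, cprob, cache):
    if board in cache:
        return cache[board]            # previously seen position
    cells = empty_cells(board)
    if depth == 0 or cprob < CPROB_MIN or not cells:
        return heuristic(board)
    p = cprob / len(cells)             # probability mass per empty cell
    total = 0.0
    for cell in cells:
        total += 0.9 * move_value(place_tile(board, cell, 2), depth - 1, 0.9 * p, cache)
        total += 0.1 * move_value(place_tile(board, cell, 4), depth - 1, 0.1 * p, cache)
    cache[board] = total / len(cells)
    return cache[board]

def move_value(board, depth, cprob, cache):
    moves = legal_moves(board)
    if not moves:
        return heuristic(board)
    return max(chance_value(apply_move(board, m), depth, cprob, cache)
               for m in moves)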

Heuristics

Several heuristics are used to direct the optimization algorithm towards favorable positions. The precise choice of heuristic has a huge effect on the performance of the algorithm. The various heuristics are weighted and combined into a positional score, which determines how "good" a given board position is. The optimization search will then aim to maximize the average score of all possible board positions. The actual score, as shown by the game, is not used to calculate the board score, since it is too heavily weighted in favor of merging tiles (when delayed merging could produce a large benefit).

Initially, I used two very simple heuristics, granting "bonuses" for open squares and for having large values on the edge. These heuristics performed pretty well, frequently achieving 16384 but never getting to 32768.

Petr Morávek (@xificurk) took my AI and added two new heuristics. The first heuristic was a penalty for having non-monotonic rows and columns which increased as the ranks increased, ensuring that non-monotonic rows of small numbers would not strongly affect the score, but non-monotonic rows of large numbers hurt the score substantially. The second heuristic counted the number of potential merges (adjacent equal values) in addition to open spaces. These two heuristics served to push the algorithm towards monotonic boards (which are easier to merge), and towards board positions with lots of merges (encouraging it to align merges where possible for greater effect).
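
As a rough Python sketch of these two heuristics on a single row of tile exponents (the exact formulas and tuned weights live in the source; the values here are illustrative):

def monotonicity_penalty(row, power=4.0):
    # Penalty for a non-monotonic row of exponents; raising to a power
    # makes violations among large tiles hurt far more than small ones.
    left = sum(row[i] ** power - row[i + 1] ** power
               for i in range(3) if row[i] > row[i + 1])
    right = sum(row[i + 1] ** power - row[i] ** power
                for i in range(3) if row[i + 1] > row[i])
    return min(left, right)   # zero if the row is monotonic either way

def merges_and_empties(row):
    # The second heuristic: count potential merges and open cells.
    empties = row.count(0)
    merges = sum(1 for i in range(3)
                 if row[i] != 0 and row[i] == row[i + 1])
    return merges, empties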

Furthermore, Petr also optimized the heuristic weights using a "meta-optimization" strategy (using an algorithm called CMA-ES), where the weights themselves were adjusted to obtain the highest possible average score.

The effect of these changes is extremely significant. The algorithm went from achieving the 16384 tile around 13% of the time to achieving it over 90% of the time, and the algorithm began to achieve 32768 over 1/3 of the time (whereas the old heuristics never once produced a 32768 tile).

I believe there's still room for improvement on the heuristics. This algorithm definitely isn't yet "optimal", but I feel like it's getting pretty close.


That the AI achieves the 32768 tile in over a third of its games is a huge milestone; I would be surprised to hear of any human player achieving 32768 on the official game (i.e. without using tools like savestates or undo). I think the 65536 tile is within reach!

You can try the AI for yourself. The code is available at https://github.com/nneonneo/2048-ai.

nneonneo
  • 2
    What if you add a free-tile-maximizing heuristic? – Claudiu Apr 02 '14 at 16:29
  • 2
    @Claudiu: That's precisely what the "open square" heuristic does. In fact, it's one of the strongest heuristics. – nneonneo Apr 02 '14 at 17:18
  • 2
    @nneonneo: Whoops, totally missed it! I found it to be the same in my experience. I also did an expectimax AI for the iphone game "threes" (the one that spawned all these) and what worked best is an EV function along the lines of `(something for open square + sum of squares + 10*bottom-left + 9*bottom-second-left + 8*bottom-third-left)`. Too fun to watch these AIs play the game – Claudiu Apr 02 '14 at 17:22
  • @Claudiu: I actually started off with an AI for Threes, which actually is more sophisticated than my 2048 AI because it models the "deck" state. I took that AI and pared it down to make this 2048 AI. However, I never added a monotonicity heuristic for the Threes AI; maybe that's something that's worth adding to it. – nneonneo Apr 02 '14 at 17:25
  • New 2's seem to appear a lot more often than new 4's. Does anyone know the split? Expected %age of new 2's appearing vs expected %age of new 4's? – Rob L Apr 04 '14 at 04:43
  • 17
    @RobL: 2's appear 90% of the time; 4's appear 10% of the time. It's in the [source code](https://github.com/gabrielecirulli/2048/blob/master/js/game_manager.js#L75): `var value = Math.random() < 0.9 ? 2 : 4;`. – nneonneo Apr 04 '14 at 05:22
  • 42
    Currently porting to Cuda so the GPU does the work for even better speeds! – nimsson Apr 11 '14 at 21:54
  • I wonder if it would be worth exploiting the symmetry of the board to improve the effectiveness of your transposition table. – user1502040 Apr 13 '14 at 21:17
  • 3
    Fantastic description and beautiful compaction of data. I wonder what would happen if you added a third heuristic: reward two adjacent tiles that are 2x apart (e.g. 4096/8192). Maybe only do this for 1024 and above as the 8 depth search and the 1st heuristic already work to combine everything below that. Reward could be constant or the larger value of the two tiles, thereby emphasizing importance of keeping the largest closest. From your explanation, it appears there'd be no new code, only an adjustment to your lookup table. – Andrew Philips Apr 17 '14 at 14:41
  • It would be interesting to use your algorithm to guide the currently random selection of new block placement, to make it minimize the potential score rather – David Fraser May 25 '14 at 06:38
  • I have been studying your code and technique for an AI course, great stuff :) I've noticed that your scoring function does not account for the fact that when the computer spawns a 4, you should not be awarded 4 points (to replicate the original version). So I believe your maximum scores will be slightly off. – Brendan Annable May 27 '14 at 12:12
  • @BrendanAnnable: The AI does not use "true" score in its heuristic; the score computed is essentially for informational purposes. (When running the command-line version, a "penalty" is assigned for spawned 4s which brings the actual scores in line with the official version). The scores I posted in my answer are from the actual browser version. – nneonneo Jul 18 '14 at 18:57
  • 31
    @nneonneo I ported your code with emscripten to javascript, and it works quite well [in the browser](http://users.telenet.be/reverse_engineer/2048ai) now! Cool to watch, without the need to compile and everything... In Firefox, performance is quite good... – reverse_engineer Aug 23 '14 at 17:11
  • 1
    @reverse_engineer: That's pretty good! And yeah, I'm surprised at how well it performs. Guess the JavaScript JIT has gotten very good :) – nneonneo Aug 23 '14 at 18:43
  • 1
Have you noticed that many board positions aren't achievable? (For example, all the tiles being the same value, or a number higher than 4 being "isolated" from the rest). Taking this into account you could greatly improve the lookup table speeds! – Pabce Aug 31 '14 at 17:00
  • 1
    @Pabce I don't think that reducing the amount of entries in the lookup table arrays would increase speed significantly, since it's just a lookup at a certain offset in arrays that are probably kept entirely in L2 cache anyway... – reverse_engineer Oct 22 '14 at 08:59
  • Notice it is not possible to obtain a tile of 131072 in a 4x4 grid. Theoretical limit is 65536. Therefore, you are close. – Jérôme Pouiller May 15 '15 at 09:53
  • 7
    Theoretical limit in a 4x4 grid actually IS 131072 not 65536. However that requires getting a 4 in the right moment (i.e. the entire board filled with 4 .. 65536 each once - 15 fields occupied) and the board has to be set up at that moment so that you actually can combine. – Bodo Thiesen Jul 27 '15 at 15:16
  • @rookie: yeah, look back in time to some of the earlier commits. – nneonneo Sep 04 '15 at 04:17
  • 9
    @nneonneo You might want to check our AI, which seems even better, getting to 32k in 60% of games: https://github.com/aszczepanski/2048 – cauchy Dec 23 '15 at 17:21
  • can you represent values above 16384 with only 4 bits? – sp2danny Nov 14 '16 at 18:04
  • 1
    I can represent 2^1 up to 2^15 (32768) with this approach. The code additionally is set up to assume it can merge two 32768 tiles to form another 32768 (in reality a 65536), which would allow it to play out a situation in which it obtains 65536. – nneonneo Nov 14 '16 at 18:08
  • so how do you represent 1? – sp2danny Nov 14 '16 at 22:25
  • @sp2danny: there is no "1" in 2048. There's only 0 (represented with a 0), and 2,4,8,16,...,32768. That's 16 distinct values. – nneonneo Nov 14 '16 at 22:57
  • Great algorithm! I have two questions about your transposition table. Firstly, why do you generate it from scratch for each move? Isn't it possible to use board states evaluated from the previous move in the current move? Secondly, there are two thresholds for terminating recursion: 1) the cumulative probability of the current leaf (cprob), and 2) the depth. Have you considered the case when the transposition table has an entry that meets the minimum depth criterion, but is nearly at the probability threshold, so actually stores an answer that does not descend as deeply as you would like? – Martin Cook Nov 09 '17 at 08:11
  • @MartinCook: For the first question, largely laziness :). At the time I added the transposition table, I was aware it could be used to extend prior round computations and save time, but I wasn't sure how to actually extend leaf states in subsequent rounds efficiently. For the second question, perhaps I didn't understand it completely, but if the entry is nearly at the probability threshold, wouldn't it be "close" to terminating anyway, and thus not extremely useful? – nneonneo Nov 09 '17 at 17:15
  • @nneonneo Sorry, didn't explain well. When looking up a score from the transposition table, you ensure that it comes from a node with a depth less than or equal to the current depth. If you have both probability and depth limiting recursion however, you can't really trust depth alone as an indicator of how much computation went into a score. Consider a path where it gets a string of 4s, so it quickly approaches the probability limit. It is now one level away from terminating, but its depth is low, so future lookups on the the transposition table will assume that it is a reliable cached score. – Martin Cook Nov 09 '17 at 19:18
  • @MartinCook I see. Yes, I guess it could be a problem - I didn't really consider it. I suspect it's mitigated slightly by the fact that such positions are unlikely to be accessible via high-probability routes (although I can't necessarily prove that). Bugfixes for that issue would be totally welcome! – nneonneo Nov 10 '17 at 02:52
  • @nneonneo I'm afraid I don't have a great solution... I made my own 2048 AI code for an office challenge, with some similar design choices, including tracking probability to prune branches. Three options I thought of were 1) to ignore probability in the transposition table, 2) track both depth and probability and test against both when looking up a score, or 3) stop explicitly tracking probability in recursion, and instead increase the current depth when selecting 4s as a crude approximation. Your solution is super-polished, so I wondered if you had some good insight :) – Martin Cook Nov 11 '17 at 16:25
I am not completely sure how the algorithm works but can you manually test out the following idea: attach a laziness index to each tile which shows how long a tile has been idle. If a tile moves, its laziness index is reset. At each round, take note of all tiles which have been moved and whose laziness index has been reset. Spawn the new tile over the tile with highest laziness index (which has been moved). I have implemented the idea here for ease - https://github.com/cyc-func/Rage-2048/ I suspect that some strategy involving rotation works but I am not able to find it. – Confuse Dec 09 '17 at 08:09
  • An updated link posted above by @reverse_engineer : http://reverse-engineer.be/2048ai/ – Evg Aug 12 '18 at 13:38
  • @nneonneo in my first run it did 386932 (using chrome remote on 2488game.com site) :D Well done. – Zibri Feb 13 '19 at 19:29
  • Musician gackt got about 340000 (ex human world record), it took one week. https://ameblo.jp/gackt/entry-12332844599.html – Takahiro Waki Apr 14 '19 at 12:44
  • @TakahiroWaki He appears to be playing it on a version that allows undo - which unfortunately lets you side-step much of the randomness of the game. – nneonneo Apr 16 '19 at 05:44
  • About data storage: If the inner loop of your program handles entire board positions, then, yes, packing an entire position into a machine word is fastest. If the inner loop handles comparisons of patterns or individual tile values, then packing into a word is slowest, since bit extractions are not as fast as array references. Always try to use machine arrays for your innermost loop data references if speed is your goal. – David Spector Jun 04 '20 at 11:58
1287

I'm the author of the AI program that others have mentioned in this thread. You can view the AI in action or read the source.

Currently, the program achieves about a 90% win rate running in JavaScript in the browser on my laptop, given about 100 milliseconds of thinking time per move, so while not perfect (yet!) it performs pretty well.

Since the game is a discrete-state-space, perfect-information, turn-based game like chess and checkers, I used the same methods that have been proven to work on those games, namely minimax search with alpha-beta pruning. Since there is already a lot of info on that algorithm out there, I'll just talk about the two main heuristics that I use in the static evaluation function and which formalize many of the intuitions that other people have expressed here.

Monotonicity

This heuristic tries to ensure that the values of the tiles are all either increasing or decreasing along both the left/right and up/down directions. This heuristic alone captures the intuition that many others have mentioned, that higher valued tiles should be clustered in a corner. It will typically prevent smaller valued tiles from getting orphaned and will keep the board very organized, with smaller tiles cascading in and filling up into the larger tiles.

Here's a screenshot of a perfectly monotonic grid. I obtained this by running the algorithm with the eval function set to disregard the other heuristics and only consider monotonicity.

A perfectly monotonic 2048 board

Smoothness

The above heuristic alone tends to create structures in which adjacent tiles are decreasing in value, but of course in order to merge, adjacent tiles need to be the same value. Therefore, the smoothness heuristic just measures the value difference between neighboring tiles, trying to minimize this difference.

A commenter on Hacker News gave an interesting formalization of this idea in terms of graph theory.

Here's a screenshot of a perfectly smooth grid.

A perfectly smooth 2048 board

Free Tiles

And finally, there is a penalty for having too few free tiles, since options can quickly run out when the game board gets too cramped.
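
Taken together, a static evaluator in the spirit of these three heuristics might look like the following Python sketch (the weights are illustrative placeholders, not the tuned values from the linked source):

import math

def smoothness(grid):
    # Sum of (negated) log-value differences between adjacent tiles.
    s = 0.0
    for i in range(4):
        for j in range(4):
            if grid[i][j]:
                v = math.log2(grid[i][j])
                if j < 3 and grid[i][j + 1]:
                    s -= abs(v - math.log2(grid[i][j + 1]))
                if i < 3 and grid[i + 1][j]:
                    s -= abs(v - math.log2(grid[i + 1][j]))
    return s

def monotonicity(grid):
    # For each row/column, penalize whichever direction is violated less;
    # 0 means the line is perfectly monotonic.
    total = 0.0
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    for line in lines:
        vals = [math.log2(v) if v else 0.0 for v in line]
        incr = sum(min(0.0, vals[k + 1] - vals[k]) for k in range(3))
        decr = sum(min(0.0, vals[k] - vals[k + 1]) for k in range(3))
        total += max(incr, decr)
    return total

def evaluate(grid):
    free = sum(1 for row in grid for v in row if v == 0)
    # Placeholder weights; the real ones were chosen by experimentation.
    return 1.0 * monotonicity(grid) + 0.1 * smoothness(grid) + 2.7 * math.log(free + 1)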

And that's it! Searching through the game space while optimizing these criteria yields remarkably good performance. One advantage to using a generalized approach like this rather than an explicitly coded move strategy is that the algorithm can often find interesting and unexpected solutions. If you watch it run, it will often make surprising but effective moves, like suddenly switching which wall or corner it's building up against.

Edit:

Here's a demonstration of the power of this approach. I uncapped the tile values (so it kept going after reaching 2048) and here is the best result after eight trials.

4096

Yes, that's a 4096 alongside a 2048. =) That means it achieved the elusive 2048 tile three times on the same board.

ovolve
  • 7
    Why is this a minmax problem? There isn't an opponent to the player. – koo Mar 15 '14 at 01:09
  • 92
    You can treat the computer placing the '2' and '4' tiles as the 'opponent'. – Wei Yen Mar 15 '14 at 02:53
  • I was using the algorithm in action and it actually failed. Check http://imgur.com/RF71yaV . I was able to solve by same logic manually. – amitchhajer Mar 15 '14 at 09:07
  • 31
    @WeiYen Sure, but regarding it as a minmax problem is not faithful to the game logic, because the computer is placing tiles randomly with certain probabilities, rather than intentionally minimising the score. – koo Mar 15 '14 at 14:55
  • 1
    @koo the RNG is the opponent placing the new tiles on the board, granted it doesn't work actively against you so it's only a max problem – ratchet freak Mar 15 '14 at 21:27
  • 62
    Even though the AI is randomly placing the tiles, the goal is not to lose. Getting unlucky is the same thing as the opponent choosing the worst move for you. The "min" part means that you try to play conservatively so that there are no awful moves that you could get unlucky. – FryGuy Mar 16 '14 at 04:17
  • 8
    How ddid you decide on the weights for the different heuristics , such as smoothness and monoticity? – smk Mar 16 '14 at 05:06
  • 1
    @koo: Same for most AIs playing against humans. The human will also not try to minimize the score, yet assuming it gives usually good results. – PlasmaHH Mar 16 '14 at 11:28
  • 3
    @smk I picked the weights mostly by experimentation and observing its behavior. AFAIK there's no more principled way to do it. @ koo and others, there was a lengthy discussion on the topic of worst case vs. expected value on [Hacker news](https://news.ycombinator.com/item?id=7380317) – ovolve Mar 16 '14 at 15:07
  • 3
    Rather than monotonicity, I would argue it is better to arrange the tiles in an augmenting sequence in the shape of a snake: one row left to right, then right to left, then left to right. Check my answer to see how it looks on a perfect sequence from 2 to 4096, in order to create a 8192 tile. – user711413 Mar 16 '14 at 21:06
  • 205
    I had an idea to create a fork of 2048, where the computer instead of placing the 2s and 4s randomly uses your AI to determine where to put the values. The result: sheer impossibleness. Can be tried out here: http://sztupy.github.io/2048-Hard/ – SztupY Mar 17 '14 at 01:03
  • 35
    @SztupY Wow, this is evil. Reminds me of http://qntm.org/hatetris Hatetris, which also tries to place the piece that will improve your situation the least. – Patashu Mar 17 '14 at 02:27
  • @Patashu: yup, I had the same in mind when doing this – SztupY Mar 17 '14 at 11:52
  • 4
    Technically, it's an expectimax problem. Expectimax does subsume minimax, though. – Tony Mar 17 '14 at 13:04
  • 9
    Use this fork if you want the AI working for you. You can impress your friends! haha. http://limoragni.github.io/2048/ – limoragni Mar 18 '14 at 15:27
  • 1
One funny thing I found (maybe is not funny but actually predictable) is that if I keep pressing two arrows on the corner (say left and down, or right and top), it actually creates a perfect monotony grid. Only due to the random placement of the new tile that is created with each movement does it fail. Otherwise, it creates the perfect grid automatically. AI is required only when this strategy doesn't produce any movement. See lucky picture: http://es.tinypic.com/r/105tk6d/8 – Camilo Delvasto Mar 18 '14 at 19:58
  • 1
    I'd love to see how much genetic algorithms could improve the static evaluator. I'll have to fork the AI and give it a try! – Daniel Mar 18 '14 at 21:15
  • 2
    Could you add an option to your AI version of the game to make it uncapped, please? I'd like to see that in action. – bodo Mar 20 '14 at 09:50
  • 1
    @MaxRied: You can't win 9007199254740992 in any reasonable amount of time. Executing 1,000,000 moves per second, it will still take 120 years to finish that game. – nneonneo Mar 30 '14 at 17:11
  • 1
    I think this answer is actually misleading. It's not a minimax and the ordering of heuristics is off. Free tiles is a much better indicator of a good board then the other two mentioned. – wvdz Apr 04 '14 at 00:20
Doesn't `Smoothness` subsume `Monotonicity`? If you have an unmonotonic tiling and change it to a monotonic one, clearly you increase smoothness? – Thomas Ahle Apr 06 '14 at 13:31
  • 1
    Can you mention that, for now, 'nneonneo' answer has superior results? it is confusing that this has the most votes – Nay Apr 08 '14 at 17:16
  • Did I get it right that for each move that the player can do you consider every possible answer by the computer? So if there are 10 empty tiles you have to consider 20 possible answers. A very high branching factor for JS. How deep do you get in those 100 ms? – principal-ideal-domain Apr 15 '14 at 21:22
  • @SztupY how about variant where the AI places the 2's and 4's in the "luckiest" place possible, then changing the goal of the game to end on the lowest score possible? – Martijn Jun 03 '14 at 11:49
  • it is not an "generalized approach", an agi would be generalized approach, but not any algorithm with heuristics. – Quonux Jun 14 '14 at 00:52
  • 1
    Guys from the answer below [reached the 32k tile](https://github.com/nneonneo/2048-ai/pull/27)! – reverse_engineer Jun 16 '14 at 08:38
  • @ovolve you are awesome !. I personally feel the way 2 and 4 number are generating and placing that way is totally depends on my luck currently this is not something test user's logic .The game is great game . – dharmendra Jun 23 '14 at 14:00
  • Amateurs ;-) I did this by hand: https://www.dropbox.com/s/h419ymz4zgj9e5f/Screenshot%202015-03-20%2022.22.15.png?dl=0 – Theodore R. Smith Mar 21 '15 at 11:18
  • 1
    Could someone explain how he computes grid monotonicity? I don't understand the scoring system used in `monotonicity2()`. – dazedviper Jul 01 '16 at 12:26
168

I became interested in the idea of an AI for this game containing no hard-coded intelligence (i.e. no heuristics, scoring functions, etc.). The AI should "know" only the game rules, and "figure out" the game play. This is in contrast to most AIs (like the ones in this thread) where the game play is essentially brute force steered by a scoring function representing human understanding of the game.

AI Algorithm

I found a simple yet surprisingly good playing algorithm: To determine the next move for a given board, the AI plays the game in memory using random moves until the game is over. This is done several times while keeping track of the end game score. Then the average end score per starting move is calculated. The starting move with the highest average end score is chosen as the next move.
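
A minimal Python sketch of this rule (legal_moves, apply_move_with_score — returning the new board and the points gained — and spawn_random_tile are hypothetical engine helpers):

import random

def monte_carlo_move(board, runs=100):
    # Score each candidate first move by the average final score of
    # `runs` random playouts, then pick the best-scoring first move.
    best_move, best_avg = None, float("-inf")
    for first in legal_moves(board):
        total = 0.0
        for _ in range(runs):
            b, score = apply_move_with_score(board, first)
            b = spawn_random_tile(b)
            while True:                      # random play until game over
                moves = legal_moves(b)
                if not moves:
                    break
                b, gained = apply_move_with_score(b, random.choice(moves))
                b = spawn_random_tile(b)
                score += gained
            total += score
        avg = total / runs
        if avg > best_avg:
            best_move, best_avg = first, avg
    return best_move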

With just 100 runs (i.e. in-memory games) per move, the AI achieves the 2048 tile 80% of the time and the 4096 tile 50% of the time. Using 10000 runs gets the 2048 tile 100% of the time, the 4096 tile 70% of the time, and the 8192 tile about 1% of the time.

See it in action

The best achieved score is shown here:

best score

An interesting fact about this algorithm is that while the random-play games are unsurprisingly quite bad, choosing the best (or least bad) move leads to very good game play: A typical AI game can reach 70000 points and last 3000 moves, yet the in-memory random play games from any given position yield an average of 340 additional points in about 40 extra moves before dying. (You can see this for yourself by running the AI and opening the debug console.)

This graph illustrates this point: The blue line shows the board score after each move. The red line shows the algorithm's best random-run end game score from that position. In essence, the red values are "pulling" the blue values upwards towards them, as they are the algorithm's best guess. It's interesting to see the red line is just a tiny bit above the blue line at each point, yet the blue line continues to increase more and more.

scoring graph

I find it quite surprising that the algorithm doesn't need to actually foresee good game play in order to choose the moves that produce it.

Searching later, I found that this algorithm might be classified as a Pure Monte Carlo Tree Search algorithm.

Implementation and Links

First I created a JavaScript version, which can be seen in action here. This version can run hundreds of runs in decent time. Open the console for extra info. (source)

Later, in order to play around some more, I used @nneonneo's highly optimized infrastructure and implemented my version in C++. This version allows for up to 100000 runs per move, and even 1000000 if you have the patience. Building instructions are provided. It runs in the console and also has a remote control to play the web version. (source)

Results

Surprisingly, increasing the number of runs does not drastically improve the game play. There seems to be a limit to this strategy at around 80000 points with the 4096 tile and all the smaller ones, very close to achieving the 8192 tile. Increasing the number of runs from 100 to 100000 increases the odds of getting to this score limit (from 5% to 40%) but not of breaking through it.

Running 10000 runs with a temporary increase to 1000000 near critical positions managed to break this barrier less than 1% of the time, achieving a max score of 129892 and the 8192 tile.

Improvements

After implementing this algorithm I tried many improvements, including using the min or max scores, or a combination of min, max, and avg. I also tried using depth: instead of trying K runs per move, I tried K runs per move list of a given length ("up,up,left" for example), selecting the first move of the best-scoring move list.

Later I implemented a scoring tree that took into account the conditional probability of being able to play a move after a given move list.

However, none of these ideas showed any real advantage over the simple first idea. I left the code for these ideas commented out in the C++ code.

I did add a "Deep Search" mechanism that increased the run number temporarily to 1000000 when any of the runs managed to accidentally reach the next highest tile. This offered a score improvement.

I'd be interested to hear if anyone has other improvement ideas that maintain the domain-independence of the AI.

2048 Variants and Clones

Just for fun, I've also implemented the AI as a bookmarklet, hooking into the game's controls. This allows the AI to work with the original game and many of its variants.

This is possible due to the domain-independent nature of the AI. Some of the variants are quite distinct, such as the Hexagonal clone.

Ronenz
  • 7
    +1. As an AI student I found this really interesting. Will take a better look at this in the free time. – Isaac May 25 '14 at 22:18
  • 5
    This is amazing! I just spent hours optimizing weights for a good heuristic function for expectimax and I implement this in 3 minutes and this completely smashes it. – Brendan Annable May 29 '14 at 17:09
  • 11
    Nice use of Monte Carlo simulation. – nneonneo Jun 10 '14 at 04:22
  • 1
    Bookmarklet ftw! Great for testing features. Thanks for the work! :) – ikaruss Jun 22 '14 at 23:48
  • 7
    Watching this playing is calling for an enlightenment. This blows all heuristics and yet it works. Congratulations ! – Stéphane Gourichon Jul 23 '14 at 20:03
  • 1
    One significant improvement, with around 15 minutes of work, will be using better estimation formula, I believe. If you are using *Pure* MCTS, I would suggest using the [UCT](http://en.wikipedia.org/wiki/Monte_Carlo_tree_search#Exploration_and_exploitation) formula, which is beautifully simple, but expands the decision tree in much better way. – Adam Stelmaszczyk Mar 07 '15 at 21:31
  • 5
    By far, the most interesting solution here. – shebaw Jul 05 '15 at 07:18
  • 1
    I played around with this idea some more and did a bit of analysis if anyone is interested https://github.com/silverstar194/2048-ai-monte-carlo – John Down Jan 02 '17 at 04:35
  • 1
    Nice work @JohnDown! It would be nice of you if you referenced my work. – Ronenz Jan 03 '17 at 07:17
  • Excellent. What actually determines the best score at each node? Is it number of tile (similar value) joined? – WailenB Aug 09 '22 at 20:37
130

EDIT: This is a naive algorithm, modelling the human conscious thought process, and it gets very weak results compared to AIs that search all possibilities, since it only looks one tile ahead. It was submitted early in the response timeline.

I have refined the algorithm and beaten the game! It may fail due to simple bad luck close to the end (you are forced to move down, which you should never do, and a tile appears where your highest tile should be; just try to keep the top row filled, so moving left does not break the pattern), but basically you end up having a fixed part and a mobile part to play with. This is your objective:

Ready to finish

This is the model I chose by default.

1024 512 256 128
  8   16  32  64
  4   2   x   x
  x   x   x   x

The chosen corner is arbitrary; you basically never press one key (the forbidden move), and if you do, you press the opposite once more and try to fix it. For future tiles the model always expects the next random tile to be a 2 and appear on the opposite side to the current model (while the first row is incomplete, on the bottom right corner; once the first row is completed, on the bottom left corner).

Here is the algorithm. It wins around 80% of the time (it seems it is always possible to win with more "professional" AI techniques, though I am not sure about this).

initiateModel();

while(!game_over)
{    
    checkCornerChosen(); // Unimplemented, but it might be an improvement to change the reference point

    for each 3 possible move:
        evaluateResult()
    execute move with best score
    if no move is available, execute forbidden move and undo, recalculateModel()
 }

 evaluateResult() {
     calculateBestCurrentModel()
     calculate distance to chosen model
     store result
 }

 calculateBestCurrentModel() {
      (according to the current highest tile achieved and its distribution)
  }

A few pointers on the missing steps. Here:

model change

The model has changed due to the luck of being closer to the expected model. The model the AI is trying to achieve is

 512 256 128  x
  X   X   x   x
  X   X   x   x
  x   x   x   x

And the chain to get there has become:

 512 256  64  O
  8   16  32  O
  4   x   x   x
  x   x   x   x

The O's represent forbidden spaces...

So it will press right, then right again, then (right or up, depending on where the 4 has been created), and then will proceed to complete the chain until it gets:

Chain completed

So now the model and chain are back to:

 512 256 128  64
  4   8  16   32
  X   X   x   x
  x   x   x   x

Second pointer: it has had bad luck and its main spot has been taken. It is likely that it will fail, but it can still achieve it:

(screenshot: the main spot has been taken)

Here the model and chain is:

  O 1024 512 256
  O   O   O  128
  8  16   32  64
  4   x   x   x

When it manages to reach the 128, a whole row is gained again:

  O 1024 512 256
  x   x  128 128
  x   x   x   x
  x   x   x   x
Daren
  • `execute move with best score` how can you evaluate the best score out of the possible next states ? – Khaled.K Mar 12 '14 at 17:05
  • the heuristic is defined in `evaluateResult` you basically try to get closest to the best possible scenario. – Daren Mar 12 '14 at 17:12
  • @Daren I'm waiting for your detailed specifics – piedpiper Mar 12 '14 at 22:22
  • @ashu I'm working on it, unexpected circumstances have left me without time to finish it. Meanwhile I have improved the algorithm and it now solves it 75% of the time. – Daren Mar 13 '14 at 09:51
  • @ashu finally got home, and it is completed! – Daren Mar 13 '14 at 20:12
  • 14
    What I really like about this strategy is that I am able to use it when playing the game manually, it got me up to 37k points. – Cephalopod Apr 01 '14 at 18:43
  • Can you make some statistics on how often you get 1024, 2048, 4096 etc.? I think that's a good measure for comparing the different engines. Together with movespeed of course. – Thomas Ahle Apr 08 '14 at 21:47
  • mine sucks compared to the real others... :( I am tempted to delete the answer. It might be the best of the naive, still so much worse than typical AI implementations. This was posted at the very beginning of the question. nneonneo answer's the current best IMHO. – Daren Apr 09 '14 at 10:53
  • @Daren Have you hosted your code in any external repositories(github, code.google..)? Please paste link if you have done. – Antony Thomas May 06 '14 at 02:27
  • @Daren are you crazy, don't delete ! for science dude. don't be ashamed that you haven't used a inityminmax-o-duplex-solver™ whatever with reinforcement learning; simple solutions makes often the best business sense. Also what Cephalopod said. – v.oddou Dec 26 '18 at 03:26
102

I copy here the content of a post on my blog


The solution I propose is very simple and easy to implement. Even so, it has reached a score of 131040. Several benchmarks of the algorithm's performance are presented below.

Score

Algorithm

Heuristic scoring algorithm

The assumption on which my algorithm is based is rather simple: if you want to achieve a higher score, the board must be kept as tidy as possible. In particular, the optimal setup is given by a linear and monotonically decreasing order of the tile values. This intuition also gives you an upper bound for a tile value: 2^n, where n is the number of tiles on the board.

(There's a possibility to reach the 131072 tile if the 4-tile is randomly generated instead of the 2-tile when needed)

Two possible ways of organizing the board are shown in the following images:

(two example board orderings)

To enforce the ordering of the tiles in monotonically decreasing order, the score is computed as the sum of the linearized values on the board multiplied by the terms of a geometric sequence with common ratio r < 1:

score(path) = sum over i of v_i * r^i

Several linear paths can be evaluated at once; the final score is the maximum over all paths, as in the sketch below.
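
For illustration, here is a Python sketch of scoring one fixed snake path with ratio r (the real heuristic takes the maximum over several such paths; the path and ratio here are illustrative):

# One "snake" path over a 4x4 grid, listed from the top-left corner.
SNAKE = [(0, 0), (0, 1), (0, 2), (0, 3),
         (1, 3), (1, 2), (1, 1), (1, 0),
         (2, 0), (2, 1), (2, 2), (2, 3),
         (3, 3), (3, 2), (3, 1), (3, 0)]

def path_score(grid, path=SNAKE, r=0.25):
    # Weight each tile value by r^i along the path.
    return sum(grid[i][j] * (r ** k) for k, (i, j) in enumerate(path))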

Decision rule

The decision rule implemented is not very smart; the code in Python is presented here:

# Excerpt from the AI class (requires "import copy"; the board object
# comes from the term2048 engine).
@staticmethod
def nextMove(board,recursion_depth=3):
    m,s = AI.nextMoveRecur(board,recursion_depth,recursion_depth)
    return m

@staticmethod
def nextMoveRecur(board,depth,maxDepth,base=0.9):
    bestScore = -1.
    bestMove = 0
    for m in range(1,5):
        if(board.validMove(m)):
            newBoard = copy.deepcopy(board)
            newBoard.move(m,add_tile=True)

            score = AI.evaluate(newBoard)
            if depth != 0:
                my_m,my_s = AI.nextMoveRecur(newBoard,depth-1,maxDepth)
                score += my_s*pow(base,maxDepth-depth+1)

            if(score > bestScore):
                bestMove = m
                bestScore = score
    return (bestMove,bestScore);

An implementation of minimax or expectiminimax will surely improve the algorithm. Obviously, a more sophisticated decision rule will slow down the algorithm, and it will require some time to implement. I will try a minimax implementation in the near future. (Stay tuned.)

Benchmark

  • T1 - 121 tests - 8 different paths - r=0.125
  • T2 - 122 tests - 8 different paths - r=0.25
  • T3 - 132 tests - 8 different paths - r=0.5
  • T4 - 211 tests - 2 different paths - r=0.125
  • T5 - 274 tests - 2 different paths - r=0.25
  • T6 - 211 tests - 2 different paths - r=0.5

(benchmark plots for the test configurations)

In the case of T2, four tests in ten generate the 4096 tile, with an average score of about 42000.

Code

The code can be found on GitHub at the following link: https://github.com/Nicola17/term2048-AI. It is based on term2048 and is written in Python. I will implement a more efficient version in C++ as soon as possible.

Nicola Pezzotti
  • Not bad, your illustration has given me an idea, of taking the merge vectors into evaluation – Khaled.K Apr 08 '14 at 08:45
  • Hello. Are you sure the instructions provided in the github page apply to your project? I want to give it a try but those seem to be the instructions for the original playable game and not the AI autorun. Could you update those? Thanks. – JD Gamboa Nov 06 '18 at 03:43
49

My attempt uses expectimax like other solutions above, but without bitboards. Nneonneo's solution can check 10 million moves, which is approximately a depth of 4 with 6 tiles left and 4 moves possible ((2*6*4)^4). In my case, this depth takes too long to explore, so I adjust the depth of the expectimax search according to the number of free tiles left:

depth = free > 7 ? 1 : (free > 4 ? 2 : 3)

The scores of the boards are computed with the weighted sum of the square of the number of free tiles and the dot product of the 2D grid with this:

[[10,8,7,6.5],
 [.5,.7,1,3],
 [-.5,-1.5,-1.8,-2],
 [-3.8,-3.7,-3.5,-3]]

which encourages the tiles to be organized in descending order, in a sort of snake, from the top-left tile.

The code is below, or on GitHub:

var n = 4,
 M = new MatrixTransform(n);

var ai = {weights: [1, 1], depth: 1}; // depth=1 by default, but we adjust it on every prediction according to the number of free tiles

var snake= [[10,8,7,6.5],
            [.5,.7,1,3],
            [-.5,-1.5,-1.8,-2],
            [-3.8,-3.7,-3.5,-3]]
snake=snake.map(function(a){return a.map(Math.exp)})

initialize(ai)

function run(ai) {
 var p;
 while ((p = predict(ai)) != null) {
  move(p, ai);
 }
 //console.log(ai.grid , maxValue(ai.grid))
 ai.maxValue = maxValue(ai.grid)
 console.log(ai)
}

function initialize(ai) {
 ai.grid = [];
 for (var i = 0; i < n; i++) {
  ai.grid[i] = []
  for (var j = 0; j < n; j++) {
   ai.grid[i][j] = 0;
  }
 }
 rand(ai.grid)
 rand(ai.grid)
 ai.steps = 0;
}

function move(p, ai) { //0:up, 1:right, 2:down, 3:left
 var newgrid = mv(p, ai.grid);
 if (!equal(newgrid, ai.grid)) {
  //console.log(stats(newgrid, ai.grid))
  ai.grid = newgrid;
  try {
   rand(ai.grid)
   ai.steps++;
  } catch (e) {
   console.log('no room', e)
  }
 }
}

function predict(ai) {
 var free = freeCells(ai.grid);
 ai.depth = free > 7 ? 1 : (free > 4 ? 2 : 3);
 var root = {path: [],prob: 1,grid: ai.grid,children: []};
 var x = expandMove(root, ai)
 //console.log("number of leaves", x)
 //console.log("number of leaves2", countLeaves(root))
 if (!root.children.length) return null
 var values = root.children.map(expectimax);
 var mx = max(values);
 return root.children[mx[1]].path[0]

}

function countLeaves(node) {
 var x = 0;
 if (!node.children.length) return 1;
 for (var n of node.children)
  x += countLeaves(n);
 return x;
}

function expectimax(node) {
 if (!node.children.length) {
  return node.score
 } else {
  var values = node.children.map(expectimax);
  if (node.prob) { //we are at a max node
   return Math.max.apply(null, values)
  } else { // we are at a random node
   var avg = 0;
   for (var i = 0; i < values.length; i++)
    avg += node.children[i].prob * values[i]
   return avg / (values.length / 2)
  }
 }
}

function expandRandom(node, ai) {
 var x = 0;
 for (var i = 0; i < node.grid.length; i++)
  for (var j = 0; j < node.grid.length; j++)
   if (!node.grid[i][j]) {
    var grid2 = M.copy(node.grid),
     grid4 = M.copy(node.grid);
    grid2[i][j] = 2;
    grid4[i][j] = 4;
    var child2 = {grid: grid2,prob: .9,path: node.path,children: []};
    var child4 = {grid: grid4,prob: .1,path: node.path,children: []}
    node.children.push(child2)
    node.children.push(child4)
    x += expandMove(child2, ai)
    x += expandMove(child4, ai)
   }
 return x;
}

function expandMove(node, ai) { // node={grid,path,score}
 var isLeaf = true,
  x = 0;
 if (node.path.length < ai.depth) {
  for (var move of[0, 1, 2, 3]) {
   var grid = mv(move, node.grid);
   if (!equal(grid, node.grid)) {
    isLeaf = false;
    var child = {grid: grid,path: node.path.concat([move]),children: []}
    node.children.push(child)
    x += expandRandom(child, ai)
   }
  }
 }
 if (isLeaf) node.score = dot(ai.weights, stats(node.grid))
 return isLeaf ? 1 : x;
}



var cells = []
var table = document.querySelector("table");
for (var i = 0; i < n; i++) {
 var tr = document.createElement("tr");
 cells[i] = [];
 for (var j = 0; j < n; j++) {
  cells[i][j] = document.createElement("td");
  tr.appendChild(cells[i][j])
 }
 table.appendChild(tr);
}

function updateUI(ai) {
 cells.forEach(function(a, i) {
  a.forEach(function(el, j) {
   el.innerHTML = ai.grid[i][j] || ''
  })
 });
}


updateUI(ai);
updateHint(predict(ai));

function runAI() {
 var p = predict(ai);
 if (p != null && ai.running) {
  move(p, ai);
  updateUI(ai);
  updateHint(p);
  requestAnimationFrame(runAI);
 }
}
runai.onclick = function() {
 if (!ai.running) {
  this.innerHTML = 'stop AI';
  ai.running = true;
  runAI();
 } else {
  this.innerHTML = 'run AI';
  ai.running = false;
  updateHint(predict(ai));
 }
}


function updateHint(dir) {
 hintvalue.innerHTML = ['↑', '→', '↓', '←'][dir] || '';
}

document.addEventListener("keydown", function(event) {
 if (!event.target.matches('.r *')) return;
 event.preventDefault(); // avoid scrolling
 if (event.which in map) {
  move(map[event.which], ai)
  console.log(stats(ai.grid))
  updateUI(ai);
  updateHint(predict(ai));
 }
})
var map = {
 38: 0, // Up
 39: 1, // Right
 40: 2, // Down
 37: 3, // Left
};
init.onclick = function() {
 initialize(ai);
 updateUI(ai);
 updateHint(predict(ai));
}


function stats(grid, previousGrid) {

 var free = freeCells(grid);

 var c = dot2(grid, snake);

 return [c, free * free];
}

function dist2(a, b) { //squared 2D distance
 return Math.pow(a[0] - b[0], 2) + Math.pow(a[1] - b[1], 2)
}

function dot(a, b) {
 var r = 0;
 for (var i = 0; i < a.length; i++)
  r += a[i] * b[i];
 return r
}

function dot2(a, b) {
 var r = 0;
 for (var i = 0; i < a.length; i++)
  for (var j = 0; j < a[0].length; j++)
   r += a[i][j] * b[i][j]
 return r;
}

function product(a) {
 return a.reduce(function(v, x) {
  return v * x
 }, 1)
}

function maxValue(grid) {
 return Math.max.apply(null, grid.map(function(a) {
  return Math.max.apply(null, a)
 }));
}

function freeCells(grid) {
 return grid.reduce(function(v, a) {
  return v + a.reduce(function(t, x) {
   return t + (x == 0)
  }, 0)
 }, 0)
}

function max(arr) { // return [value, index] of the max
 var m = [-Infinity, null];
 for (var i = 0; i < arr.length; i++) {
  if (arr[i] > m[0]) m = [arr[i], i];
 }
 return m
}

function min(arr) { // return [value, index] of the min
 var m = [Infinity, null];
 for (var i = 0; i < arr.length; i++) {
  if (arr[i] < m[0]) m = [arr[i], i];
 }
 return m
}

function maxScore(nodes) {
 var min = {
  score: -Infinity,
  path: []
 };
 for (var node of nodes) {
  if (node.score > min.score) min = node;
 }
 return min;
}


function mv(k, grid) {
 var tgrid = M.itransform(k, grid);
 for (var i = 0; i < tgrid.length; i++) {
  var a = tgrid[i];
  for (var j = 0, jj = 0; j < a.length; j++)
   if (a[j]) a[jj++] = (j < a.length - 1 && a[j] == a[j + 1]) ? 2 * a[j++] : a[j]
  for (; jj < a.length; jj++)
   a[jj] = 0;
 }
 return M.transform(k, tgrid);
}

function rand(grid) {
 var r = Math.floor(Math.random() * freeCells(grid)),
  _r = 0;
 for (var i = 0; i < grid.length; i++) {
  for (var j = 0; j < grid.length; j++) {
   if (!grid[i][j]) {
    if (_r == r) {
     grid[i][j] = Math.random() < .9 ? 2 : 4
    }
    _r++;
   }
  }
 }
}

function equal(grid1, grid2) {
 for (var i = 0; i < grid1.length; i++)
  for (var j = 0; j < grid1.length; j++)
   if (grid1[i][j] != grid2[i][j]) return false;
 return true;
}

function conv44valid(a, b) {
 var r = 0;
 for (var i = 0; i < 4; i++)
  for (var j = 0; j < 4; j++)
   r += a[i][j] * b[3 - i][3 - j]
 return r
}

function MatrixTransform(n) {
 var g = [],
  ig = [];
 for (var i = 0; i < n; i++) {
  g[i] = [];
  ig[i] = [];
  for (var j = 0; j < n; j++) {
   g[i][j] = [[j, i],[i, n-1-j],[j, n-1-i],[i, j]]; // transformation matrix in the 4 directions g[i][j] = [up, right, down, left]
   ig[i][j] = [[j, i],[i, n-1-j],[n-1-j, i],[i, j]]; // the inverse tranformations
  }
 }
 this.transform = function(k, grid) {
  return this.transformer(k, grid, g)
 }
 this.itransform = function(k, grid) { // inverse transform
  return this.transformer(k, grid, ig)
 }
 this.transformer = function(k, grid, mat) {
  var newgrid = [];
  for (var i = 0; i < grid.length; i++) {
   newgrid[i] = [];
   for (var j = 0; j < grid.length; j++)
    newgrid[i][j] = grid[mat[i][j][k][0]][mat[i][j][k][1]];
  }
  return newgrid;
 }
 this.copy = function(grid) {
  return this.transform(3, grid)
 }
}
body {
 font-family: Arial;
}
table, th, td {
 border: 1px solid black;
 margin: 0 auto;
 border-collapse: collapse;
}
td {
 width: 35px;
 height: 35px;
 text-align: center;
}
button {
 margin: 2px;
 padding: 3px 15px;
 color: rgba(0,0,0,.9);
}
.r {
 display: flex;
 align-items: center;
 justify-content: center;
 margin: .2em;
 position: relative;
}
#hintvalue {
 font-size: 1.4em;
 padding: 2px 8px;
 display: inline-flex;
 justify-content: center;
 width: 30px;
}
<table title="press arrow keys"></table>
<div class="r">
    <button id=init>init</button>
    <button id=runai>run AI</button>
    <span id="hintvalue" title="Best predicted move to do, use your arrow keys" tabindex="-1"></span>
</div>
caub
49

I am the author of a 2048 controller that scores better than any other program mentioned in this thread. An efficient implementation of the controller is available on GitHub. In a separate repo there is also the code used for training the controller's state evaluation function. The training method is described in the paper.

The controller uses expectimax search with a state evaluation function learned from scratch (without human 2048 expertise) by a variant of temporal difference learning (a reinforcement learning technique). The state-value function uses an n-tuple network, which is basically a weighted linear function of patterns observed on the board. It involves more than 1 billion weights in total.
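
As a rough Python sketch of the n-tuple idea (the tuples, table sizes, and learning rate here are illustrative, not those from the paper):

# Each tuple is a fixed set of board cells; the tile exponents at those
# cells index into a learned weight table, one table per tuple.
TUPLES = [((0, 0), (0, 1), (0, 2), (0, 3)),     # a row
          ((0, 0), (1, 0), (2, 0), (3, 0))]     # a column
K = 16                                           # exponents 0..15

weights = [[0.0] * (K ** len(t)) for t in TUPLES]

def tuple_index(grid, cells):
    idx = 0
    for (i, j) in cells:
        idx = idx * K + grid[i][j]               # grid stores exponents
    return idx

def value(grid):
    # The state value is the sum of the looked-up weights.
    return sum(weights[n][tuple_index(grid, cells)]
               for n, cells in enumerate(TUPLES))

def td_update(grid, target, alpha=0.001):
    # TD-style update: nudge every active weight toward the target value.
    err = target - value(grid)
    for n, cells in enumerate(TUPLES):
        weights[n][tuple_index(grid, cells)] += alpha * err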

Performance

At 1 move/s: 609104 (100 games average)

At 10 moves/s: 589355 (300 games average)

At 3-ply (ca. 1500 moves/s): 511759 (1000 games average)

The tile statistics for 10 moves/s are as follows:

2048: 100%
4096: 100%
8192: 100%
16384: 97%
32768: 64%
32768,16384,8192,4096: 10%

(The last line means having the given tiles at the same time on the board).

For 3-ply:

2048: 100%
4096: 100%
8192: 100%
16384: 96%
32768: 54%
32768,16384,8192,4096: 8%

However, I have never observed it obtaining the 65536 tile.

cauchy
  • 6
    Pretty impressive result. However could you possibly update the answer to explain (roughly, in simple terms... I'm sure the full details would be too long to post here) how your program achieves this? As in a rough explanation of how the learning algorithm works? – Cedric Mamo Jan 27 '16 at 20:32
28

I think I found an algorithm which works quite well, as I often reach scores over 10000, my personal best being around 16000. My solution does not aim at keeping the biggest numbers in a corner, but at keeping them in the top row.

Please see the code below:

while( !game_over ) {
    move_direction=up;
    if( !move_is_possible(up) ) {
        if( move_is_possible(right) && move_is_possible(left) ){
            if( number_of_empty_cells_after_moves(left,up) > number_of_empty_cells_after_moves(right,up) ) 
                move_direction = left;
            else
                move_direction = right;
        } else if ( move_is_possible(left) ){
            move_direction = left;
        } else if ( move_is_possible(right) ){
            move_direction = right;
        } else {
            move_direction = down;
        }
    }
    do_move(move_direction);
}
Venugopal Madathil
  • 5
    I ran 100,000 games testing this versus the trivial cyclic strategy "up, right, up, left, ..." (and down if it must). The cyclic strategy finished an "average tile score" of `770.6`, while this one got just `396.7`. Do you have a guess why that might be? I'm thinking it does too many ups, even when left or right would merge a lot more. – Thomas Ahle Apr 06 '14 at 13:49
  • 1
    The tiles tend to stack in incompatible ways if they are not shifted in multiple directions. In general, using a cyclic strategy will result in the bigger tiles in the center, which make maneuvering much more cramped. – bcdan Jun 23 '15 at 23:44
25

There is already an AI implementation for this game here. Excerpt from README:

The algorithm is iterative deepening depth first alpha-beta search. The evaluation function tries to keep the rows and columns monotonic (either all decreasing or increasing) while minimizing the number of tiles on the grid.

There is also a discussion on Hacker News about this algorithm that you may find useful.

baltazar
  • 5
    This should be the top answer, but it would be nice to add more details about the implementation: e.g. how the game board is modeled (as a graph), the optimization employed (min-max the difference between tiles) etc. – Alceu Costa Mar 13 '14 at 19:44
  • 3
    **For future readers:** This is the same program explained by its author (ovolve) in the [second-topmost answer](https://stackoverflow.com/a/22389702/2688027) here. This answer, and other mentions of ovolve's program in this discussion, prompted ovolve to appear and write up how his algorithm worked; that answer now has a score of 1200. – MultiplyByZer0 Dec 30 '18 at 06:49
23

Algorithm

while(!game_over)
{
    for each possible move:
        evaluate next state

    choose the maximum evaluation
}

Evaluation

Evaluation =
    128 (Constant)
    + (Number of Spaces x 128)
    + Sum of faces adjacent to a space { (1/face) x 4096 }
    + Sum of other faces { log(face) x 4 }
    + (Number of possible next moves x 256)
    + (Number of aligned values x 2)

Evaluation Details

128 (Constant)

This is a constant, used as a base-line and for other uses like testing.

+ (Number of Spaces x 128)

More spaces make the state more flexible. We multiply by 128 (which is the median) since a grid completely filled with 128-faces would be an optimal yet impossible state.

+ Sum of faces adjacent to a space { (1/face) x 4096 }

Here we evaluate faces that still have the possibility of merging (those adjacent to a space), by scoring them inversely: a 2-tile is valued at 2048, while a 2048-tile is valued at 2.

+ Sum of other faces { log(face) x 4 }

Here we still need to account for stacked values, but in a lesser way that doesn't interrupt the flexibility parameters; taking the log base 2, each term of the sum lies in the range [4, 44] (log2(2) x 4 = 4 up to log2(2048) x 4 = 44).

+ (Number of possible next moves x 256)

A state is more flexible if it has more freedom of possible transitions.

+ (Number of aligned values x 2)

This is a simplified check of the possibility of having merges within that state, without making a look-ahead.

Note: The constants can be tweaked.
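
For concreteness, here is a rough Python sketch of this evaluation (a 4x4 list-of-lists grid with 0 for empty cells is just one possible representation):

import math

def evaluate(grid):
    n = len(grid)
    score = 128.0                                     # constant baseline
    score += sum(row.count(0) for row in grid) * 128  # number of spaces
    aligned = 0
    for r in range(n):
        for c in range(n):
            v = grid[r][c]
            if v == 0:
                continue
            if c + 1 < n and grid[r][c + 1] == v:
                aligned += 1                          # horizontally aligned pair
            if r + 1 < n and grid[r + 1][c] == v:
                aligned += 1                          # vertically aligned pair
            near_space = any(
                0 <= r + dr < n and 0 <= c + dc < n and grid[r + dr][c + dc] == 0
                for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)))
            if near_space:
                score += (1.0 / v) * 4096             # inverse scoring next to a space
            else:
                score += math.log2(v) * 4             # log2 keeps each term in [4, 44]
    score += count_possible_moves(grid) * 256         # possible next moves
    score += aligned * 2                              # aligned values
    return score

def count_possible_moves(grid):
    # a direction is possible if some tile can slide into a space or merge
    # with an equal neighbour in that direction
    n = len(grid)
    def possible(dr, dc):
        return any(grid[r][c] != 0
                   and 0 <= r + dr < n and 0 <= c + dc < n
                   and grid[r + dr][c + dc] in (0, grid[r][c])
                   for r in range(n) for c in range(n))
    return sum(possible(dr, dc) for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)))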

Khaled.K
  • 5,828
  • 1
  • 33
  • 51
  • 2
    I will edit this later, to add a live code @nitish712 – Khaled.K Mar 12 '14 at 20:16
  • 9
    What is the win% of this algorithm? – cegprakash Mar 15 '14 at 06:17
  • Why do you need a `constant`? If all you are doing is comparing scores, how does that affect the outcome of those comparisons? – bcdan Jun 23 '15 at 23:45
  • @bcdan the heuristic (aka comparison-score) depends on comparing the expected value of future state, similar to how chess heuristics work, except this is a linear heuristic, since we don't build a tree to know the best next N moves – Khaled.K Jun 24 '15 at 09:12
13

This is not a direct answer to OP's question; it's more a write-up of the experiments I have tried so far to solve the same problem, the results I obtained, and some observations I want to share. I am curious whether we can get some further insights from this.

I just tried my minimax implementation with alpha-beta pruning, with the search-tree depth cutoff at 3 and at 5. I was trying to solve the same problem for a 4x4 grid as a project assignment for the edX course ColumbiaX: CSMM.101x Artificial Intelligence (AI).

I applied a convex combination (trying different heuristic weights) of a couple of heuristic evaluation functions, chosen mainly from intuition and from the ones discussed above (a minimal sketch follows the list):

  1. Monotonicity
  2. Free Space Available
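
For concreteness, here is a minimal Python sketch of such a convex combination (the helper definitions are simplified stand-ins, not my actual assignment code):

def free_space(grid):
    # heuristic 2: the number of empty cells (0 denotes an empty cell)
    return sum(row.count(0) for row in grid)

def monotonicity(grid):
    # heuristic 1: count rows and columns that are entirely non-increasing
    # or non-decreasing, ignoring empty cells
    lines = [list(row) for row in grid] + [list(col) for col in zip(*grid)]
    count = 0
    for line in lines:
        vals = [v for v in line if v]
        if vals == sorted(vals) or vals == sorted(vals, reverse=True):
            count += 1
    return count

def heuristic(grid, w=0.7):
    # convex combination: the weights w and (1 - w) are non-negative and sum to 1
    return w * monotonicity(grid) + (1 - w) * free_space(grid)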

In my case, the computer player is completely random, but I still assumed adversarial settings and implemented the AI player agent as the max player.

I have a 4x4 grid for playing the game.

Observation:

If I assign too much weight to either the first or the second heuristic function, in both cases the scores the AI player gets are low. I played with many possible weight assignments for the heuristic functions and took a convex combination, but the AI player is very rarely able to score 2048. Most of the time it stops at 1024 or 512.

I also tried the corner heuristic, but for some reason it makes the results worse; any intuition why?

Also, I tried increasing the search depth cutoff from 3 to 5 (I can't increase it more, since searching that space exceeds the allowed time even with pruning) and added one more heuristic that looks at the values of adjacent tiles and gives more points if they are mergeable, but I am still not able to get 2048.
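
That adjacent-tile heuristic is essentially a count of the merges available in a single move; a rough Python sketch (again a simplified stand-in, not my actual code):

def mergeable_pairs(grid):
    # count horizontally and vertically adjacent equal tiles
    n = len(grid)
    pairs = 0
    for r in range(n):
        for c in range(n):
            v = grid[r][c]
            if v == 0:
                continue
            if c + 1 < n and grid[r][c + 1] == v:
                pairs += 1
            if r + 1 < n and grid[r + 1][c] == v:
                pairs += 1
    return pairs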

I think it would be better to use expectimax instead of minimax, but still I want to solve this problem with minimax only and obtain high scores such as 2048 or 4096. I am not sure whether I am missing anything.

The animation below shows the last few steps of the game played by the AI agent against the computer player:

[animation: the last few steps of the game]

Any insights will be very helpful; thanks in advance. (Here are the links to my blog post for the article: https://sandipanweb.wordpress.com/2017/03/06/using-minimax-with-alpha-beta-pruning-and-heuristic-evaluation-to-solve-2048-game-with-computer/ and the YouTube video: https://www.youtube.com/watch?v=VnVFilfZ0r4)

The following animation shows the last few steps of a game in which the AI player agent reached the 2048 score, this time with the absolute-value heuristic added as well:

[animation: the AI player agent reaching 2048]

The following figures show the game tree explored by the AI player agent, treating the computer as the adversary, for just a single step:

[figures: game-tree snapshots for a single step]

Community
  • 1
  • 1
Sandipan Dey
  • 21,482
  • 2
  • 51
  • 63
9

I wrote a 2048 solver in Haskell, mainly because I'm learning this language right now.

My implementation of the game differs slightly from the actual game: a new tile is always a '2' (rather than 90% 2 and 10% 4), and the new tile is not random but always appears at the first available position from the top left. This variant is also known as Det 2048.

As a consequence, this solver is deterministic.

I used an exhaustive algorithm that favours empty tiles. It performs pretty quickly for depths 1-4, but at depth 5 it gets rather slow, at around 1 second per move.

Below is the code implementing the solving algorithm. The grid is represented as a 16-length array of Integers, and scoring is done simply by counting the number of empty squares.

-- takeTurn applies move x (0..3) and yields [] for an illegal move; it is
-- defined in the linked repository, along with the small helpers used here
-- (plausible definitions: maxTuple = snd . maximum, maxInList = maximum).
bestMove :: Int -> [Int] -> Int
bestMove depth grid = maxTuple [ (gridValue depth (takeTurn x grid), x) | x <- [0..3], takeTurn x grid /= [] ]

gridValue :: Int -> [Int] -> Int
gridValue _ [] = -1                            -- illegal move, prune this branch
gridValue 0 grid = length $ filter (==0) grid  -- <= SCORING: count empty squares
gridValue depth grid = maxInList [ gridValue (depth-1) (takeTurn x grid) | x <- [0..3] ]

I think it's quite successful given its simplicity. The result it reaches when starting with an empty grid and solving at depth 5 is:

Move 4006
[2,64,16,4]
[16,4096,128,512]
[2048,64,1024,16]
[2,4,16,2]

Game Over

Source code can be found here: https://github.com/popovitsj/2048-haskell

nneonneo
  • 171,345
  • 36
  • 312
  • 383
wvdz
  • 16,251
  • 4
  • 53
  • 90
  • Try to extend it with the actual rules. It's a good challenge in learning about Haskell's random generator! – Thomas Ahle Apr 09 '14 at 07:08
  • I got very frustrated with Haskell trying to do that, but I'm probably gonna give it a second try! I did find that the game gets considerably easier without the randomization. – wvdz Apr 09 '14 at 08:19
  • Without randomization I'm pretty sure you could find a way to always get 16k or 32k. However randomization in Haskell is not that bad, you just need a way to pass around the `seed`. Either do it explicitly, or with the Random monad. – Thomas Ahle Apr 09 '14 at 09:06
  • Refining the algorithm so that it always reaches 16k/32k for a non-random game might be another interesting challenge... – wvdz Apr 09 '14 at 10:17
  • You are right, it's harder than I thought. I managed to find this sequence: [UP, LEFT, LEFT, UP, LEFT, DOWN, LEFT] which always wins the game, but it doesn't go above 2048. (In case of no legal move, the cycle algorithm just chooses the next one in clockwise order) – Thomas Ahle Apr 09 '14 at 12:39
  • You do *not* have to implement randomization. If you do a tree search (a la expectimax) you can just perform expectation across all random states. – nneonneo Jun 10 '14 at 04:20
3

This algorithm is not optimal for winning the game, but it is fairly optimal in terms of performance and the amount of code needed:

  if(can move neither right, up or down)
    direction = left
  else
  {
    do
    {
      direction = random from (right, down, up)
    }
    while(can not move in "direction")
  }
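
The same idea in runnable Python (the can_move mapping is assumed to be supplied by whatever game implementation you are driving):

import random

def pick_direction(can_move):
    # can_move maps "left"/"right"/"up"/"down" to True/False
    if not any(can_move[d] for d in ("right", "up", "down")):
        return "left"  # left only as a last resort
    while True:
        direction = random.choice(("right", "down", "up"))
        if can_move[direction]:
            return direction

# e.g. pick_direction({"left": True, "right": False, "up": True, "down": True})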
API-Beast
  • 723
  • 5
  • 10
  • 10
    it works better if you say `random from (right, right, right, down, down, up)` so not all moves are of equal probability. :) – Daren Mar 20 '14 at 15:48
  • 3
    Actually, if you are completely new to the game, it really helps to only use 3 keys, basically what this algorithm does. So not as bad as it seems at first sight. – Digits Mar 20 '14 at 18:35
  • 5
    Yes, it is based on my own observation with the game. Until you have to use the 4th direction the game will practically solve itself without any kind of observation. This "AI" should be able to get to 512/1024 without checking the exact value of any block. – API-Beast Mar 20 '14 at 18:44
  • 3
    A proper AI would try to avoid getting to a state where it can only move into one direction at all cost. – API-Beast Mar 20 '14 at 18:47
  • 3
    Using only 3 directions actually is a very decent strategy! It just got me nearly to the 2048 playing the game manually. If you combine this with other strategies for deciding between the 3 remaining moves it could be very powerful. Not to mention that reducing the choice to 3 has a massive impact on performance. – wvdz Apr 04 '14 at 23:32
  • @justhalf Maybe this answer doesn't show the same level of effort as the other top rated answers - there's no analysis, no percentages... just that the user should stick to 3 directions – Anurag Apr 16 '14 at 16:07
  • It is not even compiled under a standard English Language compiler – 0x90 Apr 22 '14 at 23:46
3

Many of the other answers use AI with computationally expensive searching of possible futures, heuristics, learning and the like. These are impressive and probably the correct way forward, but I wish to contribute another idea.

Model the sort of strategy that good players of the game use.

For example:

13 14 15 16
12 11 10  9
 5  6  7  8
 4  3  2  1

Read the squares in the order shown above until the next square's value is greater than the current one. This presents the problem of trying to merge another tile of the same value into this square.
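
A small Python sketch of that reading order and of finding the first problem square (the coordinates and names are my own illustration):

def snake_order(n=4):
    # coordinates in the order of the diagram above: start at the
    # bottom-right corner (square 1) and snake upwards to square 16
    order = []
    for i, r in enumerate(range(n - 1, -1, -1)):
        cols = range(n - 1, -1, -1) if i % 2 == 0 else range(n)
        order.extend((r, c) for c in cols)
    return order

def first_problem(grid):
    # walk the snake until the next square's value is greater than the
    # current one; that square needs a same-valued tile merged into it
    path = snake_order(len(grid))
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if grid[r2][c2] > grid[r1][c1]:
            return (r2, c2)
    return None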

To resolve this problem, there are 2 ways to move that aren't left or (worse) up, and examining both possibilities may immediately reveal more problems. This forms a list of dependencies, each problem requiring another problem to be solved first. I think I have this chain, or in some cases tree, of dependencies internally when deciding my next move, particularly when stuck.


Tile needs merging with neighbour but is too small: Merge another neighbour with this one.

Larger tile in the way: Increase the value of a smaller surrounding tile.

etc...


The whole approach will likely be more complicated than this, but not much more complicated. It could be this mechanical in feel, lacking scores, weights, neurons and deep searches of possibilities. The tree of possibilities rarely even needs to be big enough to need any branching at all.

alan2here
  • 3,223
  • 6
  • 37
  • 62
  • 7
    You're describing a local search with heuristics. That will get you stuck, so you need to plan ahead for the next moves. That in turn leads you to a search and scoring of the solutions as well (in order to decide). So this is really not different than any other presented solution. – runDOSrun Aug 11 '15 at 09:08