Algorithm for solving Flow Free Game

Question

I recently started playing Flow Free Game.

Connect matching colors with pipe to create a flow. Pair all colors, and cover the entire board to solve each puzzle in Flow Free. But watch out, pipes will break if they cross or overlap!

I realized it is just path finding game between given pair of points with conditions that no two paths overlap. I was interested in writing a solution for the game but don't know where to start. I thought of using backtracking but for very large board sizes it will have high time complexity.

Is there any suitable algorithm to solve the game efficiently. Can using heuristics to solve the problem help? Just give me a hint on where to start, I will take it from there.

I observed in most of the boards that usually

For furthest points, you need to follow path along edge.
For point nearest to each other, follow direct path if there is one.

Is this correct observation and can it be used to solve it efficiently?

The key phrase is "most of the boards". Free Flow is a visual modernization of a known NP-Hard (if not NP-Complete) game, whose name(s) I can't think of offhand. Therefore, it is not known nor expected to have efficient solutions in all cases, which is easier to see if you try to solve the bigger ones. — Nuclearman, May 13 '14 at 04:07
No, there is just no efficient solution. A computer can solve any instance of Free Flow, but the time it takes on average grows exponentially with the puzzle size. A rough estimate might say that a 10x10 is something like 30x harder to solve than a 5x5 for example. I see this in practice as I can solve the 5x5 puzzles in under a second, but it generally takes quite a bit longer to solve the 10x10. Meanwhile a 20x20 might be about 1000x harder to solve than a 10x10. There are no 20x20s in Free Flow, but I've solved all the puzzles in that game and some have taken a lot of time. — Nuclearman, May 13 '14 at 04:16
@Nuclearman I think it's a restricted case of multicommodity flow, but that doesn't necessarily mean it's NP-hard. — templatetypedef, May 13 '14 at 04:57
@templatetypedef: It took me a while to find it, but [here](http://stackoverflow.com/questions/20196264/numberlink-flow-game-how-to-spot-np-complete-problems) is where I first came across it. A bit of research online seems to confirms it. The game I was thinking of was NumberLink. I recalled it was by a well known Japanese game publisher, [Nikoli](http://www.nikoli.com/en/) and found the name that way. — Nuclearman, May 13 '14 at 05:54
You can try a simulation:http://stackoverflow.com/questions/18511095/logic-to-strategically-place-items-in-a-container-with-minimum-overlapping-conne/18901490#18901490. — Micromega, May 13 '14 at 08:39
@Nuclearman Well, the problem is certainly in NP (see my answer). The question is whether it is NP-hard and thus NP-complete. — ziggystar, May 13 '14 at 09:17
@ziggystar: Searching for "SAT formulation of numberlink" and "NP-completeness of numberlink" leads to a couple references. Unsurprisingly, the two most interesting ones are in Japanese. The [first](http://www.ieice.org/ken/paper/20100312tawu/eng/) is the actual paper proof of NP-completeness. The [second](http://bach.istc.kobe-u.ac.jp/sugar/#sec-4) describes how to solve NumberLink using the SAT solver, Sugar. — Nuclearman, May 14 '14 at 00:17
My program https://github.com/thomasahle/numberlink solves any puzzle in a tenth of a second, if you want to check out the codee. It basically uses a branch-n-bound search with heuristics. The Github page has more information on the exact algorithm. — Thomas Ahle, Nov 15 '19 at 23:54

score 8 · Answer 1 · edited Jun 20 '20 at 09:12

Reduction to SAT

Basic idea

Reduce the problem to SAT
Use a modern SAT solver to solve the problem
Profit

Complexity

The problem is obviously in NP: If you guess a board constellation, it is easy (poly-time) to check whether it solves the problem.

Whether it is NP-hard (meaning as hard as every other problem in NP, e.g. SAT), is not clear. Surely modern SAT solvers will not care and solve large instances in a breeze anyway (I guess up to 100x100).

Literature on Number Link

Here I just copy Nuclearman's comment to the OP:

Searching for "SAT formulation of numberlink" and "NP-completeness of numberlink" leads to a couple references. Unsurprisingly, the two most interesting ones are in Japanese. The first is the actual paper proof of NP-completeness. The second describes how to solve NumberLink using the SAT solver, Sugar. –

Hint for reduction to SAT

There are several possibilities to encode the problem. I'll give one that I could make up quickly.

Remark

j_random_hacker noted that free-standing cycles are not allowed. The following encoding does allow them. This problem makes the SAT encoding a bit less attractive. The simplest method I could think of to forbid free-standing loops would introduce O(n^2) new variables, where n is the number of tiles on the board (count distance from next sink for each tile) unless one uses log encoding for this, which would bring it down to O(n*log n), possible making the problem harder for the solver.

Variables

One variable per tile, piece type and color. Example if some variable X-Y-T-C is true it encodes that the tile at position X/Y is of type T and has color C. You don't need the empty tile type since this cannot happen in a solution.

Set initial variables

Set the variables for the sink/sources and say no other tile can be sink/source.

Constraints

For every position, exactly one color/piece combination is true (cardinality constraint).
For every variable (position, type, color), the four adjacent tiles have to be compatible (if the color matches).

I might have missed something. But it should be easily fixed.

I found it very satisfactory to reduce problems to SAT. If you are working on the JVM, you can just pull in a SAT solver as a Maven dependency (SAT4J). It's really simple and fast. — ziggystar, May 13 '14 at 08:32
Nice solution! I think you could reduce it to an integer linear program (ILP) in the same manner and just use a standard solver. Constraints are of the type: every source/sink has 1 or 3 neighbors of the same color, every other cell has 2 or 4 neighbors of the same color. Open issue: formally prove that a solution that meets these constraints is valid. Can't be too hard by contradiction. — Vincent van der Weele, May 13 '14 at 08:39
@Heuster While reduction to ILP is certainly possible, I would not recommend doing so. This is not an optimization problem, but merely a satisfiability problem. Using a CSP format might be more convenient to use than SAT, though. The difference between SAT and CSP is that CSP allows variables with more values than two. — ziggystar, May 13 '14 at 08:55
@ziggystar Yeah, you're right, it's not really an optimization problem. — Vincent van der Weele, May 13 '14 at 09:07
Assuming that you have 2 piece types, "pipe" (the tiles you place) and "endpoint" (the tiles present at the start), I think the constraints you're supplying won't rule out the possibility of cycles (loops) of "pipe" tiles. You need to add the equivalent of "subtour elimination constraints" for the TSP. — j_random_hacker, May 13 '14 at 11:33

j_random_hacker · Answer 2 · 2014-05-13T15:23:54.940

I suspect that no polynomial-time algorithm is guaranteed to solve every instance of this problem. But since one of the requirements is that every square must be covered by pipe, a similar approach to what both people and computers use for solving Sudoku should work well here:

For every empty square, form a set of possible colours for that square, and then repeatedly perform logical deductions at each square to shrink the allowed set of colours for that square.
Whenever a square's set of possible colours shrinks to size 1, the colour for that square is determined.
If we reach a state where no more logical deductions can be performed and the puzzle is not completely solved yet (i.e. there is at least one square with more than one possible colour), pick one of these undecided squares and recurse on it, trying each of the possible colours in turn. Each try will either lead to a solution, or a contradiction; the latter eliminates that colour as a possibility for that square.

When picking a square to branch on, it's generally a good idea to pick a square with as few allowed colours as possible.

[EDIT: It's important to avoid the possibility of forming invalid "loops" of pipe. One way to do this is by maintaining, for each allowed colour i of each square x, 2 bits of information: whether the square x is connected by a path of definite i-coloured tiles to the first i-coloured endpoint, and the same thing for the second i-coloured endpoint. Then when recursing, don't ever pick a square that has two neighbours with the same bit set (or with neither bit set) for any allowed colour.]

You actually don't need to use any logical deductions at all, but the more and better deductions you use, the faster the program will run as they will (possibly dramatically) reduce the amount of recursion. Some useful deductions include:

If a square is the only possible way to extend the path for some particular colour, then it must be assigned that colour.
If a square has colour i in its set of allowed colours, but it does not have at least 2 neighbouring squares that also have colour i in their sets of allowed colours, then it can't be "reached" by any path of colour i, and colour i can be eliminated as a possibility.

More advanced deductions based on path connectivity might help further -- e.g. if you can determine that every path connecting some pair of connectors must pass through a particular square, you can immediately assign that colour to the square.

This simple approach infers a complete solution without any recursion in your 5x5 example: the squares at (5, 2), (5, 3), (4, 3) and (4, 4) are forced to be orange; (4, 5) is forced to be green; (5, 5) is also forced to be green by virtue of the fact that no other colour could get to this square and then back again; now the orange path ending at (4, 4) has nowhere to go except to complete the orange path at (3, 4). Also (3, 1) is forced to be red; (3, 2) is forced to be yellow, which in turn forces (2, 1) and then (2, 2) to be red, which finally forces the yellow path to finish at (3, 3). The red pipe at (2, 2) forces (1, 2) to be blue, and the red and blue paths wind up being completely determined, "forcing each other" as they go.

If you use a SAT solver, it will do all what is described in this answer, and more (things you don't even think about). — ziggystar, May 13 '14 at 11:14
@ziggystar: It might be hard to tell a SAT solver about the need to avoid loops of pipe. For eliminating subtours from the TSP this is usually handled by generating these constraints on-the-fly if and when they are found to be necessary, since there are an exponential number of them, so specifying them all at the start would be a non-starter. (That uses an ILP solver, but it's a similar situation.) — j_random_hacker, May 13 '14 at 11:37
So, a) are loops possible in the game at all? b) Are they disallowed? This is no shortest path contest. It's *fit in all the pieces*. If you simply may not have loops/crossings then don't allow the corresponding pieces. — ziggystar, May 13 '14 at 11:41
@ziggystar I like SAT solvers too, but j_random_hacker wants to propagate a more complicated constraint than can be readily expressed with the solvers that I've used. — David Eisenstat, May 13 '14 at 13:33
@DavidEisenstat I agree that you can potentially do more propagation than what SAT solvers do (people found out that anything more complicated than unit-propagation does not pay), and in particular you can do this in a domain-dependent way as suggested. But it's not simple, and possibly not faster. I think what I basically want to say is this: Reduction to SAT is easy and the first thing I would try. In addition it is unclear whether you can implement something faster in a reasonable amount of time. But if you have fun programming things, of course go on. — ziggystar, May 13 '14 at 13:40
@ziggystar: In the online version of the game I found, you can't create free-standing loops. Not sure what you mean by "then don't allow the corresponding pieces" -- how would you do that? — j_random_hacker, May 13 '14 at 15:25
Ah, I was missing the *free-standing*. Indeed, this makes the SAT encoding *much more* complicated. I feel the urge to implement it. — ziggystar, May 13 '14 at 16:04

score 4 · Answer 3 · answered Dec 04 '16 at 02:30

I found a blog post on Needlessly Complex that completely explains how to use SAT to solve this problem.

The code is open-source as well, so you can look at it (and understand it) in action.

I'll provide a quote from it here that describes the rules you need to implement in SAT:

Every cell is assigned a single color.
The color of every endpoint cell is known and specified.
Every endpoint cell has exactly one neighbor which matches its color.
The flow through every non-endpoint cell matches exactly one of the six direction types.
The neighbors of a cell specified by its direction type must match its color.
The neighbors of a cell not specified by its direction type must not match its color.

Thank you @Matt Zucker for creating this!

Wayback Machine link in case the old one goes bad: https://web.archive.org/web/20180607162803/https://mzucker.github.io/2016/09/02/eating-sat-flavored-crow.html — Pro Q, Jun 07 '18 at 16:29

score 0 · Answer 4 · answered Jan 14 '16 at 23:13

I like solutions that are similar to human thinking. You can (pretty easily) get the answer of a Sudoku by brute force, but it's more useful to get a path you could have followed to solve the puzzle.

I observed in most of the boards that usually 1.For furthest points, you need to follow path along edge. 2.For point nearest to each other, follow direct path if there is one. Is this correct observation and can it be used to solve it efficiently?

These are true "most of the times", but not always.

I would replace your first rule by this one : if both sinks are along edge, you need to follow path along edge. (You could build a counter-example, but it's true most of the times). After you make a path along the edge, the blocks along the edge should be considered part of the edge, so your algorithm will try to follow the new edge made by the previous pipe. I hope this sentence makes sense...

Of course, before using those "most of the times" rules, you need to follow absolutes rules (see the two deductions from j_random_hacker's post).

Another thing is to try to eliminate boards that can't lead to a solution. Let's call an unfinished pipe (one that starts from a sink but does not yet reach the other sink) a snake, and the last square of the unfinished pipe will be called the snake's head. If you can't find a path of blank squares between the two heads of the same color, it means your board can't lead to a solution and should be discarded (or you need to backtrack, depending of your implementation).

The free flow game (and other similar games) accept as a valid solution a board where there are two lines of the same color side-by-side, but I believe that there always exists a solution without side-by-side lines. That would mean that any square that is not a sink would have exactly two neighbors of the same color, and sinks would have exactly one. If the rule happens to be always true (I believe it is, but can't prove it), that would be an additional constraint to decrease your number of possibilities. I solved some of Free Flow's puzzles using side-by-side lines, but most of the times I found another solution without them. I haven't seen side-by-side lines on Free Flow's solutions web site.

alan2here · Answer 5 · 2018-04-04T15:09:51.730

A few rules that lead to a sort of algorithm to solve levels in flow, based on the IOS vertions by Big Duck Games, this company seems to produce the canonical versions. The rest of this answer assumes no walls, bridges or warps.

Even if your uncannily good, the huge 15x18 square boards are a good example of how just going at it in ways that seem likely get you stuck just before the end over and over again and practically having to start again from scratch. This is probably to do with the already mentioned exponential time complexity in the general case. But this doesn’t mean that a simple stratergy isn’t overwhelmingly effective for most boards.

Blocks are never left empty, therefore orphaned blocks mean you’ve done something wrong.
Cardinally neighbouring cells of the same colour must be connected. This rules out 2x2 blocks of the same colour and on the hexagonal grid triangles of 3 neighbouring cells.
You can often make perminent progress by establishing that a color goes or is excluded from a certain square.
Due to points 1 and 2, on the hexagonal grid on boards that are hexagonal in shape a pipe going along an edge is usually stuck going along it all the way round to the exit, effectively moving the outer edge in and making the board smaller so the process can be repeated. It is predictable what sorts of neighbouring conditions guarantee and what sorts can break this cycle for both sorts of grid.

Most if not all 3rd party variants I’ve found lack 1 to 4, but given these restraints generating valid boards may be a difficult task.

Answer:

Point 3 suggests a value be stored for each cell that is able to be either a colour, or a set of false/indeterminate values there being one for each colour.

A solver could repeatedly use points 1 and 2 along with the data stored for point 3 on small neighbourhoods of paths around the ends of pipes to increasingly set colours or set the indeterminate values to false.

score 0 · Answer 6 · answered Oct 22 '20 at 20:37

A few of us have spent quite a bit of time thinking about this. I summarised our work in a Medium article here: https://towardsdatascience.com/deep-learning-vs-puzzle-games-e996feb76162

Spoiler: so far, good old SAT seems to beat fancy AI algorithms!