Traversing a graph breadth first, marking visited nodes in Haskell

Question

So the question is simple. Given a graph (I hope that the structure of the graph doesn't matter much in this problem), how do I go about doing a BFS on it?

I've recently asked a question about generating a list where each element appends many elements to the end of it. The answer should hopefully let me make a queue I need to do the BFS. But there's another key component that a search needs and it's marking the nodes as visited so we don't go over them again. This also needs to have no overhead on the execution of the algorithm. Neither marking or reading.

Since Haskell doesn't let me change state, how would I go about doing that?

(I'm not looking for a way to translate my imperative code to Haskell. An idiomatic Haskell solution would be great.)

A remember answering a [related question](http://programmers.stackexchange.com/questions/213441/functional-programming-and-stateful-algorithms/213494) about doing a DFS. If you want to be purely functional, the basic idea is that you store the state in a map that is separate from the graph itself and all your functions receive this map as an extra parameter and return the modified version of the map as an extra return value. — hugomg, Jan 12 '14 at 13:06
That sound extremely inefficient. Does the compiler make it work by forgetting the old versions? Also, creating a new map and copying over the old one every step? Sounds very slow. — Luka Horvat, Jan 12 '14 at 13:11
You only copy the parts of the map that change so updates are O(log n) instead of O(N). That said, if you care about that, you can program everything inside the ST or IO monads and use mutable datastructures so your algorithm is totally equivalent to how you would write it in an imperative language. — hugomg, Jan 12 '14 at 13:15

score 7 · Accepted Answer · edited May 23 '17 at 10:29

As mentioned in the comments, a proper approach is to separate your graph from marking its nodes. You need to use some kind of a set container. There are basically two approaches you can take:

Use a functional set. Although at every modification a new version is created, functional data structures are designed so that only a very small part actually changes and most of the original one is shared. Insert/lookup operations take O(log n) for such structures. But HashSet from unordered-containers is optimized for speed and the base of the logarithm is large, so for most purposes the factor is like constant.
Use the ST monad (see also this question). Inside the monad you can use mutable data structures, while the result will be still referentially transparent. Then you can use for example ST-based hash tables from the hashtables package. This allows you to get rid of the log n factor and do operations on hash tables in constant time. But there are other drawbacks, for example that your traversal must be pure. If works within some other monad, you won't be able to interleave it with ST operations.

Some other primitive monad, yes? You can use ST in a transformer stack, but only at the bottom. If IO was supposed to go there then you're stuck. — J. Abrahamson, Jan 12 '14 at 16:52
"base of the logarithm is large, so for most purposes the factor is like constant" Err, no, this is not how O(log n) works. — alternative, Nov 04 '14 at 23:05
@alternative it does. If factor is say *16*, then you need only 8 levels to represent *2 ^ 32* elements (you probably don't want to have more in memory). `log (2 ** 32) / log 16 = 8.0`. Which can be treated **like** constant. — phadej, Jul 17 '15 at 06:27

score 6 · Answer 2 · answered Jan 12 '14 at 18:45

The paper Inductive Graphs and Functional Graph Algorithms discusses such problems and is quite readable. It is the basis of the fgl package, which has an entire module for BFS-like queries. I highly recommend the paper -- it was an eye-opening read for me (namely, the underlying core idea of "give an inductive interface even if your data isn't inductive" was an awesome one) -- and further recommend that if you can use the fgl package you do. But if you can't, the paper will tell you enough to write an algorithm for your custom data type, I'm sure.

Yogesh Sajanikar · Answer 3 · 2014-01-14T02:00:00.420

0

I think, the simplest way to look at this is to look at any level in the tree graph.

At any level, there are n nodes.
The BF traversal at this level will traverse these n nodes, and
BF traversal of children of these n nodes.

This is now easy to translate, (this code flattens the graph breadth first).

toBFSList :: [Graph a] -> [Graph a]
toBFSList [] = [] -- trivial case
toBFSList gs = ns ++ fs 
    where ns = map extract gs -- "extract" may extract a single node
          cs = concat $ map children gs   -- get all the child nodes for all above node
          fs = toBFSList cs   -- for all children, do traversal

edited Jan 14 '14 at 02:00

answered Jan 13 '14 at 09:50

Yogesh Sajanikar

1,086
7
19

1

If your graph has any cycles, this will go on forever. – Luka Horvat Jan 13 '14 at 19:09
@LukaHorvat That is correct. I edited the answer to change graph to tree. – Yogesh Sajanikar Jan 14 '14 at 01:58

Traversing a graph breadth first, marking visited nodes in Haskell

3 Answers3

Linked