Finding the longest path through a DFS Forest in Haskell

Question

I have a graph that I'm trying to find the longest path through but I'm unsure how to go about it. I'm using the standard Graph and Edge types from Data.Graph, and the graph was generated using the buildG function. My initial thought was to use dfs to perform a depth-first search, but this yielded a Forest object that I'm unsure how to manipulate.

If it at all helps (and I'm sorry I can't pretty print this, showForest only prints strings apparently), this is the output of the dfs command on my graph:

[Node {rootLabel = 87, subForest = [Node {rootLabel = 82, subForest = [Node {rootLabel = 70, subForest = []},Node {rootLabel = 83, subForest = [Node {rootLabel = 66, subForest = [Node {rootLabel = 72, subForest = [Node {rootLabel = 88, subForest = []}]}]},Node {rootLabel = 79, subForest = [Node {rootLabel = 69, subForest = [Node {rootLabel = 85, subForest = []},Node {rootLabel = 84, subForest = [Node {rootLabel = 86, subForest = []},Node {rootLabel = 73, subForest = [Node {rootLabel = 81, subForest = []}]}]}]}]}]},Node {rootLabel = 89, subForest = []}]}]}]

I've found a few different functions for finding longest paths through Trees, such as in this answer, but they only work for Trees, not Forests, and I'm unsure if it's even possible to convert between the two.

Thanks!

Can you elaborate on _longest path graph_? Do you mean _2 vertices that have longest distance among all vertex pairs_? — Shersh, Nov 17 '16 at 09:43
@Shersh Sort of; starting from the root, I need the vertex that has the longest distance from the root, and the route of nodes needed to get to it. — Callum Booth, Nov 17 '16 at 14:17
"and I'm sorry I can't pretty print this, `showForest` only prints Strings apparently" -- If you mean [`drawForest`](https://hackage.haskell.org/package/containers-0.5.8.1/docs/Data-Tree.html#v:drawForest), you can use `drawForest . fmap (fmap show)` — duplode, Nov 17 '16 at 14:22
"but they only work for Trees, not Forests, and I'm unsure if it's even possible to convert between the two." -- A [`Forest`](https://hackage.haskell.org/package/containers-0.5.8.1/docs/Data-Tree.html#t:Forest) is just a list of `Tree`s, so if you have a function that works with `Tree`s you can, at a minimum, simply `fmap` it over the `Forest`. — duplode, Nov 17 '16 at 22:34

duplode · Accepted Answer · 2016-11-17T23:40:56.367

As Shersh explains in their answer, depth-first search is not enough to solve your problem (if it was, you could use the generalFold from the answer you linked to to, for instance, reconstruct the longest path in each tree of the forest). One alternative is switching from Data.Graph to fgl, which offers a wide assortment of graph algorithms, including breadth-first search. Note that The Haddock documentation of FGL is rather laconic, and so you will also want to consult the helpful User Guide available though the package homepage. In the demo below, I will use the bft function to get a breadth-first spanning tree of a graph, which should be enough to get you started:

GHCi> import Data.Graph.Inductive.Graph
GHCi> import Data.Graph.Inductive.PatriciaTree 
GHCi> import Data.Graph.Inductive.Query.BFS
GHCi> :{
GHCi| test :: UGr
GHCi| test = mkUGraph [1..7] [(1,3),(1,2),(2,4),(3,5),(2,6),(5,2),(5,7)]
GHCi| :}
GHCi> prettyPrint test
1:()->[((),2),((),3)]
2:()->[((),4),((),6)]
3:()->[((),5)]
4:()->[]
5:()->[((),2),((),7)]
6:()->[]
7:()->[]
GHCi> bft 1 test
[[1],[2,1],[3,1],[4,2,1],[6,2,1],[5,3,1],[7,5,3,1]]

The spanning tree is essentially a list of paths from a node to all reachable nodes. If you, for instance, just want to find one of the paths with maximum length and don't care about draws, all you need is:

GHCi> import Data.Ord
GHCi> import Data.List
GHCi> maximumBy (comparing length) (bft 1 test)
[7,5,3,1]

If you do care about draws, you will need to do some juggling with things like groupBy, but it won't be fundamentally more difficult.

score 2 · Answer 2 · edited May 23 '17 at 10:32

I will first explain what dfs does. As you mentioned in your comments, you use buildG function so I will also use it in my answer.

dfs takes 2 arguments: graph and list of root nodes. And then dfs runs depth-first search starting from each vertex number in list, returning Tree of reachable vertices from those nodes in list. For example:

You have graph of 1 vertex and you want find paths from 1. You will then have list with only one Tree where root label is 1 and subforest is empty.
You have graph of 2 vertices and edge from 1 to 2 and you want find paths from 1. You will then have list with only one Tree where root label is 1 and subforest is list with 2 as root label.
You have graph of 3 vertices and edges (1,2) and (1,3) and you want find paths from 1. You will then have list with only one Tree where root label is 1 and subforest is list with two trees with 2 and 3 as root labels correspondingly.

All these examples (and even some more) are demonstrated in next ghci session:

λ: let g = buildG (1,1) []
λ: dfs g [1]
[Node {rootLabel = 1, subForest = []}]
λ: dfs g [2]
*** Exception: Ix{Int}.index: Index (2) out of range ((1,1))
λ: let g = buildG (1,2) [(1,2)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 2, subForest = []}]}]
λ: dfs g [2]
[Node {rootLabel = 2, subForest = []}]
λ: let g = buildG (1,3) [(1,2), (1,3)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 3, subForest = []},Node {rootLabel = 2, subForest = []}]}]

But, unfortunately, dfs can't help you with your task: finding furthest vertex from given one. In the way algorithm works it is just not capable to solve such problems. Suppose you have graph with 4 vertices and edges:

1 → 2, 2 → 3, 3 → 4, 1 → 4

Vertex that has longest path from 1 is 3 (this path has 2 edges). But dfs will return you different results depending on order of edges you specified.

λ: let g = buildG (1,4) [(1,2), (1,4), (2,3), (3,4)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 4, subForest = []},Node {rootLabel = 2, subForest = [Node {rootLabel = 3, subForest = []}]}]}]
λ: let g = buildG (1,4) [(1,4), (1,2), (2,3), (3,4)]
λ: dfs g [1]
[Node {rootLabel = 1, subForest = [Node {rootLabel = 2, subForest = [Node {rootLabel = 3, subForest = [Node {rootLabel = 4, subForest = []}]}]}]}]

It is not the problem of particular implementation. It is just how dfs works. And you can't use dfs in general case to solve your particular problem. Also, this problem can't be fixed with just rearranging edges in list because you don't know how to rearrange them in general.

What you really need to use is bfs algorithm — breadth-first search. Unfortunately, it is not implemented in Data.Graph library. So you need to implement whole algo from zero or find implementation somewhere (here some dicussions: Traversing a graph breadth first, marking visited nodes in Haskell).

Thanks for this! I kinda made the naive (and false) assumption that because I was working with the depth of branches, I'd need depth-first search not breadth-first. I have used a breadth-first algorithm now (the bft in duplode's answer) and it's working. — Callum Booth, Nov 18 '16 at 02:27

score 0 · Answer 3 · answered Jan 12 '18 at 04:25

It should be noted that longest path in the graph is not the same as longest path in a spanning tree. Graphs can have many spanning trees and these trees do not have to have the same height. The btf code presented by duplode finds correctly the longest path in a spanning tree.

For example, the graph in duplode answer mkUGraph [1..7] [(1,3),(1,2),(2,4),(3,5),(2,6),(5,2),(5,7)] has a longer path then the one suggested [1,3,5,7], that path is [1,3,5,2,6].

An even simpler example of mkUGraph [1,2,3] [(1,2),(1,3),(2,3)] using FGLs btf 1 will yield spanning tree [[1], [3,1], [2,1]] and the longest path is [1,2,3] (and by itself is a spanning tree).

In general, finding longest path is not trivial (NP-hard actually) but it can be done for DAG using a rather straightforward fold of topologically sorted nodes.

See https://github.com/rpeszek/graph-exercise

Finding the longest path through a DFS Forest in Haskell

3 Answers3