Finding a closed path from list of start and end nodes

Question

I have a list of edges (E) of a graph with nodes V = [1,2,3,4,5,6]:

E = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]

where each tuple (a,b) refers to the start & end node of the edge respectively.

If I know the edges form a closed path in graph G, can I recover the path?

Note that E is not the set of all edges of the graph. Its just a set of edges.

In this example, the path would be 1->2->3->1->5->6->1

A naive approach, I can think of is using a tree where I start with a node, say 1, then I look at all tuples that start with 1, here, (1,2) and (1,5). Then I have two branches, and with nodes as 2 & 5, I continue the process till I end at the starting node at a branch.

How to code this efficiently in python?

Are you looking for a cycle in the graph? Also, do these represent directed or undirected edges? — BrokenBenchmark, Jan 23 '22 at 16:10
@BrokenBenchmark Yes, The path would be a cycle. I am not "looking" for cycles in a graph. I just have the set E & I want to extract the path from it. Note that E is NOT the set of all edges of the graph. The graph is directed. — dev, Jan 23 '22 at 16:12
Could the list represent 1->5->6->1->2->3->1? Why or why not? Do you just need one possible closed path? — BrokenBenchmark, Jan 23 '22 at 16:14
@BrokenBenchmark Yes. Note that these are necessarily the same cycle. Any one suffices. — dev, Jan 23 '22 at 16:15
I don't know for sure. I just see that module recently and not play with it. — David, Jan 23 '22 at 16:32

BrokenBenchmark · Answer 1 · 2022-01-23T16:47:59.317

0

You're looking for a directed Eulerian circuit in your (sub)graph. An Eulerian circuit is a trail that visits every edge exactly once.

The networkx package has a function that can generate the desired circuit for you in linear time:

import networkx as nx

edges = [(1,2), (1,5), (2,3), (3,1), (5,6), (6,1)]
G = nx.MultiDiGraph()
G.add_edges_from(edges)

# Prints [(1, 5), (5, 6), (6, 1), (1, 2), (2, 3), (3, 1)]
# which matches the desired output (as asked in the comments).
print([edge for edge in nx.algorithms.euler.eulerian_circuit(G)])

The documentation cites a 1973 paper, if you're interested in understanding how the algorithm works. You can also take a look at the source code here. Note that we're working with multigraphs here, since you can have multiple edges that have the same source and destination node. There are probably other implementations floating around on the Internet, but they may or may not work for multigraphs.

edited Jan 23 '22 at 16:47

answered Jan 23 '22 at 16:42

BrokenBenchmark

18,126
7
21
33

This works only if the (sub)graph is Eulerian. Consider E = [(5,1),(1,5),(5,2),(2,5),(5.1),(1,4),(4,5)] . Here (5,1) is repeated. – dev Jan 23 '22 at 17:32
You said that you knew the edges formed a closed path (or really, a closed walk). If we treat repeated edges as separate (i.e. making the graph a multigraph), then this isn't an issue. – BrokenBenchmark Jan 23 '22 at 17:36
What exactly do you mean by separate? The example I gave above is indeed closed. How do I deal with that? – dev Jan 23 '22 at 17:49
In your example, (5, 1) appears twice. We construct a graph that contains two edges whose source is node 5 and whose destination is node 1. This is why the code snippet uses a multigraph. – BrokenBenchmark Jan 23 '22 at 17:52

score 0 · Accepted Answer · answered Jan 23 '22 at 18:13

The networkx package has a function that can generate the desired circuit for you in linear time...

It is possible, that construction of nx.MultiDiGraph() is slower and not such efficient, as desired in question, or usage of external packages for only one function is rather excessive. If it is so, there is another way.

Plan: firstly we will find some way from start_node to start_node, then we will insert all loops, that were not visited yet.

from itertools import chain
from collections import defaultdict, deque
from typing import Tuple, List, Iterable, Iterator, DefaultDict, Deque


def retrieve_closed_path(arcs: List[Tuple[int, int]], start_node: int = 1) -> Iterator[int]:
    if not arcs:
        return iter([])

    # for each node `u` carries queue of its
    # neighbours to be visited from node `u`
    d: DefaultDict[int, Deque[int]] = defaultdict(deque)
    for u, v in arcs:
        # deque pop and append complexity is O(1)
        d[u].append(v)

    def _dfs(node) -> Iterator[int]:
        out: Iterator[int] = iter([])
        # guarantee, that all queues
        # will be emptied at the end
        while d[node]:
            # chain returns an iterator and helps to
            # avoid unnecessary memory reallocations
            out = chain([node], _dfs(d[node].pop()), out)
            # if we return in this loop from recursive call, then
            # `out` already carries some (node, ...) and we need
            # only to insert all other loops which start at `node`
        return out

    return chain(_dfs(start_node), [start_node])


def path_to_string(path: Iterable[int]) -> str:
    return '->'.join(str(x) for x in path)

Examples:

    E = [(1, 2), (2, 1)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string(p))
    >> 1->2->1

    E = [(1, 2), (1, 5), (2, 3), (3, 1), (5, 6), (6, 1)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string(p))

    >> 1->5->6->1->2->3->1

    E = [(1, 2), (2, 3), (3, 4), (4, 2), (2, 1)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string(p))

    >> 1->2->3->4->2->1


    E = [(5, 1), (1, 5), (5, 2), (2, 5), (5, 1), (1, 4), (4, 5)]
    p = retrieve_closed_path(E, 1)
    print(path_to_string())
    >> 1->4->5->1->5->2->5->1

Finding a closed path from list of start and end nodes

2 Answers2