4

Is there a fast off the shelf implementation of maximum cardinality bipartite matching in C or Python?

I tried networkx, but it is very slow. I have a two-layer graph with about 1000 nodes in each layer. The density varies. What is the time I can expect for this setting?

I see this post Fast max-flow min-cut library for Python, but is there anything faster?

fuglede
  • 17,388
  • 2
  • 54
  • 99
ivan
  • 311
  • 1
  • 4
  • 13

2 Answers2

4

SciPy, as of version 1.4.0, contains an implementation of Hopcroft--Karp in scipy.sparse.csgraph.maximum_bipartite_matching that compares favorably to NetworkX, performance-wise. The function exists in previous versions as well but then assumes a perfect matching to; this assumption is lifted in 1.4.0.

Exactly how well it does will depend on the structure of the bipartite graph, but just by taking random graphs (and ignoring the time it will take NetworkX to initialize the underlying data structures), I get around 200x performance improvements:

import networkx as nx
from scipy.sparse import rand
from scipy.sparse.csgraph import maximum_bipartite_matching


n = 5000
graph = rand(n, n, density=.1, format='csr', random_state=42)
G = nx.algorithms.bipartite.from_biadjacency_matrix(graph)
>>> %timeit maximum_bipartite_matching(graph, perm_type='column')
8.95 ms ± 183 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> %timeit nx.algorithms.bipartite.maximum_matching(G, top_nodes=range(n))
2.01 s ± 118 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
fuglede
  • 17,388
  • 2
  • 54
  • 99
1

Well, if you intend to use the network flow approach, all the algorithms available seem to have a factor of at least O(|V||E|) in their time complexity, or even more (e.g. O(|V|^2|E|) ) in most cases.

If you have a graph with 2000 nodes, even if the number of edges |E| is linear to the number of vertices, |V|, an algorithm with the time complexity O(|V|^2|E|) would result in an execution time as long as a couple of minutes in an average everyday computer. If the graph is dense, and |E| is linear to |V|^2, then it would possibly take days to execute.

An alternative algorithm to solve this bipartite maximal matching problem may be Hopcroft-Karp algorithm. It starts by having an empty set M for a bipartite matching, and tries expanding M by finding augmenting paths in the given graph. The algorithm has O(|E|√|V|) complexity, which is better than the network-flow flavored algorithms like Push Relabel or Edmonds-Karp.

Also, there already is a Python library implementing Hopcroft-Karp algorithm, which I believe was another thing you were looking for.

ilim
  • 4,477
  • 7
  • 27
  • 46
  • 1
    NetworkX also has an implementation of Hopcroft--Karp in [`networkx.algorithms.bipartite.matching.hopcroft_karp_matching`](https://networkx.github.io/documentation/stable/reference/algorithms/generated/networkx.algorithms.bipartite.matching.hopcroft_karp_matching.html) (and has had that one since 2015). – fuglede Nov 09 '19 at 20:29