2

TL;DR I need to find all possible ways to traverse an undirected cyclic, possibly disconnected graph to find all sets of connected nodes which satisfy certain criteria.

There's a graph of nodes, where each node is assigned a value and edges represent compatible nodes.

Compatible nodes can be put together into a set, but only if each of the nodes is compatible with all of the others. For example given the following graph, represented as adjecency list:

0: 2, 3
1:
2: 0, 4, 6, 7
3: 0, 4, 6, 7
4: 2, 3, 5, 6
5: 4, 7
6: 2, 3, 4, 6
7: 2, 3, 5, 6

Nodes {4, 5} can form a set together as can nodes {3, 4, 6}, while node 1 can only form a single element set.

Now given that each node holds a value, find n sets with the highest combined value. For the example above if the nodes have following values:

0: 4
1: 6
2: 4
3: 5
4: 6
5: 3
6: 3
7: 5

The solution for n = 2 are sets {3, 4, 6} and {2, 7} with combined value of 23.

The approach I started is to use a modified version of DFS where the stack is only populated with the nodes which are compatible with all the nodes in the current set so far. If there are no such nodes, we create a new set and continue with the nodes which have not yet been visited.

Pseudocode:

// As long as we have any unvisited nodes
while(visited.contains(false))
{
    // Take first unvisited node and push it onto a stack
    stack.push(find_first(visited, false));

    // Modified DFS
    while (!stack.empty())
    {
        Node n = stack.pop();

        // Add the node to the set (this will always be either first or compatible node)
        temp_set.add(n);

        // Find all compatible nodes by finding intersection of the adjecency list of the nodes
        // already in the set
        compatibleNodes = find_all_compatible_nodes(set);

        // Add the first compatible, unvisited node we find to the stack
        for(c : compatibleNodes)
        {
            if(!visited[c])
            {
                stack.push(c);
                visited[c] = true;

                break;                
            }
        }

    }

    // Once we run out of compatible nodes for this set, we add it to candidate solution and start with a new one
    candidateSolution.add(temp_set);
    temp_set.clear();
}

// Once we've visited all the nodes, we have a candidate solution where two sets with the highest score are the possible answer

Now this works for a single possible solution, but fails to explore different traversal paths. Basically it builds a tree where each branch represents a set and where all graph nodes are included. I need to find all possible trees which can be built in order to find the one with the highest score.

I think I understand what I need to do next. Starting with the last solution, traverse back to the point where an alternative path could've been taken and build a new solution from there. Then keep doing it until I return to the first node and there are no more nodes to visit. However I can't wrap my head around implementation. Any help appreciated.

jaho
  • 4,852
  • 6
  • 40
  • 66
  • Possible duplicate of [Find all complete sub-graphs within a graph](https://stackoverflow.com/questions/2801138/find-all-complete-sub-graphs-within-a-graph) –  Jan 12 '19 at 23:12
  • 3
    The first part of the problem is the [clique problem](https://en.wikipedia.org/wiki/Clique_problem). The second part is either simply the `n` sets with the largest value or - let `G` be the graph where each of the sets found in the first step is a node and an edge exists between nodes, if they have common elements - the connected component with `n` node from the complement of `G` with the maximal sum of node-values if the sets need to be disjointed. The first part is NP-complete, the second is most likely as well, if sets need to be disjointed. –  Jan 12 '19 at 23:39
  • Thanks @Paul. This is the path I'm currently exploring, looking at Bron–Kerbosch algorithm. Do you think it's a safe assumption to make that the set of highest value cliques is always going to contain the highest value clique in the graph? This would simplify the second part, as I could remove the nodes used in the largest clique from the graph and continue searching until n cliques are found. It makes sense in my head but I can't think of a way of proving it. – jaho Jan 13 '19 at 03:07
  • 1
    Is this example relevant? Graph: `1 <-> 2 <-> 3 <-> 4` (values: [1, 100, 100, 1]). Set of cliques with highest value: `{{1,2}, {3,4}}`. Highest value clique: `{2,3}`. – גלעד ברקן Jan 13 '19 at 03:32
  • @jaho unfortunately not (depends on your input, but at least in general it won't). Consider e.g.: `G = ({100, 99, 2}, {(100, 99), (100, 2)})`. Then the highest value clique in the complement of `G` would be `{99, 2}`, whereas the node with the maximal node is 100. –  Jan 13 '19 at 07:50
  • @jaho see https://cs.stackexchange.com/questions/92583/finding-independent-set-of-size-n-in-an-undirected-graph for some information on the second part of your problem. Regarding the Bron-Kerbosch algorithm note that it lists maximal cliques, whereas you'll need **all** cliques, which will further increase the number of candidates for the second part. –  Jan 13 '19 at 07:59
  • Note that any combination of complete subgraphs totaling at least `n` nodes during a search need not be split into `n` sets since we know a priori that can be done arbitrarily. – גלעד ברקן Jan 19 '19 at 02:29

0 Answers0