Python NumPy vectorization

Question

I'm trying to code what is known as the List Right Heuristic for the unweighted vertex cover problem. The background is as follows:

Vertex Cover Problem: In the vertex cover problem, we are given an undirected graph G = (V, E) where V is the set of vertices and E is the set of Edges. We need to find the smallest set V' which is a subset of V such that V' covers G. A set V' is said to cover a graph G if all the edges in the graph have at least one vertex in V'.

List Right Heuristic: The algorithm is very simple. Given a list of vertices V = [v1, v2, ... vn] where n is the number of vertices in G, vi is said to be a right neighbor of vj if i > j and vi and vj are connected by an edge in the graph G. We initiate a cover C = {} (empty set) and scan V from right to left. At any point, say the current vertex being scanned is u. If u has at least one right neighbor not in C then u is added to c. The entire V is just scanned once.

I'm solving this for multiple graphs (with same vertices but different edges) at once.

I coded the List Right Heuristic in python. I was able to vectorize it to solve multiple graphs at once, but I was unable to vectorize the original for loop. I'm representing the graph using an Adjacency matrix. I was wondering if it can be further vectorized. Here's my code:

def list_right_heuristic(population: np.ndarray, adj_matrix: np.ndarray):
    adj_matrices = np.matlib.repmat(adj_matrix,population.shape[0], 1).reshape((population.shape[0], *adj_matrix.shape))

    for i in range(population.shape[0]):
        # Remove covered vertices from the graph. Delete corresponding edges
        adj_matrices[i, np.outer(population[i], population[i]).astype(bool)] = 0

    vertex_covers = np.zeros(shape=population.shape, dtype=population.dtype)
    for index in range(population.shape[-1] - 1, -1, -1):
        # Get num of intersecting elements (for each row) in right neighbors and vertex_covers
        inclusion_rows = np.sum(((1 - vertex_covers) * adj_matrices[..., index])[..., index + 1:], axis=-1).astype(bool)
        # Only add vertices to cover for rows which have at least one right neighbor not in vertex cover
        vertex_covers[inclusion_rows, index] = 1

    return vertex_covers

I have p graphs that I'm trying to solve simultaneously, where p=population.shape[0]. Each graph has the same vertices but different edges. The population array is a 2D array where each row indicates vertices of the graph G that are already in the cover. I'm only trying to find the vertices which are not in the cover. So for this reason, setting all rows and columns of vertices in cover to 0, i.e., I'm deleting the corresponding edges. The heuristic should theoretically only return vertices not in the cover now. So in the first for loop, I just set the corresponding rows and columns in the adjacency matrix to 0 ( all elements in the rows and columns will be zero). Next I'm going through the 2D array of vertices from right to left and finding number of right neighbors in each row not in vertex_covers. For this I'm first finding the vertices not in cover (1 - vertex_covers) and then multiplying that with corresponding columns in adj_matrices (or rows since adj matrix is symmetric) to get neighbors of that that vertex we're scanning. Then I'm summing all elements to the right of this. If this value is greater than 0 then there's at least one right neighbor not in vertex_covers. Am I doing this correctly for one? And is there any way to vectorize the second for loop ( or the first for that matter) or speed up the code in general? calling this function thousands of times in some other code for large graphs (with 1000+ vertices). Any help would be appreciated.

Shapes of `population` and `adj_matrix` are the same, right? Also, both of them would be square shaped arrays? — Divakar, Jan 07 '16 at 15:02
adj matrix is a square matrix. population need not be. For example, consider two different vertex covers encoded for the graph with the Adjacency matrix of [[0,0,1,0],[0,0,0,1],[1,0,0,1],[0,1,1,0]]. Suppose cover 1 has 2nd and 4th vertex in it, and cover 2 has 2nd and 3rd. Then population would be [[0,1,0,1], [0,1,1,0]]. Although it's easy to make population a square matrix by just adding in more rows with random covers. — Aditya369, Jan 07 '16 at 15:07

score 0 · Answer 1 · edited May 23 '17 at 12:15

You can use np.einsum to perform many complex operations between indices. In your case, the first loop can be performed this way:

adj_matrices[np.einsum('ij, ik->ijk', population, population).astype(bool)] = 0

It took me some time to understand how einsum works. I found this SO question very helpful.

BTW, Your code gave me the following syntax error:

SyntaxError: can use starred expression only as assignment target

and I had to re-write the first line of the function as:

adj_matrices = np.matlib.repmat(adj_matrix,population.shape[0], 
        1).reshape((population.shape[0],) + adj_matrix.shape)

Python NumPy vectorization

1 Answers1