I'm working on graph analysis and want to compute an N-by-N similarity matrix that contains the Adamic-Adar similarity between every pair of vertices. To give an overview of Adamic-Adar, let me start with this introduction:
Given the adjacency matrix $A$ of an undirected graph $G$. $CN$ is the set of all common neighbors of two vertices $x$, $y$. A common neighbor of two vertices is one where both vertices have an edge/link to, i.e. both vertices will have a 1 for the corresponding common neighbor node in $A$. $k_n$ is the degree of node $n$.
Adamic-Adar is defined as the following:

$$AA(x, y) = \sum_{n \in CN(x, y)} \frac{1}{\log k_n}$$
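To make the definition concrete, here is a minimal sketch for a single pair of vertices. It assumes a dense 0/1 NumPy adjacency matrix with no self-loops, and it uses a base-10 logarithm to match the code further down. The names `adamic_adar_pair` and `A_toy` are made up for this illustration.

```python
import numpy as np

def adamic_adar_pair(A, x, y):
    """Adamic-Adar score of two distinct vertices x and y.

    A is assumed to be a dense, symmetric 0/1 adjacency matrix.
    """
    A = np.asarray(A)
    degrees = A.sum(axis=1)  # k_n for every node n
    # Common neighbors: nodes that have a 1 in both row x and row y
    common = np.flatnonzero((A[x] == 1) & (A[y] == 1))
    # Sum 1 / log(k_n) over the common neighbors (0.0 if there are none)
    return float(np.sum(1.0 / np.log10(degrees[common])))

# Toy graph with edges 0-2, 1-2 and 2-3, so vertex 2 has degree 3
A_toy = np.array([
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
score = adamic_adar_pair(A_toy, 0, 1)  # the only common neighbor is vertex 2
```

A common neighbor of two distinct vertices always has degree at least 2, so the logarithm in the denominator is never zero for such pairs.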
My attempt to compute it is to fetch the rows of the $x$ and $y$ nodes from $A$ and sum them, then look for the elements that have 2 as their value, get their degrees, and apply the equation. However, computing that takes a really long time. I tried it on a graph that contains 1032 vertices; it ran for 7 minutes before I cancelled the computation. So my question: is there a better algorithm to compute it?
Here's my code in Python:

def aa(graph):
    """
    Calculates the Adamic-Adar index.
    """
    N = graph.num_vertices()
    A = gts.adjacency(graph)
    S = np.zeros((N,N))
    degrees = get_degrees_dic(graph)
    for i in xrange(N):
        A_i = A[i]
        for j in xrange(N):
            if j != i:
                A_j = A[j]
                intersection = A_i + A_j
                common_ns_degs = list()
                for index in xrange(N):
                    if intersection[index] == 2:
                        cn_deg = degrees[index]
                        common_ns_degs.append(1.0/np.log10(cn_deg))
                S[i,j] = np.sum(common_ns_degs)
    return S
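For comparison, the triple loop above can be written as a single matrix product, since S[i, j] is the sum over n of A[i, n] * w[n] * A[n, j] with w[n] = 1 / log10(k_n). The following is only a sketch under stated assumptions: Python 3, a dense symmetric 0/1 NumPy array with no self-loops, and a hypothetical function name `adamic_adar_matrix`. If the adjacency matrix is a SciPy sparse matrix, it would first need converting, for example with `.toarray()`.

```python
import numpy as np

def adamic_adar_matrix(A):
    """All-pairs Adamic-Adar scores from a dense, symmetric 0/1 adjacency matrix."""
    A = np.asarray(A, dtype=float)
    degrees = A.sum(axis=1)
    # Weight of each node as a potential common neighbor: 1 / log10(k_n).
    # Nodes of degree 0 or 1 can never be a common neighbor of two distinct
    # vertices, so their weight is left at 0 (this also avoids dividing by log(1) = 0).
    weights = np.zeros_like(degrees)
    mask = degrees > 1
    weights[mask] = 1.0 / np.log10(degrees[mask])
    # (A * weights)[i, n] = A[i, n] * w[n]; multiplying by A sums over n
    S = (A * weights) @ A
    # The loop version skips i == j, so the diagonal stays zero
    np.fill_diagonal(S, 0.0)
    return S

# Toy graph with edges 0-2, 1-2 and 2-3
A_toy = np.array([
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
S_toy = adamic_adar_matrix(A_toy)
```

In this form the work is delegated to one matrix multiplication instead of N^3 Python-level iterations, which is where most of the running time of the loop version goes.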