Using a graph-tool efficiently

Question

After a long thought, I finally decided to post this question here. Few days back I started using graph-tool to do various things. I have been using Networkx before that. I have already seen the impressive performance comparision and thought that everything would be simple enough. However, I immediately ran into the speed issue and asked a question related to a particular aspect of it. I got a quick answer that satisfied me. However, now this speed issue is bugging me every now and then and I can't find any documentation about graph-tool that is related to efficiently using it. For example, from the answer to my last question, I came to realize that it is better to add all edges together instead of one by one which is a very important point to note but hasn't been mentioned anywhere! I am now having two more similar issues:

(1) How do I choose a random neighbour of a given node? I can only see the following solution:

nbr = np.random.choice(list(v.all_neighbours()))

since v.all_neighbours() is a generator, I must convert it into the list to choose a random element. This slows down the code but I don't see any better way.

(2) I want to assign a 1d vector (is list okay?) to each of the vertices in the graph and later I am exchanging and modifying them in a particular way. This is simply a property map and I would like to see some documentation about how to use this efficiently. However, I can't locate anything.

(3) I am trying to simulate a triadic closure in some network which itself is changing with time. Thus, at every time step, I need an information about the neighbours of each vertex in the graph. Again, I must create a list (or numpy array):

nbrs = [w for w in v.neighbours()]

which decreases the speed of my code substantially. This means that I am not doing this correctly but I couldn't find any documentation that would tell me how to use neighbours efficiently in graph-tool.

Somehow Networkx programs that I have written for same tasks have completely outperformed the graph-tool codes I simply can't buy this.

This list might increase and hence I would be very glad if somebody could point me to some documentation about using graph-tool efficiently apart from answering the above mentioned specific questions.

Thanks in advance.

Giving the accepted answer of the linked question a quick look, it seems you can add edges in one-go with [`add_edges_from`](https://networkx.github.io/documentation/latest/reference/generated/networkx.Graph.add_edges_from.html), if that's one bottleneck. Oops that might not be relevant, as I assumed that `networkx` there. — Divakar, Mar 26 '16 at 19:34
Right. Do you have any other idea about graph-tool? Networkx, contrary to my expectations, is working quite fast. This simply means that I am not using graph-tool correctly. — Peaceful, Mar 26 '16 at 19:40
I don't really have any experience working with `graph-tool`, in fact hearing about it for the first time. But if they claim it to be efficient, I would hope they would have some implementation to match up that functionality of adding all edges in one-go, like in `networkx`. Nevertheless, I got try out that module myself, looks interesting! — Divakar, Mar 26 '16 at 19:45

score 1 · Answer 1 · edited May 23 '17 at 12:34

I'll try to make more "graph-tool-specific" answers:

1) well actually this one is related to python, so you might want to use the solution from this thread using random.shuffle However, if you are going to do this repeatedly, (and if you have enough available memory), I think it might be better to get the adjacency matrix as a scipy sparse matrix then use that matrix:

import graph_tool
from graph_tool.spectral import adjacency
import numpy as np

g = graph_tool.Graph()
g.add_vertex(100)
edges = np.random.randint(0, 100, (500,2))
g.add_edge_list(edges)

mat = adjacency(g).tolil()
rows = mat.rows.copy() # mat.rows = in-neighbours for each vertex
map(np.random.shuffle, rows)
rnd_neighbours = { i:row[0] for i, row in enumerate(rows) if row }

Where rnd_neighbours contains the index of one random neighbour for each vertex of nonzero in-degree.

2) reading the graph-tool documentation regarding PropertyMaps and the detailed version, lists are accepted as python::object or simply object. You can then access them as elements in the PropertyMap.

3) For triadic closure, look at the clustering module and at this thread on the mailing list.

EDIT: by the way, I forgot to mention it, but you can access and change the number of OpenMP threads with openmp_enabled, openmp_get_num_threads, and openmp_set_num_threads in graph-tool. Can't find it in the doc, though... But here is the source.

Tiago Peixoto · Accepted Answer · 2017-11-13T11:53:42.903

1

You can access neighbours and vertices as arrays, which will speedup your code, as described in the documentation: https://graph-tool.skewed.de/static/doc/quickstart.html#fast-iteration-over-vertices-and-edges

For example, instead of doing:

nbr = np.random.choice(list(v.out_neighbours()))

you should do:

nbr = np.random.choice(g.get_out_neighbours(v))

which should be substantially faster, as arrays are used instead of lists.

edited Nov 13 '17 at 11:53

answered Nov 03 '17 at 01:12

Tiago Peixoto

5,149
2
28
28

This throws an error: `AttributeError: 'Vertex' object has no attribute 'get_all_neighbours'` – Peaceful Nov 11 '17 at 08:13
@Peaceful you are probably using an old version. You need to upgrade. – Tiago Peixoto Nov 11 '17 at 08:14
I am using the version 2.26 which I upgraded to today. I am on Ubuntu 16.04. – Peaceful Nov 11 '17 at 08:49
Sorry, there was a typo in the answer. I fixed it now. – Tiago Peixoto Nov 11 '17 at 11:00
Still throws the same error. Is it because I am on Ubuntu 16.04 instead of the latest OS version? – Peaceful Nov 13 '17 at 05:00
Sorry, there was _still_ another typo. Fixed now. – Tiago Peixoto Nov 13 '17 at 11:54

Using a graph-tool efficiently

2 Answers2