I have a list of vertex indices for which I need to get the corresponding vertex properties. I can think of doing that by the following code:
[graph.vp["label"][ graph.vertex(i) ] for i in indices]
It works, but can I avoid the Python loop altogether to get better speed?
I'm asking because this code turned out to be much slower than an equivalent version built entirely on plain Python data structures. For example, this is what I'm doing:
for t in range(args.num_trials):
    for b in budget:
        train, test = train_test_split(n, train_size=b, random_state=t)
        y_true = [graph.vp["label"][graph.vertex(t)] for t in test]
where "graph" is a graph-tool Graph object. For comparison, here is the other code snippet:
for t in range(args.num_trials):
    for b in budget:
        train, test = train_test_split(n, train_size=b, random_state=t)
        y_true = [graph.node_list[t].label for t in test]
where graph is an instance of a custom Python class built from basic Python data structures (e.g. node_list is a Python list of Node objects).
The issue is that the latter code runs much faster than the former. The first takes around 7 seconds on average, whereas the second takes only 0.07 seconds on my machine. Everything else is the same in the two snippets except the last line. I found here that the author mentioned that,
graph-tool achieves greater performance by off-loading main loops to C++
So I was wondering how I can off-load the loop in this particular scenario. And what explains this poor performance with graph-tool?
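To make concrete what I mean by "off-loading the loop", here is a minimal sketch of the kind of lookup I am hoping for. It assumes the "label" property map has a scalar value type (e.g. "int"), in which case graph-tool exposes its values as a NumPy array through the `.a` attribute of the property map, indexed by vertex index. So that the snippet runs without graph-tool installed, `label_array` below is a plain NumPy array standing in for `graph.vp["label"].a`, and `labels_for` is a hypothetical helper name, not part of any library. On a filtered graph, or with a string or object-valued property, this array view would not apply as-is.

```python
import numpy as np

def labels_for(label_array, indices):
    """Gather the labels of the given vertex indices in one vectorized step."""
    # NumPy integer-array ("fancy") indexing runs the loop in compiled code,
    # so no Python-level iteration or per-vertex descriptor is needed.
    return np.asarray(label_array)[np.asarray(indices, dtype=np.intp)]

# Stand-in for graph.vp["label"].a (assumption: scalar-typed property map,
# where position i holds the label of the vertex with index i).
label_array = np.array([10, 11, 12, 13, 14])

# Stand-in for the `test` index list returned by train_test_split.
test = [4, 0, 2]

y_true = labels_for(label_array, test)
print(y_true)
```

If this applies, the last line of my loop would become something like `y_true = graph.vp["label"].a[test]`, which skips both the per-element `graph.vertex(t)` call and the repeated `graph.vp["label"]` lookup.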