TL;DR: It is ten times faster to generate a list of static networks than it is to merge these static networks into a single dynamic network. Why is this so?
Following this answer, I attempt to generate a random dynamic graph using NetworkX and DyNetx.
The issue arises when dealing with mid-scale networks (approximately 1000 nodes and 1000 timestamps) - memory crashes. Also on a smaller scale (about 100 nodes and 300 timestamps), the process is extremely slow. I believe I've identified the impediment, but I'm not sure how to deal with it.
The following is a simple example of code that generates a random temporal network:
import dynetx as dnx
import networkx as nx
import itertools
from random import random
def dynamic_random_graph(n, steps, up_rate, seed=42):
# Create list of static graphs
list_of_snapshots = list()
for t in range(0, steps):
G_t = nx.Graph()
edges = itertools.combinations(range(n), 2)
G_t.add_nodes_from(range(n))
for e in edges:
if random() < up_rate:
G_t.add_edge(*e)
list_of_snapshots.append(G_t)
# Merge the static graphs into dynamic one
dynamic_graph = dnx.DynGraph()
for t, graph in enumerate(list_of_snapshots):
dynamic_graph.add_interactions_from(graph.edges(data=False), t=t)
return dynamic_graph
If we run the following command:
%timeit dynamic_random_graph(300, 100, 0.5) # Memory was crahsed on larger networks.
>> 1 loop, best of 5: 15.1 s per loop
In contrast, if we run the code without the networks merge, we will get significantly better results:
%timeit dynamic_random_graph_without_merge(300, 100, 0.5) # Ignore the merge part in the function
>> 1 loop, best of 5: 15.1 s per loop
We can work on networks with 1000 nodes without memory crash if we run the function without the merge part.
So, I'd like to look at the DyNetx source code and try to figure out what's wrong with the add_interactions_from
method.
The function is short and simple, but I'm curious why it takes so much time and memory, and how I can improve it. What are your thoughts?
This is the source code:
def add_interactions_from(self, ebunch, t=None, e=None):
"""Add all the interaction in ebunch at time t.
Parameters
----------
ebunch : container of interaction
Each interaction given in the container will be added to the
graph. The interaction must be given as as 2-tuples (u,v) or
3-tuples (u,v,d) where d is a dictionary containing interaction
data.
t : appearance snapshot id, mandatory
e : vanishing snapshot id, optional
See Also
--------
add_edge : add a single interaction
Examples
--------
>>> import dynetx as dn
>>> G = dn.DynGraph()
>>> G.add_edges_from([(0,1),(1,2)], t=0)
"""
# set up attribute dict
if t is None:
raise nx.NetworkXError(
"The t argument must be a specified.")
# process ebunch
for ed in ebunch:
self.add_interaction(ed[0], ed[1], t, e)
I suppose the loop at the end is the source of all problems.
Link to the add_interaction
implementation.