The mean distance of the nodes increases by log of the number of nodes python

Question

I want to show in a random graph the average distance of nodes increases by log N where N is the number of nodes. p here is the probability of having an edge between a pair of nodes. What I tried is this:

import matplotlib.pyplot as plt
import networkx as nx
%matplotlib inline

def run_experiment(ns, p=0.45, iters=30):
    """
    p: probability of having an edge between a pair of nodes
    ns: sequence of `number of nodes(n)` to try
    iters: number of times to run for each `n`
    
    """
    G = {}
    shortestPath={}
    res = []
    for n in ns:
        print(n)
        for i in range(iters):
            G[i] = nx.gnp_random_graph(n, p)
            shortestPath[i] = nx.average_shortest_path_length(G[i], weight=None, method=None)
        means = array([shortestPath[k] for k in shortestPath]).mean()
        print(means)
        res.append(means)
    return np.array(res)

I tried it for some n values:

ns = [1,10, 100, 1000]
res = run_experiment(ns)

Then I plot it to show the logarithmic curve, but I've got this:

dictionary = dict(zip(ns, res))
plt.plot(list(dictionary.keys()),list(dictionary.values()))
plt.yscale('log')
plt.xscale('log')

What is the problem?

GrimTrigger · Accepted Answer · 2020-12-06T11:01:22.447

It seems that the problem lies in the generation of the random graph (see create a random graph with at least one edge connected). Adding the random generated graph via the function proposed (see code below) results in (the graph actually has an additional ns-entry for 3000):

which seems to confirm your result. Here the code:

from itertools import combinations, groupby
import random
from numpy import array
import matplotlib.pyplot as plt
import networkx as nx
%matplotlib inline

# from https://stackoverflow.com/questions/61958360/how-to-create-random-graph-where-each-node-has-at-least-1-edge-using-networkx
def gnp_random_connected_graph(n, p):
    """
    Generates a random undirected graph, similarly to an Erdős-Rényi 
    graph, but enforcing that the resulting graph is conneted
    """
    edges = combinations(range(n), 2)
    G = nx.Graph()
    G.add_nodes_from(range(n))
    if p <= 0:
        return G
    if p >= 1:
        return nx.complete_graph(n, create_using=G)
    for _, node_edges in groupby(edges, key=lambda x: x[0]):
        node_edges = list(node_edges)
        random_edge = random.choice(node_edges)
        G.add_edge(*random_edge)
        for e in node_edges:
            if random.random() < p:
                G.add_edge(*e)
    return G

def run_experiment(ns, p=0.45, iters=30):
    """
    p: probability of having an edge between a pair of nodes
    ns: sequence of `number of nodes(n)` to try
    iters: number of times to run for each `n`
    
    """
    G = {}
    shortestPath={}
    res = []
    for n in ns:
        print(n)
        for i in range(iters):
            #G[i] = nx.gnp_random_graph(n, p)
            G[i] = gnp_random_connected_graph(n, p)
            shortestPath[i] = nx.average_shortest_path_length(G[i], weight=None, method=None)
        means = array([shortestPath[k] for k in shortestPath]).mean()
        print(means)
        res.append(means)
    return np.array(res)

ns = [1,10, 100, 1000]
res = run_experiment(ns)

dictionary = dict(zip(ns, res))
plt.plot(list(dictionary.keys()),list(dictionary.values()))
plt.yscale('log')
plt.xscale('log')

Note that I have just changed one line in your code and referenced the generating function.

The mean distance of the nodes increases by log of the number of nodes python

1 Answers1