I am working on the representation of flight routes between different airports and wish to represent them using a networkx plot.

The input data is a dataframe, example:

from, to, airline, trip_number
Paris, New York, Air France, AF001
Paris, Munich, Air France, AF002
Paris, New York, Air France, AF003
Toronto, Paris, Air Canada, AC001
Toronto, Munich, Air Canada, AC002
Munich, New York, Lufthansa, LF001
Francfort, Los Angeles, Lufthansa, LF002
Francfort, Paris, Lufthansa, LF003
Paris, Francfort, Lufthansa, LF004
Paris, Francfort, Air France, AF004
Paris, Francfort, Air Berlin, AB001

I can get a network representation, but two things are missing:

  • a label on each node
  • a connecting line whose width increases with the number of trips (example: thin for Toronto Paris as there is 1 trip, wide for Paris Francfort as there are 3 trips)

Current minimum code, df being the dataframe:

import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
from nxviz.plots import CircosPlot

G = nx.from_pandas_edgelist(df, 'from', 'to')
nx.draw(G, node_size=5, node_color='red')
plt.show()

Thanks for the help.

1 Answer

The following should do the job:

import pandas as pd
from io import StringIO
import networkx as nx
import matplotlib.pyplot as plt

data = ('from, to, airline, trip_number\n'
        'Paris, New York, Air France, AF001\n'
        'Paris, Munich, Air France, AF002\n'
        'Paris, New York, Air France, AF003\n'
        'Toronto, Paris, Air Canada, AC001\n'
        'Toronto, Munich, Air Canada, AC002\n'
        'Munich, New York, Lufthansa, LF001\n'
        'Francfort, Los Angeles, Lufthansa, LF002\n'
        'Francfort, Paris, Lufthansa, LF003\n'
        'Paris, Francfort, Lufthansa, LF004\n'
        'Paris, Francfort, Air France, AF004\n'
        'Paris, Francfort, Air Berlin, AB001')

# a multi-character separator needs the python parser engine
df = pd.read_csv(StringIO(data), sep=", ", engine="python")

# count the trips per (from, to) pair, see https://stackoverflow.com/a/10374456
short_df = df.groupby(["from", "to"]).size().reset_index(name="count")

G = nx.from_pandas_edgelist(short_df, source='from', target='to', edge_attr="count")

# edge width = number of trips, see https://stackoverflow.com/a/25651827
weights = [G[u][v]['count'] for u, v in G.edges()]

nx.draw(G, node_size=5, node_color='red', with_labels=True, width=weights)
plt.show()

Explanation

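You first need to count the flights per route, which pandas does easily with groupby. Following the answer linked in the first code comment (https://stackoverflow.com/a/10374456), I create a new data frame with only three columns ("from", "to", "count"). Next, you need to pass that count as an edge attribute when creating the graph, i.e., add edge_attr="count". Finally, I followed the answer linked in the second code comment (https://stackoverflow.com/a/25651827) to set the edge width from that attribute.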
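
One caveat worth checking: nx.from_pandas_edgelist builds an undirected nx.Graph by default, so a (Francfort, Paris) row and a (Paris, Francfort) row end up on the same edge and only one of the two counts is kept. If trips should be counted regardless of direction, a minimal sketch is to sort each city pair before grouping. The helper name count_trips_undirected and the small sample frame are made up for illustration:

```python
import pandas as pd


def count_trips_undirected(df):
    """Count trips per city pair, ignoring direction (hypothetical helper)."""
    # Sort each (from, to) pair so both directions share the same key.
    pairs = [tuple(sorted(pair)) for pair in zip(df["from"], df["to"])]
    ordered = pd.DataFrame(pairs, columns=["from", "to"])
    return ordered.groupby(["from", "to"]).size().reset_index(name="count")


# Illustrative sample: three trips Paris -> Francfort, one trip back.
df = pd.DataFrame({
    "from": ["Paris", "Francfort", "Paris", "Paris", "Toronto"],
    "to": ["Francfort", "Paris", "Francfort", "Francfort", "Paris"],
})
counts = count_trips_undirected(df)
print(counts)
```

If direction matters instead, passing create_using=nx.DiGraph to nx.from_pandas_edgelist keeps the two directions as separate edges.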

Lastly, the node labels come from passing with_labels=True to draw. draw also accepts all the keyword parameters of draw_networkx.
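
As a rough sketch of what those extra parameters allow, the snippet below draws a small hand-made graph with a fixed layout, scaled edge widths, and the trip count printed on each edge. The graph, the counts, and the helper scaled_widths are assumptions for illustration and not part of the code above; the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove to use plt.show() interactively
import matplotlib.pyplot as plt
import networkx as nx

# Small hand-made graph; the "count" values are illustrative, not computed.
G = nx.Graph()
G.add_edge("Paris", "Francfort", count=3)
G.add_edge("Paris", "New York", count=2)
G.add_edge("Toronto", "Paris", count=1)


def scaled_widths(G, attr="count", max_width=6.0):
    """Scale edge widths so the busiest edge is max_width wide (hypothetical helper)."""
    counts = [data[attr] for _, _, data in G.edges(data=True)]
    top = max(counts)
    return [max_width * c / top for c in counts]


pos = nx.spring_layout(G, seed=42)  # fixed seed so the layout is reproducible
fig, ax = plt.subplots()
nx.draw_networkx(G, pos=pos, ax=ax, node_size=50, node_color="red",
                 font_size=8, width=scaled_widths(G))
# Print the trip count next to each edge.
nx.draw_networkx_edge_labels(G, pos=pos, ax=ax,
                             edge_labels=nx.get_edge_attributes(G, "count"))
ax.set_axis_off()
# Call fig.savefig("routes.png") or plt.show() to view the result.
```

Scaling the widths relative to the busiest route keeps the lines readable when the raw counts are large.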

Sparky05