Loop over pandas dataframe to create multiple networks

Question

I have data on countries' trade with one another. I split the main file by month and got 12 CSV files for the year 2019. A sample of the January CSV is shown below:

    reporter  partner     year  month  trade
0   Albania   Argentina   2019  01     515256
1   Albania   Australia   2019  01     398336
2   Albania   Austria     2019  01     7664503
3   Albania   Bahrain     2019  01     400
4   Albania   Bangladesh  2019  01     653907
5   Zimbabwe  Zambia      2019  01     79569855

I want to build a complex network for every month and print the number of nodes in each network. Right now I can do it the hard (stupid) way, like so:

import pandas as pd
import networkx as nx

df01 = pd.read_csv('012019.csv')
df02 = pd.read_csv('022019.csv')
df03 = pd.read_csv('032019.csv')
df1 = df01[['reporter', 'partner', 'trade']]
df2 = df02[['reporter', 'partner', 'trade']]
df3 = df03[['reporter', 'partner', 'trade']]
G1 = nx.Graph()
G1 = nx.from_pandas_edgelist(df1, 'reporter', 'partner', edge_attr='trade')
G1.number_of_nodes()

and so on for the next networks.

My question is: how can I use a "for loop" to read the files, convert each dataframe to a network, and report the number of nodes of each network?

I tried this but nothing is reported.

for f in glob.glob('.csv'):
    df = pd.read_csv(f)
    df1 = df[['reporter','partner', 'trade']]
    G = nx.from_pandas_edgelist(df1, 'reporter', 'partner', edge_attr='trade')
    G.number_of_nodes()

Thanks.

Edit:

OK, so I managed to do the above using code like this:

for files in glob.glob('/home/user/VMShared/network/2nd/*.csv'):
    df = pd.read_csv(files)
    df1 = df[['reporter', 'partner', 'import']]
    G = nx.Graph()
    G = nx.from_pandas_edgelist(df1, 'reporter', 'partner', edge_attr='import')
    nx.write_graphml_lxml(G, "/home/user/VMShared/network/2nd/*.graphml")

The problem I now face is how to write separate files. All I get from this is a single file named *.graphml. How can I get one graphml file for every input file? It would also be a plus if each graphml output had the same name as its input file.

Add an asterisk `*` before `.csv`, so it would be like `for f in glob.glob('*.csv')`. Check [this](https://stackoverflow.com/questions/10377998/how-can-i-iterate-over-files-in-a-given-directory) — Azim Mazinani, Sep 25 '20 at 08:52
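
For what it's worth, here is a minimal sketch of how both parts of the question could be handled in one loop. It assumes every CSV has `reporter` and `partner` columns plus one weight column, as in the sample above. The function name `build_monthly_networks` and the `weight_col` parameter are made up for illustration, and `nx.write_graphml` is used in place of `nx.write_graphml_lxml` so that the sketch does not depend on lxml being installed. The output name is derived from each input path rather than passed as a literal `*.graphml` string, since the writer treats its path argument as a plain file name and does not expand wildcards.

```python
from pathlib import Path

import networkx as nx
import pandas as pd


def build_monthly_networks(folder, weight_col="trade"):
    """Build one graph per CSV in `folder` and save it next to the CSV as GraphML.

    Hypothetical helper: returns a dict mapping each CSV file name to the
    number of nodes in its network.
    """
    node_counts = {}
    # "*.csv" matches every CSV file in the folder; a bare ".csv" pattern
    # would only match a path named exactly ".csv".
    for csv_path in sorted(Path(folder).glob("*.csv")):
        df = pd.read_csv(csv_path)
        G = nx.from_pandas_edgelist(df, "reporter", "partner", edge_attr=weight_col)
        # Inside a loop the value of an expression is not echoed,
        # so the node count has to be printed explicitly.
        print(csv_path.name, G.number_of_nodes())
        # Reuse the input name with a .graphml extension,
        # e.g. 012019.csv -> 012019.graphml
        nx.write_graphml(G, str(csv_path.with_suffix(".graphml")))
        node_counts[csv_path.name] = G.number_of_nodes()
    return node_counts
```

With the files from the edit, the call would presumably look like `build_monthly_networks('/home/user/VMShared/network/2nd', weight_col='import')`.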
