I want to use pandas to read a csv file that contains nodes and their attributes. Not all nodes have every attribute, and missing attributes are simply missing from the csv file. When pandas reads the csv file, the missing values appear as nan. I want to add the nodes in bulk from the dataframe, but avoid adding attributes that are nan.
For example, here is a sample csv file called mwe.csv:
Name,Cost,Depth,Class,Mean,SD,CST,SL,Time
Manuf_0001,39.00,1,Manuf,,,12,,10.00
Manuf_0002,36.00,1,Manuf,,,8,,10.00
Part_0001,12.00,2,Part,,,,,28.00
Part_0002,5.00,2,Part,,,,,15.00
Part_0003,9.00,2,Part,,,,,10.00
Retail_0001,0.00,0,Retail,253,36.62,0,0.95,0.00
Retail_0002,0.00,0,Retail,45,1,0,0.95,0.00
Retail_0003,0.00,0,Retail,75,2,0,0.95,0.00
Here's how I'm currently handling this:
import pandas as pd
import numpy as np
import networkx as nx
node_df = pd.read_csv('mwe.csv')
graph = nx.DiGraph()
graph.add_nodes_from(node_df['Name'])
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['Cost'])), 'nodeCost')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['Mean'])), 'avgDemand')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['SD'])), 'sdDemand')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['CST'])), 'servTime')
nx.set_node_attributes(graph, dict(zip(node_df['Name'], node_df['SL'])), 'servLevel')
# Loop through all nodes and all attributes and remove NaNs.
for i in graph.nodes:
    for k, v in list(graph.nodes[i].items()):
        if np.isnan(v):
            del graph.nodes[i][k]
It works, but it's clunky. Is there a better way, e.g., a way to avoid the nans when adding the nodes, rather than deleting the nans afterwards?
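To make the goal concrete, here is a rough sketch of the kind of thing I have in mind, not something I'm sure is the best way. It builds one attribute dict per row with Series.dropna() and passes (node, attribute dict) pairs to add_nodes_from. The names CSV_TEXT and ATTR_NAMES are made up for this sketch, and the csv rows are inlined only so that it runs without the file.

```python
import io

import networkx as nx
import pandas as pd

# Same rows as mwe.csv, inlined so the sketch runs without the file.
CSV_TEXT = """Name,Cost,Depth,Class,Mean,SD,CST,SL,Time
Manuf_0001,39.00,1,Manuf,,,12,,10.00
Manuf_0002,36.00,1,Manuf,,,8,,10.00
Part_0001,12.00,2,Part,,,,,28.00
Part_0002,5.00,2,Part,,,,,15.00
Part_0003,9.00,2,Part,,,,,10.00
Retail_0001,0.00,0,Retail,253,36.62,0,0.95,0.00
Retail_0002,0.00,0,Retail,45,1,0,0.95,0.00
Retail_0003,0.00,0,Retail,75,2,0,0.95,0.00
"""

# Hypothetical helper mapping: csv column -> node attribute name,
# mirroring the set_node_attributes calls in the question.
ATTR_NAMES = {
    "Cost": "nodeCost",
    "Mean": "avgDemand",
    "SD": "sdDemand",
    "CST": "servTime",
    "SL": "servLevel",
}

node_df = pd.read_csv(io.StringIO(CSV_TEXT))

# Keep only the attribute columns, indexed by node name and renamed.
attr_df = node_df.set_index("Name")[list(ATTR_NAMES)].rename(columns=ATTR_NAMES)

graph = nx.DiGraph()
# add_nodes_from accepts (node, attribute_dict) pairs; dropna() removes
# the missing entries from each row before the dict is built.
graph.add_nodes_from(
    (name, row.dropna().to_dict()) for name, row in attr_df.iterrows()
)

for name, attrs in graph.nodes(data=True):
    print(name, attrs)
```

One caveat I'm aware of: iterrows() converts each row to a single dtype, so integer columns would come back as floats. That doesn't matter for this data, since every column used here is already float.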