I have looked at some common tools like Heapy to measure how much memory is being utilized by each traversal technique but I don't know if they are giving me the right results. Here is some code to give the context.
The code simply measures the number of unique nodes in a graph. Two traversal techniques provided viz. count_bfs
and count_dfs
import sys
from guppy import hpy
class Graph:
def __init__(self, key):
self.key = key #unique id for a vertex
self.connections = []
self.visited = False
def count_bfs(start):
parents = [start]
children = []
count = 0
while parents:
for ind in parents:
if not ind.visited:
count += 1
ind.visited = True
for child in ind.connections:
children.append(child)
parents = children
children = []
return count
def count_dfs(start):
if not start.visited:
start.visited = True
else:
return 0
n = 1
for connection in start.connections:
n += count_dfs(connection)
return n
def construct(file, s=1):
"""Constructs a Graph using the adjacency matrix given in the file
:param file: path to the file with the matrix
:param s: starting node key. Defaults to 1
:return start vertex of the graph
"""
d = {}
f = open(file,'rU')
size = int(f.readline())
for x in xrange(1,size+1):
d[x] = Graph(x)
start = d[s]
for i in xrange(0,size):
l = map(lambda x: int(x), f.readline().split())
node = l[0]
for child in l[1:]:
d[node].connections.append(d[child])
return start
if __name__ == "__main__":
s = construct(sys.argv[1])
#h = hpy()
print(count_bfs(s))
#print h.heap()
s = construct(sys.argv[1])
#h = hpy()
print(count_dfs(s))
#print h.heap()
I want to know by what factor is the total memory utilization different in the two traversal techniques viz. count_dfs
and count_bfs
? One might have the intuition that dfs
may be expensive as a new stack is created for every function call. How can the total memory allocations in each traversal technique be measured?
Do the (commented) hpy
statements give the desired measure?
Sample file with connections:
4
1 2 3
2 1 3
3 4
4