0

Below is an assignment for algorithm class where we are learning about nodes, paths, etc. The code is designed to check if one node is reachable from another. The code below works but I am unclear why. G is the "graph" containing each node as a key with values = the nodes it connects to. The mark_component function below returns a number of nodes from a given node.

However in the function check_connection which is designed to return True if two nodes are reachable, it calls this mark_component function and then tests to see if a node is in a dictionary.

What I do not get is check_connection starts with an empty dictionary "marked" and then calls mark_component using this dictionary. Nodes are then added to it. But mark_component returns a number so how does the check_connection function able to "read" what is in marked? As far as that function is concerned, I thought marked would still be empty. I guess I assumed marked was a local variable containing a dictionary but it apparently can be passed to another function and changed.

Can anyone please explain this to me? Many thanks

def mark_component(G, node, marked):
    marked[node] = True
    total_marked = 1
    for neighbor in G[node]:
        if neighbor not in marked:
            total_marked += mark_component(G, neighbor, marked)
    return total_marked

def check_connection(G, v1, v2):
    # Return True if v1 is connected to v2 in G
    # or False if otherwise
    marked = {}
    mark_component(G, v1, marked)
    return 'a' in marked       


G = {'a': {'d': 1, 'g': 1}, 'c': {'g': 1}, 'b': {'f': 1},
     'e': {'h': 1, 'f': 1}, 'd': {'a': 1, 'g': 1},
     'g': {'a': 1, 'c': 1, 'd': 1}, 'f': {'b': 1, 'e': 1}, 'h': {'e': 1}}



print check_connection(G,'a', 'b') 
rae1
  • 6,066
  • 4
  • 27
  • 48
Andy
  • 275
  • 2
  • 14
  • 1
    This is explained in great detail elsewhere. See [here](http://stackoverflow.com/questions/986006/python-how-do-i-pass-a-variable-by-reference/986145#986145), for example. – Ricardo Cárdenes Feb 06 '14 at 02:17

3 Answers3

2

Yes, marked itself is a mutable data structure, meaning its contents can be changed even outside of its original scope. When marked is passed to the mark_component function, the latter receives a reference to the marked object, and it is able to update its content by accessing that reference using an indexer (i.e. marked[node]).

The function check_connection however, still has a reference to the marked object in memory stored in the variable marked. When the expression 'a' in marked is executed, marked refers to the object updated by the mark_component function.

This concept is similar to the pointers concept (aka int * pointer) in other languages like C, and C++.

rae1
  • 6,066
  • 4
  • 27
  • 48
  • Thanks @rae1 for some reason I though variables were local within a function which is where my confusion began. – Andy Feb 06 '14 at 03:25
0

I guess this algorithm is not well implemented.

It's a greedy search in a directed graph. I starts from the first node (v1), tried to go as far as we can. Then, it turns to another node. We also mark the visited notes during the procedure.

Well, it does some simple optimization. If the node has been visited, don't do it again. Thus, I won't "exam" a single node twice.

The last line of check_connection() was wrong. We should check whether v2 has been visited. Thus, the code should be

def mark_component(G, node, marked):
    marked[node] = True
    total_marked = 1
    for neighbor in G[node]:
        if neighbor not in marked:
            total_marked += mark_component(G, neighbor, marked)
    return total_marked

def check_connection(G, v1, v2):
    # Return True if v1 is connected to v2 in G
    # or False if otherwise
    marked = {}
    mark_component(G, v1, marked)
    return v2 in marked


G = {'a': {'d': 1, 'g': 1}, 'c': {'g': 1}, 'b': {'f': 1},
          'e': {'h': 1, 'f': 1}, 'd': {'a': 1, 'g': 1},
          'g': {'a': 1, 'c': 1, 'd': 1}, 'f': {'b': 1, 'e': 1}, 'h': {'e': 1}}



print check_connection(G,'a', 'b')

Let go through some python ideas. Python always passes arguments by reference (v.s. pointers in C++). Thus, the caller can see the changes made by callee. Another one is "x in b". "in" is an operator implemented in most of python container. For "marked" (a dictionary), it is like if v2 is one of marked's keys. It is recommended to do that in python actually.

liuyruc
  • 301
  • 1
  • 3
  • 10
0

It sounds to me like you don't recognize that mark_component is being called recursively. After the first call to mark_component it is called again to work its way through the graph. As the others have mentioned it's a pattern you will see often.