What does the pair of keywords, "continue" and "yield" do in Python?

Question

I came across this question when I read another thread of discussion about finding all cycles in graph implementation. Can anyone please explain the use of this pair of keywords in this example? Thanks.

01 def dfs(graph, start, end):
02     fringe = [(start, [])]
03     while fringe:
04         state, path = fringe.pop()
05         if path and state == end:
06             yield path
07             continue
08         for next_state in graph[state]:
09             if next_state in path:
10                 continue
11             fringe.append((next_state, path+[next_state]))

>>> graph = { 1: [2, 3, 5], 2: [1], 3: [1], 4: [2], 5: [2] }
>>> cycles = [[node]+path  for node in graph for path in dfs(graph, node, node)]
>>> len(cycles)
7
>>> cycles
[[1, 5, 2, 1], [1, 3, 1], [1, 2, 1], [2, 1, 5, 2], [2, 1, 2], [3, 1, 3], [5, 2, 1, 5]]

score 4 · Accepted Answer · answered Jul 17 '17 at 08:58

The two keywords are not closely related.

The continue keyword can only occur in the body of a loop (a for or while statement), and causes the flow of control to return the the top of the loop instead of continuing though the rest of the loop body. It's often an alternative to indenting the whole rest of the loop body in an if or else block. This:

while foo():
    if something():
        continue
    bar()
    baz()

Is exactly equivalent to this:

while foo():
    if not something():
        bar()
        baz()  # but note that these lines are more indented in this version!

Another keyword closely related to continue is break, which causes the control flow to exit the loop immediately, rather than going back to the top. Both continue and break can only effect the closest loop, so if you have nested control structures, it can be difficult to break out of them all at once (or continue the outer loop from inside an inner one).

The yield keyword is rather different. Though it often appears in loops, it doesn't have to. Rather, it is only allowed within the body of a function. It changes the function into a "generator function". When a generator function is called, its code doesn't run immediately, instead, a "generator object" is created and returned to the caller. A generator object is a kind of iterator, and can be iterated upon by a for loop (or manually by calling next() on it). Only when the generator object is iterated does the function's code run. Each time a yield statement is reached, the function's execution pauses and the yielded value (or None if no value was specified) will be given as the value of the iteration. (Note that when somebody casually calls something a "generator", they might mean either a generator function or a generator object. It's usually clear which they mean from context.)

Here's some example code that uses a generator to print 1, 2 and 3:

def generator_function():
    yield 1 # because it contains `yield` statements, this is a generator function
    yield 2
    yield 3

generator_object = generator_function() # you can create a variable for the generator object
for value in generator_object: # but often you'd create it on the same line as the loop
    print(value)

Another keyword somewhat similar to yield is return, which also only makes sense in functions. It immediately ends the execution of the function to return the specified value (or None if no value was specified).

The dfs function you show uses both yield and continue one after the other. What that does is first yield a value (stopping the generator function's execution until the next value is requested), and then once execution is resumed, it goes back to the start of the loop.

If you wanted to, you could rewrite the function to avoid either of those (though the resulting function would work a little differently since it's no longer a lazy generator):

def dfs(graph, start, end):
   results = [] # maintain a list of results to avoid yielding
   fringe = [(start, [])]
   while fringe:
       state, path = fringe.pop()
       if path and state == end:
           results.add(path) # don't yield any more, just add the path to the results list
       else: # add an else block instead of using continue
           for next_state in graph[state]:
               if next_state not in path: # reverse the condition instead of continue
                   fringe.append((next_state, path+[next_state]))
    return results # return the results at the end of the function

I'd note that the generator version of the function is probably better in most situations. Using continue instead of more indentation is more of a style choice, and doesn't have much of an impact on the logic or performance of the code (just on how it looks).

How would you implement the cycles part into this dfs function ? — Oscar Dolloway, Apr 18 '18 at 14:15
@OscarDolloway: I don't think I understand your question. The cycle calculation is a separate line in the question calls the `dfs` function repeatedly. You can do the same thing with my version if you want. I suppose you could put that inside a function (even one that inlines the behavior of `dfs`), but that's no what I was trying to do in my answer. — Blckknght, Apr 18 '18 at 19:01

What does the pair of keywords, "continue" and "yield" do in Python?

1 Answers1