I was messing around with closures in Ruby and came across the following scenario that I cannot understand.

def find_child_nodes(node)
  left_node_name  = "#{node}A"
  right_node_name = "#{node}B"
  [left_node_name, right_node_name]
end

# use a queue of closures (lambdas) to try to perform a breadth-first search
actions = []
actions << lambda { {:parent_nodes => ['A'], :child_nodes => find_child_nodes('A') } }

while !actions.empty?
  result = actions.shift.call

  puts result[:parent_nodes].to_s

  result[:child_nodes].each do |child_node|
    parent_nodes = result[:parent_nodes] + [child_node]
    actions << lambda { {:parent_nodes => parent_nodes, :child_nodes => find_child_nodes(child_node) } }
  end
end

The above code returns the following breadth-first search output:

["A"]
["A", "AA"]
["A", "AB"]
["A", "AA", "AAA"]
["A", "AA", "AAB"]
["A", "AB", "ABA"]
["A", "AB", "ABB"]
["A", "AA", "AAA", "AAAA"]
...

So far, so good. But now if I change these two lines

parent_nodes = result[:parent_nodes] + [child_node]
actions << lambda { {:parent_nodes => parent_nodes, :child_nodes => find_child_nodes(child_node) } }

to this one line

actions << lambda { {:parent_nodes => result[:parent_nodes] + [child_node], :child_nodes => find_child_nodes(child_node) } }

my search is no longer breadth-first. Instead I now get:

["A"]
["A", "AA"]
["A", "AA", "AB"]
["A", "AA", "AB", "AAA"]
["A", "AA", "AB", "AAA", "AAB"]
...

Could anyone explain exactly what's going on here?

Reck

2 Answers

The problem in your code boils down to this:

results = [
  {a: [1, 2, 3]}, 
  {a: [4, 5, 6]},
]

funcs = []

while not results.empty?
  result = results.shift

  2.times do |i|
    val = result[:a] + [i]

    #funcs << lambda { p val }
    funcs << lambda { p result[:a] + [i] }
  end
end

funcs.each do |func|
  func.call
end

--output:--
[4, 5, 6, 0]
[4, 5, 6, 1]
[4, 5, 6, 0]
[4, 5, 6, 1]

A closure closes over a variable--not a value. Subsequently, the variable can be changed, and the closure will see the new value when it executes. Here is a very simple example of that:

val = "hello"
func = lambda { puts val }  #This will output 'hello', right?

val = "goodbye"
func.call

--output:--
goodbye

In the lambda line inside the loop here:

results = [
  {a: [1, 2, 3]}, 
  {a: [4, 5, 6]},
]

funcs = []

while not results.empty?
  result = results.shift
    ...
    ...

    funcs << lambda { p result[:a] + [i] }  #<==HERE
  end
end

...the lambda closes over the whole result variable--not just result[:a]. However, the result variable is the same variable every time through the while loop--a new variable is not created each time through the loop.
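A minimal sketch of that point, reusing the results and funcs names from above (the seen variable is introduced here only for illustration). Each lambda returns whatever result refers to at call time, so both calls should hand back the very same hash:

```ruby
results = [
  {a: [1, 2, 3]},
  {a: [4, 5, 6]},
]

funcs = []

while not results.empty?
  result = results.shift
  # Each lambda captures the variable `result`, not the hash it holds right now.
  funcs << lambda { result }
end

# `seen` is a hypothetical helper name: the values the lambdas return when called.
seen = funcs.map(&:call)

p seen.map { |h| h[:a] }    # both entries are [4, 5, 6]
p seen[0].equal?(seen[1])   # true: the same Hash object, read through one shared variable
```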

The same thing happens to the val variable in this code:

results = [
  {a: [1, 2, 3]},
  {a: [4, 5, 6]},
]

funcs = []

while not results.empty?
  result = results.shift
  val = result[:a] + [1]

  funcs << lambda { p val }
end

funcs.each do |func|
  func.call
end

--output:--
[4, 5, 6, 1]
[4, 5, 6, 1]

The val variable is assigned a newly created array each time through the loop, and the new array is completely independent of result and result[:a], yet all the lambdas see the same array. That's because all the lambdas close over the same val variable, and val is reassigned on the next pass through the loop, so every lambda sees the last array assigned to it.

But if you introduce a block:

while not results.empty?
  result = results.shift

  2.times do |i|
    val = result[:a] + [i]
    funcs << lambda { p val }
  end
end

--output:--
[1, 2, 3, 0]
[1, 2, 3, 1]
[4, 5, 6, 0]
[4, 5, 6, 1]

...every time the block executes, the val variable is created anew. As a result, each lambda closes over a different val variable. That should make some sense if you consider that a block is just a function that gets passed to the method, in this case the times() method. Then the method repeatedly calls the function--and when a function is called, its local variables, like val, are created; and when the function finishes executing, those local variables go out of scope (a variable captured by a lambda stays alive for as long as the lambda does).
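One caveat worth sketching: only variables first assigned inside the block are created anew. A variable that already exists outside the block is shared by every call of the block. The names outer, inner and pairs below are made up for this illustration:

```ruby
funcs = []
outer = nil              # exists before the block, so the block shares it

2.times do |i|
  outer = i              # reassigns the one shared variable
  inner = i              # first assigned inside the block: fresh on every call
  funcs << lambda { [outer, inner] }
end

pairs = funcs.map(&:call)
p pairs                  # [[1, 0], [1, 1]]
```

Both lambdas report 1 for outer, the last value assigned, while each keeps its own inner.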

Now back to the original example:

while not results.empty?
  result = results.shift

  2.times do |i|
    val = result[:a] + [i]

    #funcs << lambda { p val }
    funcs << lambda { p result[:a] + [i] }
  end
end

The reason the two lambda lines produce different results should now be clear. The first lambda line closes over a new val variable every time the block executes. But the second lambda line closes over the same result variable every time the block executes, so all the lambdas will refer to the same result variable--and the last hash assigned to the result variable is the hash that all the lambdas see.

So the rule is: a loop keyword such as while does not create new variables each time through the loop, but a block creates its block-local variables anew every time it is called.
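The for keyword behaves like while in this respect, which makes for a compact side-by-side comparison with each. This is only a sketch; for_funcs, each_funcs, for_values and each_values are names chosen for the example:

```ruby
for_funcs = []
for n in [1, 2, 3]
  # `for` does not open a new scope: there is a single `n` for the whole loop.
  for_funcs << lambda { n }
end

each_funcs = []
[1, 2, 3].each do |m|
  # `m` is a block parameter: a fresh variable on every call of the block.
  each_funcs << lambda { m }
end

for_values  = for_funcs.map(&:call)
each_values = each_funcs.map(&:call)

p for_values    # [3, 3, 3]
p each_values   # [1, 2, 3]
```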

Note that it would be better to declare all loop variables outside the loop, lest we forget that the variables inside the loop are not created anew every time through the loop.

7stud

By placing the code inside of the lambda, you are deferring evaluation of result[:parent_nodes] until the lambda is called, at which point result has been reassigned to a different hash. The closure worked fine when you just referenced parent_nodes because the value of parent_nodes had already been computed (i.e. result had already been read) when the lambda was created, and because parent_nodes is local to the each block, so each lambda got its own parent_nodes variable rather than sharing one.

Note that if you create a separate block each time through the loop and define result in that block, the closure will also work. See "Ruby for loop a trap?" for a related discussion.
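One possible way to do that with the one-line version from the question is sketched below. It uses Object#tap so that result becomes a block parameter, and it is not necessarily how you would want to structure a real search. The visited array and the cap of 7 nodes are additions for this example, since the tree in the question never ends:

```ruby
def find_child_nodes(node)
  ["#{node}A", "#{node}B"]
end

actions = []
actions << lambda { {:parent_nodes => ['A'], :child_nodes => find_child_nodes('A')} }

visited = []  # hypothetical: collects the paths instead of printing them right away

# Stop after 7 nodes, because the tree in the question is infinite.
while !actions.empty? && visited.size < 7
  # tap yields the hash to the block, so `result` is a new variable on every pass.
  actions.shift.call.tap do |result|
    visited << result[:parent_nodes]

    result[:child_nodes].each do |child_node|
      actions << lambda { {:parent_nodes => result[:parent_nodes] + [child_node], :child_nodes => find_child_nodes(child_node)} }
    end
  end
end

visited.each { |path| puts path.to_s }
```

With result scoped to the block, the paths come out in the same breadth-first order as the two-line version in the question.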

Peter Alfvin