2

I'm new to scala. I was reading about closure than I came across the definition of closure saying that:

"The name arises from the act of "closing" the function literal by "capturing" the bindings of its free variables" .

What does this line actually mean? How is closure really helpful? And what are is it's use-cases?

Nrzonline
  • 1,600
  • 2
  • 18
  • 37
Therii
  • 778
  • 1
  • 8
  • 16

2 Answers2

7

What are free variables?

Consider the function literal for curried multiplication:

(x: Int) => (y: Int) => x * y

If you consider only the subexpression e := (y: Int) => x * y, then you can compute the set of free variables in e recursively as follows (starting from leafs x and y, and then building up the expression from its parts):

  • FREE(x) = {x} because x is a variable
  • FREE(y) = {y} because y is a variable
  • FREE(*) = {} because * is a built-in constant, for all practical purposes
  • FREE(x * y) = FREE(*) U FREE(x) U FREE(y) = {} U {x} U {y} = {x, y} by recursive descent into subtrees
  • FREE( (y: Int) => x * y ) = FREE(x * y) - {y} = {x, y} - {y} = {x}, because the variable y is bound by the in-line function definition, and therefore the variable y from FREE(x * y) becomes no longer free.

Thus the variable y is bound and x is free in the expression (y: Int) => x * y.


What are closures?

Now, what happens if you take the whole expression (x: Int) => (y: Int) => x * y and apply it to some constant, 7, say? What is returned?

The returned object is a closure of the expression (y: Int) => x * y, where the free variable x is set to value 7. So, in a sense, the result of

((x: Int) => (y: Int) => x * y)(7)

is something like an object with a single method, that looks roughly like

class y_to_x_times_y_closure(x: Int) {
  def apply(y: Int) = x * y
}

instantiated as y_to_x_times_y_closure(7). Note that the value 7 cannot reside in the call stack of some function, because you can easily produce a whole bunch of closures like

for (i <- 0 to 1000) yield ((x: Int) => (y: Int) => x * y)(i))

that will have to coexist simultaneously in the same thread, with different captured values of i bound to the variable x. Therefore, the closures in Scala are implemented as heap-resident objects.


Why is this useful?

This is actually a way too broad question: it enables you to do functional programming. It's built into the language at such a deep level, that you can't really do anything without using tons of closures everywhere.

Consider the following trivial example: printing the multiplication table of one-digit numbers:

for (x <- 1 to 9) {
  for (y <- 1 to 9) {
    printf("%3d", x * y)
  }
  println()
}

This produces:

  1  2  3  4  5  6  7  8  9
  2  4  6  8 10 12 14 16 18
  3  6  9 12 15 18 21 24 27
  4  8 12 16 20 24 28 32 36
  5 10 15 20 25 30 35 40 45
  6 12 18 24 30 36 42 48 54
  7 14 21 28 35 42 49 56 63
  8 16 24 32 40 48 56 64 72
  9 18 27 36 45 54 63 72 81

Those for-loops are actually desugared into the following method applications on the Range objects 1 to 9:

 (1 to 9) foreach { x => 
   (1 to 9) foreach { y =>
     printf("%3d", x * y)
   }
   println()
 }

What's the y => printf("%3d", x * y)-thing that is passed to the inner foreach? It is actually an expression with the free variable x. In order to print anything meaningful, it has to capture the value of x, which is in turn set to values 1 to 9 by the outer foreach. Thus, the inner foreach works with a closure of y => printf("%3d", x * y), where x is set to values 1 to 9.

Andrey Tyukin
  • 43,673
  • 4
  • 57
  • 93
  • I got a downvote also. Your answer is really neat and explains it gracefully. You deserve an upvote, good man. – nicodp May 13 '18 at 15:07
  • @nicodp Yeah, thx. It seems as if here and in the [linked question](https://stackoverflow.com/questions/12615091/why-is-the-following-scala-function-called-a-closure) there is quite a bit of confusion about what actually constitutes a closure. – Andrey Tyukin May 13 '18 at 15:20
  • would be nice if the answer gives link or explanation what FREE is. – soMuchToLearnAndShare Nov 20 '21 at 11:19
3

This is an open term:

def sumMore(x: Int): Int = x + more

as more appears as a free variable, i.e., its value does not have a binding inside sumMore's body. Hence, you get:

scala> def sumMore(x: Int): Int = x + more
<console>:7: error: not found: value more
       def sumMore(x: Int): Int = x + more
                                      ^

You make this open term a closed one, i.e., the act of closing, by providing a binding for more.

So instead if you have

val more: Int = 5
def sumMore(x: Int): Int = x + more

you'll have a closure as you are closing sumMore by providing a value for more.

You can see this pattern in nested functions:

def fib(n: Int): Int = {
  var num: Int = n
  def innerFib(prev: Int, actual: Int): (Int, Int) = {
    if (num == 0) (prev, prev)
    else if (num == 1) (prev, actual)
    else {
      num -= 1
      innerFib(actual, prev + actual)
    }
  }
  innerFib(0, 1)._2
}

Here, innerFib along with the variable num are a closure, as in the scope of fib's body, num is providing a binding for the free variable num that appears inside innerFib's body.

nicodp
  • 2,362
  • 1
  • 11
  • 20
  • Hm, those seem to be just (nested) methods, I don't see where the instantiation of an actual closure object would be necessary in those examples? – Andrey Tyukin May 12 '18 at 16:14
  • Not necessary, but if you refer to the `fib` example, I forced `num` to be a free name inside `innerFib`'s body, so this can be taken as a closure. – nicodp May 13 '18 at 15:10
  • The problem is that `innerFib` is nested method, not a function literal of the shape `(x1: T1, ..., xn: Tn) => body`. But the question referred explicitly to "function literals". Also, I couldn't quite pinpoint which part of your code would actually require the instantiation of a heap-resident closure object. Note that while closure-like objects can be used to implement nested methods, the closures are, strictly speaking, not required for the compilation of nested methods. Closures are strictly more powerful than nested methods. – Andrey Tyukin May 13 '18 at 15:19
  • 1
    Appel's "Mod Comp Impl", Chapter 6 is somewhat enlightening, quote: *"It is the combination of nested functions (where inner functions may use variables defined in the outer functions) and functions returned as results (or stored into variables) that causes local variables to need lifetimes longer than their enclosing function invocations. Pascal and Tiger have nested functions, but they do not have functions as returnable values. C has functions as returnable values, but not nested functions. So these languages can use stacks to hold local variables."*. So: no closures needed for nested fcts. – Andrey Tyukin May 13 '18 at 15:31
  • 1
    I made an ML implementation of Appel's Tiger :). Good to know that, thank you. I should have used another example (like a high order function as you did) to reflect this. – nicodp May 13 '18 at 15:35
  • So basically, taking my example, `num` will be set as a local variable on the stack of `fib`, that's why this will be always in the scope of `innerFib`, right? – nicodp May 13 '18 at 15:43
  • 1
    It could be that *the current implementation* actually does not keep `num` as local variable on the stack frame of `fib`, but instead indeed creates a closure-object that has `num` as mutable member. But it would be merely an ephemeral implementation detail in the current scala compiler. The nested methods do not require closures per-se: nested method like `innerFib` could also be implemented without heap-resident closure-objects, simply by using stack frames with a slightly more complex layout: the stack frame of the nested `innerFib` would have to hold pointr to stack frame of `fib`. – Andrey Tyukin May 13 '18 at 15:49