What are free variables?
Consider the function literal for curried multiplication:
(x: Int) => (y: Int) => x * y
If you consider only the subexpression e := (y: Int) => x * y
, then you can compute the set of free variables in e recursively as follows (starting from leafs x
and y
, and then building up the expression from its parts):
FREE(x) = {x}
because x
is a variable
FREE(y) = {y}
because y
is a variable
FREE(*) = {}
because *
is a built-in constant, for all practical purposes
FREE(x * y) = FREE(*) U FREE(x) U FREE(y) = {} U {x} U {y} = {x, y}
by recursive descent into subtrees
FREE( (y: Int) => x * y ) = FREE(x * y) - {y} = {x, y} - {y} = {x}
, because the variable y
is bound by the in-line function definition, and therefore the variable y
from FREE(x * y)
becomes no longer free.
Thus the variable y
is bound and x
is free in the expression (y: Int) => x * y
.
What are closures?
Now, what happens if you take the whole expression (x: Int) => (y: Int) => x * y
and apply it to some constant, 7
, say? What is returned?
The returned object is a closure of the expression (y: Int) => x * y
, where the free variable x
is set to value 7
. So, in a sense, the result of
((x: Int) => (y: Int) => x * y)(7)
is something like an object with a single method, that looks roughly like
class y_to_x_times_y_closure(x: Int) {
def apply(y: Int) = x * y
}
instantiated as y_to_x_times_y_closure(7)
. Note that the value 7
cannot reside in the call stack of some function, because you can easily produce a whole bunch of closures like
for (i <- 0 to 1000) yield ((x: Int) => (y: Int) => x * y)(i))
that will have to coexist simultaneously in the same thread, with different captured values of i
bound to the variable x
. Therefore, the closures in Scala are implemented as heap-resident objects.
Why is this useful?
This is actually a way too broad question: it enables you to do functional programming. It's built into the language at such a deep level, that you can't really do anything without using tons of closures everywhere.
Consider the following trivial example: printing the multiplication table of one-digit numbers:
for (x <- 1 to 9) {
for (y <- 1 to 9) {
printf("%3d", x * y)
}
println()
}
This produces:
1 2 3 4 5 6 7 8 9
2 4 6 8 10 12 14 16 18
3 6 9 12 15 18 21 24 27
4 8 12 16 20 24 28 32 36
5 10 15 20 25 30 35 40 45
6 12 18 24 30 36 42 48 54
7 14 21 28 35 42 49 56 63
8 16 24 32 40 48 56 64 72
9 18 27 36 45 54 63 72 81
Those for
-loops are actually desugared into the following method applications on the Range
objects 1 to 9
:
(1 to 9) foreach { x =>
(1 to 9) foreach { y =>
printf("%3d", x * y)
}
println()
}
What's the y => printf("%3d", x * y)
-thing that is passed to the inner foreach
? It is actually an expression with the free variable x
. In order to print anything meaningful, it has to capture the value of x
, which is in turn set to values 1
to 9
by the outer foreach
. Thus, the inner foreach
works with a closure of y => printf("%3d", x * y)
, where x
is set to values 1
to 9
.