Why variable used in lambda expression should be final or effectively final

Question

This question has been asked before over here, and my follow-up question about the reason was answered over here.
However, I have some doubts about that answer. It says:

Although other answers prove the requirement, they don't explain why the requirement exists.

The JLS mentions why in §15.27.2:

The restriction to effectively final variables prohibits access to dynamically-changing local variables, whose capture would likely introduce concurrency problems.

To lower the risk of bugs, they decided to ensure captured variables are never mutated. I am confused by the statement that it would lead to concurrency problems.

I read the article about concurrency problems on Baeldung, but I am still a bit confused about how capturing a changing local variable would cause concurrency problems. Can anybody help me out with an example? Thanks in advance.

Answers below explain why lambdas in Java "can't" refer to the local variables from enclosing scopes. But that exact feature _is_ found in some other programming languages. It's called a _lexical closure._ I don't have time to write a proper answer that would explain closures, but [the Wikipedia article](https://en.wikipedia.org/wiki/Closure_(computer_programming)) is a good place to start if you want to learn more. TLDR version: implementing closures is more work for the language designers, and it invites programmers who don't understand them to make new categories of mistakes. — Solomon Slow, Jun 23 '21 at 13:06

score 3 · Answer 1 · answered Jun 23 '21 at 06:17

When an instance of a lambda expression is created, any variables in the enclosing scope that it refers to are copied into it. Now suppose the original variable were allowed to change afterwards: the lambda would be working with a stale value held in its copy. Conversely, suppose the copy were modified inside the lambda: the value in the enclosing scope would not be updated, leaving an inconsistency. To prevent such situations, the language designers imposed this restriction. It probably made their lives easier too. A related answer for an anonymous inner class can be found here.

Another point is that you can pass the lambda expression around. If it escapes and a different thread executes it while the current thread is updating the same local variable, there will be concurrency issues too.
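
The stale-copy scenario can be sketched in code that compiles today by making the copy explicit. This is only an illustration: the class name `StaleCopyDemo` and the variable `snapshot` are made up for this example, and `snapshot` stands in for the copy that capture would make.

```java
import java.util.function.Supplier;

public class StaleCopyDemo {

    // Returns what the lambda sees after the original variable has changed.
    static int readThroughLambda() {
        int counter = 0;

        // Explicit stand-in for the copy made at capture time (assumption for illustration).
        final int snapshot = counter;
        Supplier<Integer> reader = () -> snapshot;

        counter = 5; // legal: 'counter' itself is never captured by the lambda

        return reader.get(); // still the value copied when the lambda was created
    }

    public static void main(String[] args) {
        System.out.println(readThroughLambda()); // prints 0
    }
}
```

If `counter` could be captured directly and then reassigned, a reader of the source would reasonably expect 5 here, which is the kind of surprise the restriction avoids.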

I think this answer speaks to the real issue: programmers will not understand that a copy has been made, and will expect the lambda to see changes made after the copy was made, and expect changes made inside the lambda to be made on the original local variable. I argue that there is no technical reason the outer local variable need be effectively final, nor the captured copy. The restriction is in place *only* to prevent mistakes coming from not understanding that a copy of the local variable has been made. — Jesse, May 22 '23 at 06:35

Slaw · Accepted Answer · 2021-06-23T07:21:33.080

I'd like to preface this answer by saying what I show below is not actually how lambdas are implemented. The actual implementation involves java.lang.invoke.LambdaMetafactory if I'm not mistaken. My answer makes use of some inaccuracies to better demonstrate the point.

Let's say you have the following:

public static void main(String[] args) {
  String foo = "Hello, World!";
  Runnable r = () -> System.out.println(foo);
  r.run();
}

Remember that a lambda expression is shorthand for declaring an implementation of a functional interface. The lambda body is the implementation of the single abstract method of said functional interface. At run-time an actual object is created. So the above results in an object whose class implements Runnable.

Now, the above lambda body references a local variable from the enclosing method. The instance created as a result of the lambda expression "captures" the value of that local variable. It's almost (but not really) like you have the following:

public static void main(String[] args) {
  String foo = "Hello, World!";

  final class GeneratedClass implements Runnable {
    
    private final String generatedField;

    private GeneratedClass(String generatedParam) {
      generatedField = generatedParam;
    }

    @Override
    public void run() {
      System.out.println(generatedField);
    }
  }

  Runnable r = new GeneratedClass(foo);
  r.run();
}

And now it should be easier to see the problems with supporting concurrency here:

Local variables are not considered "shared variables". This is stated in §17.4.1 of the Java Language Specification:

Memory that can be shared between threads is called shared memory or heap memory.

All instance fields, static fields, and array elements are stored in heap memory. In this chapter, we use the term variable to refer to both fields and array elements.

Local variables (§14.4), formal method parameters (§8.4.1), and exception handler parameters (§14.20) are never shared between threads and are unaffected by the memory model.

In other words, local variables are not covered by the concurrency rules of Java and cannot be shared between threads.

At a source code level you only have access to the local variable. You don't see the generated field.

I suppose Java could be designed so that modifying the local variable inside the lambda body only writes to the generated field, and modifying the local variable outside the lambda body only writes to the local variable. But as you can probably imagine that'd be confusing and counterintuitive. You'd have two variables that appear to be one variable based on the source code. And what's worse those two variables can diverge in value.
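
To make that divergence concrete, here is a rough sketch of the hypothetical design using a hand-written class with a mutable field. Nothing here is what the compiler generates; `DivergingCopyDemo`, `MutableCapture` and `capturedFoo` are names invented for the illustration.

```java
public class DivergingCopyDemo {

    // Hand-written stand-in for the generated class, but with a mutable field (hypothetical).
    static final class MutableCapture implements Runnable {
        String capturedFoo;

        MutableCapture(String foo) {
            capturedFoo = foo;
        }

        @Override
        public void run() {
            capturedFoo = "Goodbye, World!"; // only the copy changes
        }
    }

    // Returns { value of the local variable, value of the copy } after run() was called.
    static String[] runAndCompare() {
        String foo = "Hello, World!";
        MutableCapture r = new MutableCapture(foo);
        r.run();
        return new String[] { foo, r.capturedFoo };
    }

    public static void main(String[] args) {
        String[] values = runAndCompare();
        System.out.println(values[0]); // prints Hello, World!
        System.out.println(values[1]); // prints Goodbye, World!
    }
}
```

In lambda syntax both of those values would be spelled `foo` in the source, which is exactly the confusion described above.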

The other option is to have no generated field. But consider the following:

public static void main(String[] args) {
  String foo = "Hello, World!";
  Runnable r = () -> {
    foo = "Goodbye, World!"; // won't compile
    System.out.println(foo);
  };
  new Thread(r).start();
  System.out.println(foo);
}

What is supposed to happen here? If there is no generated field then the local variable is being modified by a second thread. But local variables cannot be shared between threads. Thus this approach is not possible, at least not without a likely non-trivial change to Java and the JVM.

So, as I understand it, the designers put in the rule that the local variable must be final or effectively final in this context in order to avoid concurrency problems and confusing developers with esoteric problems.
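
If you really do need a value that a lambda on another thread can change, one common workaround is to keep the local variable effectively final and let it refer to a heap object, since fields and array elements are covered by the memory model quoted above. The sketch below uses `java.util.concurrent.atomic.AtomicReference` for that; the class name `HeapHolderDemo` and the method name are made up, and this is just one possible approach, not the only one.

```java
import java.util.concurrent.atomic.AtomicReference;

public class HeapHolderDemo {

    // Lets a second thread replace the value, then reads it back on the calling thread.
    static String updateFromOtherThread() {
        // 'foo' is effectively final; only the heap object it refers to changes.
        AtomicReference<String> foo = new AtomicReference<>("Hello, World!");

        Thread worker = new Thread(() -> foo.set("Goodbye, World!"));
        worker.start();
        try {
            worker.join(); // wait for the worker so its write has happened before we read
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return foo.get();
    }

    public static void main(String[] args) {
        System.out.println(updateFromOtherThread()); // prints Goodbye, World!
    }
}
```

The reference `foo` is copied into the lambda just like before, but both copies point at the same heap object, so there is only one value to reason about.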

+1 A far more detailed answer than the others. I was surprised to see a class with a private constructor instantiated outside the class; I have only seen private constructors used in the singleton pattern. I guess that as long as the class (nested or even top-level) is in scope, a private constructor can be used to instantiate it. Very well done in the answer. — VIAGC, Jun 23 '21 at 09:02
Enclosing classes can access private members of the enclosed class in Java. The above `GeneratedClass` is a local class which means it's enclosed by the class of the `main(String[])` method (note I omitted the top-level class for brevity). And I made the constructor private because that's what reflection told me when I was inspecting the class of `r` (from `Runnable r = () -> System.out.println(foo)`). — Slaw, Jun 23 '21 at 09:22
Though just to reiterate, local classes are not the actual mechanism with which lambdas are implemented. I only used a local class in the answer because it's easier to see and effectively demonstrates the problem. But it's a little more complicated what actually happens under-the-hood and I don't fully understand it all. If you're interested [this article](https://dzone.com/articles/how-lambdas-and-anonymous-inner-classesaic-work) and [this article](https://www.infoq.com/articles/Java-8-Lambdas-A-Peek-Under-the-Hood/) go into a little more detail regarding how they're implemented in OpenJDK. — Slaw, Jun 23 '21 at 09:28
Well, regardless of how they are implemented, the main point of having their own copy of the local variable remains, which is the most important. The alternative is how C# does it, which boils down to have all accesses to the “local variable” being converted to accesses to a heap variable. So then, you might have local variables that aren’t actually local variables… — Holger, Jun 23 '21 at 11:43

Nikolas Charalambidis · Answer 3 · 2021-06-23T05:56:42.810

It is for the same reason that anonymous classes require the variables they use from the enclosing scope to be read-only, i.e. final or effectively final.

final int finalInt = 0;
int effectivelyFinalInt = 0;
int brokenInt = 0;
brokenInt = 0;

Supplier<Integer> supplier = new Supplier<Integer>() {
    @Override
    public Integer get() {
        return finalInt;                        // compiles
        // return effectivelyFinalInt;          // would also compile
        // return brokenInt;                    // would not compile
    }
};

Lambda expressions are only shortcuts for instances implementing the interface with only one abstract method (@FunctionalInterface).

Supplier<Integer> supplier1 = () -> finalInt;             // compiles
Supplier<Integer> supplier2 = () -> effectivelyFinalInt;  // compiles
Supplier<Integer> supplier3 = () -> brokenInt;            // doesn't compile

I struggled to find passages in the Java Language Specification that support my statements below; however, they follow logically:

Note that evaluation of a lambda expression produces an instance of a functional interface.
Note that instantiating an interface requires implementing all its abstract methods. Doing so as an expression produces an anonymous class.
Note that an anonymous class is always an inner class.
Each inner class can access only final or effectively-final variables outside of its scope: Accessing Members of an Enclosing Class

In addition, a local class has access to local variables. However, a local class can only access local variables that are declared final. When a local class accesses a local variable or parameter of the enclosing block, it captures that variable or parameter.

+1 But I don't think this really answers the question, I asked how can it cause concurrency problems. You did it for anonymous classes but still why that variable has to be final or effectively final? — VIAGC, Jun 23 '21 at 05:38
@ThunderKnight My understanding: The lambda captures the variable but the object (created by the lambda) can escape the local context and potentially be acted on by multiple threads. Local variables are not part of the concurrency model of Java, as far as I know. So letting them be modified by multiple threads is not an option (or at least not a trivial change to the language). — Slaw, Jun 23 '21 at 05:54
Isn't a non-static nested class called inner class, if anonymous class is static, then won't it be a nested anonymous class. As not all nested classes are inner classes. — VIAGC, Jun 23 '21 at 06:02
@NikolasCharalambidis Can you kindly further explain what `The lambda captures the variable but the object (created by the lambda) ` means? — VIAGC, Jun 23 '21 at 06:09
@Thunder Note that was my comment, not Nikolas's. By "capture" I simply mean the value of the variable is copied to the object created by the lambda expression at run-time. — Slaw, Jun 23 '21 at 06:24
