
I'm reading this fantastic article about Lambda Expressions and the following is unclear to me:

  1. Does a lambda expression save the values of its free variables, or a reference/pointer to each of them? (I guess the answer is the latter, because otherwise mutating free variables would be valid.)

Don't count on the compiler to catch all concurrent access errors. The prohibition against mutation holds only for local variables.

I'm not sure that experimenting on my own would cover all the cases, so I'm looking for well-defined rules about:

  2. Which free variables can be mutated inside the lambda expression (static fields/properties/local variables/parameters), and which can be mutated outside while being used inside a lambda expression?
  3. Can I mutate any free variable after the end of the lambda expression's block, after I have used it (read it or called one of its methods) inside the lambda expression?

Don't count on the compiler to catch all concurrent access errors. The prohibition against mutation holds only for local variables. If matches is an instance or static variable of an enclosing class, then no error is reported, even though the result is just as undefined.

  4. Is the result of the mutation undefined even when I use a synchronization algorithm?

Update 1:

free variables - that is, the variables that are not parameters and not defined inside the code.

In simple words, can I conclude that free variables are all the variables that are not parameters of the lambda expression and are not defined inside that same lambda expression?

Stav Alfi

2 Answers


Your term “free variables” is misleading at best. If you’re not talking about local variables (which must be effectively final to be captured), you are talking about heap variables.

Heap variables might be instance fields, static fields or array elements. For unqualified access to instance variables from the surrounding context, the lambda expression may (and will) access them via the captured this reference. For other instance fields, as well as array elements, you need explicit access via a variable anyway, so it’s clear how the heap variable will be accessed. Only static fields are accessed directly.
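
As a minimal sketch of these three access paths (the class `HeapAccessDemo` and all of its members are hypothetical names made up for illustration, not taken from the question):

```java
import java.util.function.IntSupplier;

// Hypothetical class for illustration only.
class HeapAccessDemo {
    static int staticCounter = 0;   // static field: accessed directly
    int instanceCounter = 0;        // instance field: accessed via the captured `this`

    IntSupplier instanceReader() {
        // Unqualified `instanceCounter` means `this.instanceCounter`,
        // so the lambda captures `this`, not the current field value.
        return () -> instanceCounter;
    }

    static IntSupplier arrayReader(int[] cells) {
        // The array reference (a local variable) is captured;
        // the element itself is a heap variable read on each call.
        return () -> cells[0];
    }

    public static void main(String[] args) {
        HeapAccessDemo demo = new HeapAccessDemo();
        IntSupplier readInstance = demo.instanceReader();
        demo.instanceCounter = 42;              // changed after the lambda was created
        System.out.println(readInstance.getAsInt());

        int[] cells = {1};
        IntSupplier readCell = arrayReader(cells);
        cells[0] = 7;                           // element changed after capture
        System.out.println(readCell.getAsInt());

        Runnable bump = () -> staticCounter++;  // compiles: not a captured local variable
        bump.run();
        System.out.println(staticCounter);
    }
}
```

In this sketch each lambda observes the later modification, because only `this` or the array reference was captured, never the value of the heap variable itself.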

The rules are simple: unless they are declared final, you can modify all of them, inside or outside the lambda expression. Keep in mind that lambda expressions can call arbitrary methods containing arbitrary code anyway. Whether this causes problems depends on how you use the lambda expressions. You can even create problems with functions that do not directly modify a variable, without any concurrency, e.g.

ArrayList<String> list=new ArrayList<>(Arrays.asList("foo", "bar"));
list.removeIf(s -> list.remove("bar"));

may throw a java.util.ConcurrentModificationException due to the list modification in an ongoing iteration.

Likewise, modifying a variable or resource in a concurrent context might break it, even if you made sure that the modification of the variable itself has been done in a thread-safe manner. It’s all about the contracts of the API you are using.

Most notably, when using parallel Streams, you have to be aware that functions are not only evaluated by different threads, they are also evaluating arbitrary elements of the Stream, regardless of their encounter order. For the final result of the Stream processing, the implementation will assemble partial results in a way that reestablishes the encounter order, if necessary, but the intermediate operations evaluate the elements in an arbitrary order, hence your functions must not only be thread safe, but also not rely on a particular processing order. In some cases, they may even process elements not contributing to the final result.
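
A small sketch of the difference, assuming a hypothetical class `ParallelOrderDemo` (both method names are invented here): the first method pushes results into a shared list through a side effect, the second lets the Stream assemble the result.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical class for illustration only.
class ParallelOrderDemo {
    // Side-effecting version: the synchronized wrapper makes add() thread safe,
    // but the insertion order depends on how the threads are scheduled.
    static List<Integer> collectViaSideEffect(int n) {
        List<Integer> sink = Collections.synchronizedList(new ArrayList<>());
        IntStream.range(0, n).parallel().map(i -> i * 2).boxed().forEach(sink::add);
        return sink;
    }

    // Side-effect-free version: partial results are merged in encounter order.
    static List<Integer> collectViaCollector(int n) {
        return IntStream.range(0, n).parallel().map(i -> i * 2).boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(collectViaSideEffect(10)); // all ten values, order not guaranteed
        System.out.println(collectViaCollector(10));  // encounter order restored
    }
}
```

Both methods produce the same elements; only the collector version gives a predictable order, which is one reason side-effect-free functions are usually the easier choice.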

Since your bullet 3 refers to “after the end of a block”, I want to emphasize that it is irrelevant at which place inside your lambda expression the modification (or perceivable side effect) happens.

Generally, you are better off with functions not having such side effects. But this doesn’t imply that they are forbidden in general.

Holger

This looks like complicated "words" on a simpler topic. The rules are pretty much the same as for anonymous classes.

For example the compiler catches this:

 int x = 3;

 Runnable r = () -> {
    x = 6; // Local variable x defined in an enclosing scope must be final or effectively final
 };

But at the same time it is perfectly legal to do this (from the compiler's point of view):

    final int x[] = { 0 };

    Runnable r = () -> {
        x[0] = 6;
    };

The example that you provided, which uses matches:

    List<Path> matches = new ArrayList<>();
    List<Path> files = List.of();
    for (Path p : files) {
        new Thread(() -> {
            if (1 == 1) {
                matches.add(p);
            }
        }).start();
    }

has the same problem. The compiler does not complain about you modifying matches (because you are not changing the reference matches, so it is effectively final), but at the same time this can have undefined results. This operation has side effects and is discouraged in general. The undefined results come from the fact that matches is an ArrayList, which is not a thread-safe collection.

And your last point: "Does the result of the mutation is undefined even when I use a synchroniziton algorithm?" Of course not. With proper synchronization, updating a variable outside the lambda (or a stream) will work, but it is discouraged, mainly because there are usually other ways to achieve the same result.
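
A minimal sketch of a synchronized variant of the matches example, under stated assumptions: the class `SynchronizedMatchesDemo` is hypothetical, and "even numbers below n" stands in for whatever matching condition the real code would use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration only.
class SynchronizedMatchesDemo {
    // Collects the even numbers below n, one thread per candidate,
    // mirroring the structure of the matches example.
    static List<Integer> findMatches(int n) {
        List<Integer> matches = Collections.synchronizedList(new ArrayList<>());
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            final int candidate = i;          // copy, so the lambda captures a final local
            Thread t = new Thread(() -> {
                if (candidate % 2 == 0) {
                    matches.add(candidate);   // thread-safe add on the synchronized list
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();                     // wait for every worker before reading the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Integer> result = findMatches(20);
        System.out.println(result.size() + " matches, in no particular order: " + result);
    }
}
```

Even here the element order is not predictable; only the set of elements is, and only because the code waits for all threads with `join()` before reading the list.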

EDIT

OK, so free variables are those that are not defined within the lambda code itself or are not the parameters of the lambda itself.

In this case the answer to 1) would be: lambda expressions are de-sugared to methods and the rules for free variables are the same as for anonymous classes. This has been discussed numerous times, like here. This actually answers the second question as well, since the rules are the same. Obviously anything that is final or effectively final can be captured. For primitives, this means they can't be mutated; for objects, you can't change the references (but you can change the underlying data, as shown in my example). As for 3): yes.
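
To illustrate the difference between changing a captured reference and changing the object behind it, here is a small sketch (the class `CaptureDemo` and its variable names are hypothetical):

```java
import java.util.function.Supplier;

// Hypothetical class for illustration only.
class CaptureDemo {
    static String demo() {
        StringBuilder log = new StringBuilder("a"); // effectively final reference
        int snapshot = 1;                           // effectively final primitive

        // The lambda works with the captured reference `log` and the captured value `snapshot`.
        Supplier<String> view = () -> log.toString() + snapshot;

        log.append("b");   // allowed: mutates the object, not the variable

        // Uncommenting either line makes the variable no longer effectively final,
        // and the lambda above would then fail to compile:
        // log = new StringBuilder();
        // snapshot++;

        return view.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The supplier sees the appended "b" because it holds the same `StringBuilder` reference, while the primitive can never differ from the value it had at capture time.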

Eugene
  • The term “undefined” is itself not very well defined. As explained in my answer, you can be thread safe with your variable updates, so your results are not as undefined as “`C++` undefined behavior”, but still have unpredictable wrong results. So for your example, changing `matches` to use `Vector` instead of `ArrayList` would make the modification thread safe, still, the result would be undefined, in the sense of having no predictable order, and without an explicit waiting for completion, you even can’t assume to see all matching elements. – Holger May 08 '17 at 12:18
  • @Holger absolutely yes, thank you for the comment I just could not find this exceptionally good wording.. :( – Eugene May 08 '17 at 12:29
  • First of all thank you but I can't find where you answered any of my questions except the last one. – Stav Alfi May 08 '17 at 12:54
  • @StavAlfi I've tried to answer all of them... See EDIT – Eugene May 08 '17 at 13:11
  • Okay thanks, your reference to Jon Skeet's answer was very helpful. For question 3: I can't mutate a non-final primitive after the lambda if I read it inside the lambda. @Holger Thanks for the comment, my last question was whether it's undefined because of concurrent mutation or in the sense of C++ undefined behavior. – Stav Alfi May 08 '17 at 14:30