92

I came across a strange situation where using a parallel stream with a lambda in a static initializer takes seemingly forever with no CPU utilization. Here's the code:

import java.util.stream.IntStream;

class Deadlock {
    static {
        IntStream.range(0, 10000).parallel().map(i -> i).count();
        System.out.println("done");
    }
    public static void main(final String[] args) {}
}

This appears to be a minimal reproducing test case for this behavior. If I:

  • put the block in the main method instead of a static initializer,
  • remove parallelization, or
  • remove the lambda,

the code instantly completes. Can anyone explain this behavior? Is it a bug or is this intended?

I am using OpenJDK version 1.8.0_66-internal.

Reinstate Monica
  • Reproducible with Oracle 1.8.0_66. – Kayaman Jan 15 '16 at 21:32
  • 4
    With range (0, 1) the program terminates normally. With (0, 2) or higher hangs. – Laszlo Hirdi Jan 15 '16 at 21:53
  • 5
    similar question: http://stackoverflow.com/questions/34222669/invokeandwait-with-lambda-expression-hangs-forever-in-static-initializer – Alex - GlassEditor.com Jan 15 '16 at 23:06
  • 2
    Actually it is exactly the same question/issue, just with a different API. – Didier L Jan 15 '16 at 23:15
  • 3
    You are trying to use a class, in a background thread, when you haven't finished initialising the class so it can't be used in a background thread. – Peter Lawrey Jan 15 '16 at 23:16
  • 1
    @PeterLawrey Put that way it sounds obvious, but it isn't at all obvious that the lambda needs to use the outer class. – Reinstate Monica Jan 16 '16 at 00:16
  • 5
    @Solomonoff'sSecret as `i -> i` is not a method reference, it is a `static method` implemented in the Deadlock class. If you replace `i -> i` with `IntUnaryOperator.identity()` this code should be fine. – Peter Lawrey Jan 16 '16 at 00:20
  • @PeterLawrey `i -> i` becomes an implementation of a functional interface that behaves as a real object. Doesn't that mean it needs to be invoked polymorphically and if so, how can it simply be a static method? Doesn't it at least need to have a method that implements the method of the functional interface which in the current implementation delegates to the static method? Or is there some magic happening behind the scenes? – Reinstate Monica Jan 16 '16 at 00:26
  • 1
    @Solomonoff'sSecret There is a class generated at runtime with one method which calls the static method in the class you have defined. – Peter Lawrey Jan 16 '16 at 14:24
  • Related to [Are Java static initializers thread safe?](https://stackoverflow.com/questions/878577/are-java-static-initializers-thread-safe); – Raedwald Nov 02 '18 at 13:34
  • I can't understand why it sometimes hangs and sometimes doesn't. I always run the app with the same arguments. – gstackoverflow Dec 10 '18 at 17:23

3 Answers

74

I found a bug report of a very similar case (JDK-8143380) which was closed as "Not an Issue" by Stuart Marks:

This is a class initialization deadlock. The test program's main thread executes the class static initializer, which sets the initialization in-progress flag for the class; this flag remains set until the static initializer completes. The static initializer executes a parallel stream, which causes lambda expressions to be evaluated in other threads. Those threads block waiting for the class to complete initialization. However, the main thread is blocked waiting for the parallel tasks to complete, resulting in deadlock.

The test program should be changed to move the parallel stream logic outside of the class static initializer. Closing as Not an Issue.
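A minimal sketch of that advice, applied to the question's test case. The class name `DeferredStream` and the method `countInParallel` are made up for illustration and do not come from the bug report. The parallel stream moves out of the static initializer into an ordinary static method, so when it is called from `main` the class has already finished initializing and the worker threads have nothing to wait for.

```java
import java.util.stream.IntStream;

// Hypothetical rewrite of the question's test case; the names are illustrative.
class DeferredStream {
    // Called from main (not from a static initializer), so DeferredStream is
    // already fully initialized when worker threads execute the lambda body.
    static long countInParallel() {
        return IntStream.range(0, 10000).parallel().map(i -> i).count();
    }

    public static void main(String[] args) {
        System.out.println(countInParallel());
        System.out.println("done");
    }
}
```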


I also found another bug report for the same problem (JDK-8136753), likewise closed as "Not an Issue" by Stuart Marks:

This is a deadlock that is occurring because the Fruit enum's static initializer is interacting badly with class initialization.

See the Java Language Specification, section 12.4.2 for details on class initialization.

http://docs.oracle.com/javase/specs/jls/se8/html/jls-12.html#jls-12.4.2

Briefly, what's happening is as follows.

  1. The main thread references the Fruit class and starts the initialization process. This sets the initialization in-progress flag and runs the static initializer on the main thread.
  2. The static initializer runs some code in another thread and waits for it to finish. This example uses parallel streams, but this has nothing to do with streams per se. Executing code in another thread by any means, and waiting for that code to finish, will have the same effect.
  3. The code in the other thread references the Fruit class, which checks the initialization in-progress flag. This causes the other thread to block until the flag is cleared. (See step 2 of JLS 12.4.2.)
  4. The main thread is blocked waiting for the other thread to terminate, so the static initializer never completes. Since the initialization in-progress flag isn't cleared until after the static initializer completes, the threads are deadlocked.

To avoid this problem, make sure that a class's static initialization completes quickly, without causing other threads to execute code that requires this class to have completed initialization.

Closing as Not an Issue.
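The four steps above can be observed without streams or lambdas. The following is only a sketch under assumptions: the names `InitLockDemo`, `Slow` and `ping` are hypothetical, and the static initializer uses a bounded `join(500)` instead of an unbounded wait so that the program terminates rather than deadlocking. If the helper thread is still alive after the bounded wait, it is stuck at the point where it first needs the class that is still being initialized.

```java
// Hypothetical demo; none of these names appear in the bug report.
class InitLockDemo {

    static class Slow {
        static volatile boolean helperBlocked;

        static {
            // Step 2: run code in another thread from the static initializer.
            Thread helper = new Thread(new Runnable() {
                @Override
                public void run() {
                    // Step 3: calling a static method requires Slow to be
                    // initialized, so this thread waits for the initializer.
                    Slow.ping();
                }
            });
            helper.setDaemon(true);
            helper.start();
            try {
                // Bounded wait. An unbounded helper.join() here would be step 4:
                // the initializer never completes and both threads hang.
                helper.join(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            helperBlocked = helper.isAlive();
        }

        static void ping() {
        }
    }

    // Step 1: the first access to Slow triggers its initialization.
    static boolean helperWasBlocked() {
        return Slow.helperBlocked;
    }

    public static void main(String[] args) {
        System.out.println("helper blocked during init: " + helperWasBlocked());
    }
}
```

Once the initializer returns, the helper thread is released and finishes normally, which is exactly what cannot happen when the initializer waits for it without a timeout.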


Note that FindBugs has an open issue for adding a warning for this situation.

Tunaki
  • 21
    _"This was considered when we designed the feature"_ and _"We know what causes this bug but not how to fix it"_ do **not** mean _"this is not a bug"_. This is absolutely a bug. – BlueRaja - Danny Pflughoeft Jan 16 '16 at 00:39
  • so, no lambda in static initializers? This smells like a bug and stings like a bug. – ZhongYu Jan 16 '16 at 01:09
  • 14
    @bayou.io The main issue is using threads within static initializers, not lambdas. – Stuart Marks Jan 16 '16 at 23:31
  • 6
    BTW Tunaki thanks for digging up my bug reports. :-) – Stuart Marks Jan 16 '16 at 23:32
  • @StuartMarks - How can you blame people for using parallel stream? That's the biggest selling point of `Stream`, at least from Goetz. The thing is, once you hand off a functional object to another API, you can't be sure how it is gonna be used. If the function is created by an anonymous class that is coupled to the class being initialized, an alarm instantly sounds off. However, in this case of lambda, the coupling is implicit and incidental; and that is the main pitfall here. – ZhongYu Jan 17 '16 at 03:50
  • Secondly, such deadlock can exist without "using threads". Imagine the static initializer of class `A` registers a lambda to a registry; the lambda seems self-contained and immediately usable. Now anyone else (on other threads) trying to use the lambda could trigger deadlocks. – ZhongYu Jan 17 '16 at 03:53
  • 1
    There is absolutely no problem that we have to live with some limitations of implementations; everybody would be understanding of that. But I don't think it is right, in this particular case, to simply chalk it up as a non-issue, and blame all use cases that reveal the problem. – ZhongYu Jan 17 '16 at 03:56
  • 15
    @bayou.io: it’s the same thing on class level as it would be in a constructor, letting `this` escape during object construction. The basic rule is, don’t use multi-threaded operations in initializers. I don’t think that this is hard to understand. Your example of registering a lambda implemented function into a registry is a different thing, it doesn’t create deadlocks unless you are going to wait for one of these blocked background threads. Nevertheless, I strongly discourage doing such operations in a class initializer. It’s not what they are meant for. – Holger Jan 18 '16 at 12:07
  • 11
    I guess the programming style lesson is: keep static initializers simple. – Raedwald Jan 19 '16 at 08:20
20

For those wondering where the other threads reference the Deadlock class itself, Java lambdas behave as if you had written this:

import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;

public class Deadlock {
    public static int lambda1(int i) {
        return i;
    }
    static {
        IntStream.range(0, 10000).parallel().map(new IntUnaryOperator() {
            @Override
            public int applyAsInt(int operand) {
                return lambda1(operand);
            }
        }).count();
        System.out.println("done");
    }
    public static void main(final String[] args) {}
}
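One way to check where the lambda body ends up is reflection. This is a sketch, not something the language specification guarantees: the class name `LambdaHost` is hypothetical, and the synthetic method that `javac` generates (typically with a name starting with `lambda$`) is an implementation detail that other compilers may handle differently.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;

// Hypothetical class used only to inspect what the compiler generated.
class LambdaHost {
    // Same trivial lambda as in the question, written in a static context.
    static final IntUnaryOperator IDENTITY = i -> i;

    // Collects the names of compiler-generated static methods declared
    // directly in LambdaHost.
    static List<String> syntheticStaticMethods() {
        List<String> names = new ArrayList<>();
        for (Method m : LambdaHost.class.getDeclaredMethods()) {
            if (m.isSynthetic() && Modifier.isStatic(m.getModifiers())) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(syntheticStaticMethods());
    }
}
```

If the list is non-empty, the lambda body lives in the host class itself, which is why a thread that runs the lambda needs that class to be initialized.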

With regular anonymous classes there is no deadlock:

import java.util.function.IntUnaryOperator;
import java.util.stream.IntStream;

public class Deadlock {
    static {
        IntStream.range(0, 10000).parallel().map(new IntUnaryOperator() {
            @Override
            public int applyAsInt(int operand) {
                return operand;
            }
        }).count();
        System.out.println("done");
    }
    public static void main(final String[] args) {}
}
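Another way to break the dependency, if the lambda does not need any members of the outer class, is to write it in a separate class. The sketch below uses hypothetical names (`HolderWorkaround`, `Work`, `sumInParallel`) and assumes the lambda body is compiled into the class where it is written, as described above. It uses `sum()` rather than `count()` so the mapped values are actually consumed.

```java
import java.util.stream.IntStream;

// Hypothetical workaround; the names are illustrative only.
class HolderWorkaround {
    static final int SUM;

    static {
        // No lambda is written in HolderWorkaround itself, so worker threads
        // never need this class while it is still being initialized.
        SUM = Work.sumInParallel();
        System.out.println("done");
    }

    // The lambda body is compiled into Work, which has no static initializer
    // of its own and is fully initialized before the stream starts.
    static class Work {
        static int sumInParallel() {
            return IntStream.range(0, 10000).parallel().map(i -> i).sum();
        }
    }
}
```

This only sidesteps the specific coupling shown here. Waiting on other threads inside a static initializer remains fragile, since any code those threads run that touches the class being initialized will block again.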
chwarr
Tamas Hegedus
  • This is so very strange. Could you provide a citation or an explanation as to why lambdas behave that way? I always thought they were equivalent to an anonymous class but you are right that an anonymous class doesn't deadlock. – Reinstate Monica Jan 15 '16 at 21:59
  • @Solomonoff'sSecret I buried myself in the [spec](https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.27), but didn't find anything relevant. – Tamas Hegedus Jan 15 '16 at 22:27
  • 7
    @Solomonoff'sSecret It's an implementation choice. The code in the lambda has to go somewhere. Javac compiles it into a static method in the containing class (analogous to `lambda1` in this example). Putting each lambda into its own class would have been considerably more expensive. – Stuart Marks Jan 15 '16 at 23:30
  • 2
    @StuartMarks Given that the lambda creates a class implementing the functional interface, wouldn't it be just as efficient to put the implementation of the lambda in the implementation of the functional interface's lambda as in the second example of this post? That's certainly the obvious way to do things but I'm sure there's a reason why they're done the way they are. – Reinstate Monica Jan 16 '16 at 00:20
  • 6
    @Solomonoff'sSecret The lambda might create a class at runtime (via [java.lang.invoke.LambdaMetafactory](https://docs.oracle.com/javase/8/docs/api/java/lang/invoke/LambdaMetafactory.html)), but the lambda body must be placed somewhere at compile time. The lambda classes can thus take advantage of some VM magic to be less expensive than normal classes loaded from .class files. – Jeffrey Bosboom Jan 16 '16 at 02:04
  • 1
    @Solomonoff'sSecret Yes, Jeffrey Bosboom's reply is correct. If in a future JVM it becomes possible to add a method to an existing class, the metafactory might do that instead of spinning a new class. (Pure speculation.) – Stuart Marks Jan 16 '16 at 02:38
  • @StuartMarks Perhaps a better implementation would be to create a class Deadlock$Lambdas with static methods for all the lambdas. Then the lambdas wouldn't depend on Deadlock. Granted the benefit would be extremely slim and this implementation might have a speed/memory penalty due to increasing the number of classes. – Reinstate Monica Jan 16 '16 at 16:19
  • @Solomonoff'sSecret -- That is only needed for lambdas created during static initialization. Unfortunately, that is a runtime property; `javac` can't find all such lambdas. At this point, it falls on java programmers to be aware of this issue, and manually add a separate class to work around the problem. – ZhongYu Jan 16 '16 at 18:09
  • @bayou.io The idea was not to use static analysis to determine which lambdas are used in static initializers. The idea was that *all* lambdas (say, defined in a static context) would have their implementations in a separate class. But I suppose that isn't viable in general because the lambda could use a field in the original static class, so the lambda may need to be able to see the static class in which it's written. – Reinstate Monica Jan 16 '16 at 18:31
  • @Solomonoff'sSecret -- non-static code can be invoked during static initialization too. It is a tricky phase that requires careful reasoning. – ZhongYu Jan 16 '16 at 19:16
  • @bayou.io Right, just like a constructor can view an uninitialized field by invoking a method. However, the compiler does take *partial* measures to protect you against that. I'm fine with the current behavior but it is unintuitive so if there were some easy way to prevent most errors with no collateral damage, it would be nice. – Reinstate Monica Jan 16 '16 at 19:19
  • 4
    @Solomonoff's Secret: don’t judge by looking at such trivial lambda expressions like your `i -> i`; they won’t be the norm. Lambda expressions may use all members of their surrounding class, including `private` ones, and that makes the defining class itself their natural place. Letting all these use cases suffer from an implementation optimized for the special case of class initializers with multi-threaded use of trivial lambda expressions, not using members of their defining class, is not a viable option. – Holger Jan 18 '16 at 12:17
  • @Holger You are right, my suggestion isn't workable in general. – Reinstate Monica Jan 18 '16 at 14:27
  • On my PC, the first snippet sometimes leads to a deadlock and sometimes doesn't. Is that expected behaviour? – gstackoverflow Dec 10 '18 at 13:38
  • @gstackoverflow As far as I remember the first snippet caused a deadlock for me every time, but it is not guaranteed; it depends on the implementation of parallel streams. If you need more specifics I might look into it a bit more for you. – Tamas Hegedus Dec 12 '18 at 18:09
18

There is an excellent explanation of this problem by Andrei Pangin, dated 07 Apr 2015. It is available here, but it is written in Russian (I suggest reviewing the code samples anyway; they are international). The general problem is a lock held during class initialization.

Here are some quotes from the article:


According to the JLS, every class has a unique initialization lock that is acquired during initialization. When another thread tries to access the class during initialization, it blocks on the lock until initialization completes. When classes are initialized concurrently, a deadlock is possible.

I wrote a simple program that calculates the sum of integers. What should it print?

import java.util.stream.IntStream;

public class StreamSum {
    static final int SUM = IntStream.range(0, 100).parallel().reduce((n, m) -> n + m).getAsInt();

    public static void main(String[] args) {
        System.out.println(SUM);
    }
} 

Now remove parallel() or replace the lambda with an Integer::sum call. What will change?

Here we see a deadlock again [there were some examples of deadlocks in class initializers previously in the article]. Because of parallel(), the stream operations run in a separate thread pool. These threads try to execute the lambda body, which is compiled to bytecode as a private static method inside the StreamSum class. But this method cannot be executed before the class static initializer completes, and the initializer is waiting for the stream to finish.
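For comparison, here is a sketch of the `Integer::sum` variant the article asks about. The class name `StreamSumMethodRef` is hypothetical. The assumption, consistent with the explanation above, is that a method reference to `Integer.sum` leaves no synthetic method in the enclosing class, so worker threads call `Integer.sum` directly and do not wait for `StreamSumMethodRef` to finish initializing.

```java
import java.util.stream.IntStream;

// Hypothetical variant of StreamSum using a method reference instead of a lambda.
class StreamSumMethodRef {
    // Integer::sum refers to a method of java.lang.Integer, which is already
    // initialized, so the reduction does not depend on this class.
    static final int SUM = IntStream.range(0, 100).parallel().reduce(Integer::sum).getAsInt();

    public static void main(String[] args) {
        System.out.println(SUM);
    }
}
```

The parallel stream still runs inside a static initializer here, so this variant only avoids the hang because nothing the worker threads execute refers back to the class being initialized.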

What is more mind-blowing: this code behaves differently in different environments. It will work correctly on a single-CPU machine and will most likely hang on a multi-CPU machine. The difference comes from the Fork-Join pool implementation. You can verify it yourself by changing the parameter -Djava.util.concurrent.ForkJoinPool.common.parallelism=N
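To see which settings apply on a given machine before experimenting with that parameter, a small helper like the following can print them. This is only a sketch: the class name `CommonPoolInfo` is hypothetical, and it reports values without reproducing the deadlock.

```java
import java.util.concurrent.ForkJoinPool;

// Hypothetical helper for inspecting the common pool configuration.
class CommonPoolInfo {
    static String describe() {
        return "availableProcessors=" + Runtime.getRuntime().availableProcessors()
                + ", commonPoolParallelism=" + ForkJoinPool.getCommonPoolParallelism()
                // null unless the property was set on the command line
                + ", override=" + System.getProperty("java.util.concurrent.ForkJoinPool.common.parallelism");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it as, for example, `java -Djava.util.concurrent.ForkJoinPool.common.parallelism=1 CommonPoolInfo` shows whether the override was picked up.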

Community
AdamSkywalker