
I work on a Jetty web app that was running on Java 16. I tried to upgrade it to Java 17 but there were critical performance issues caused entirely by one call to parallelStream().

The only changes are the Java version bump from 16 to 17, the addition of `--add-opens java.base/java.lang=ALL-UNNAMED --add-opens java.base/java.util=ALL-UNNAMED`, and the runtime image bump from openjdk:16.0.1-jdk-oraclelinux8 to openjdk:17.0.1-jdk-oraclelinux8.

We managed to obtain a thread dump and it contains many of these:

"qtp1368594774-200" #200 prio=5 os_prio=0 cpu=475.94ms elapsed=7189.65s tid=0x00007fd49c50cc10 nid=0xd1 waiting on condition  [0x00007fd48fef7000]
   java.lang.Thread.State: WAITING (parking)
    at jdk.internal.misc.Unsafe.park(java.base@17.0.1/Native Method)
    - parking to wait for  <0x00000007b73439a8> (a java.util.stream.ReduceOps$ReduceTask)
    at java.util.concurrent.locks.LockSupport.park(java.base@17.0.1/LockSupport.java:341)
    at java.util.concurrent.ForkJoinTask.awaitDone(java.base@17.0.1/ForkJoinTask.java:468)
    at java.util.concurrent.ForkJoinTask.invoke(java.base@17.0.1/ForkJoinTask.java:687)
    at java.util.stream.ReduceOps$ReduceOp.evaluateParallel(java.base@17.0.1/ReduceOps.java:927)
    at java.util.stream.AbstractPipeline.evaluate(java.base@17.0.1/AbstractPipeline.java:233)
    at java.util.stream.ReferencePipeline.collect(java.base@17.0.1/ReferencePipeline.java:682)
    at com.stackoverflowexample.aMethodThatDoesBlockingIOUsingParallelStream()

The code that is causing the issue is something like:

list.parallelStream()
.map(this::callRestServiceToGetSomeData)
.collect(Collectors.toUnmodifiableList());
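
For anyone who wants to experiment with this pattern outside production, a minimal self-contained sketch could look like the following. The class and method names are hypothetical, and `Thread.sleep` is only a stand-in for the REST call; it is not known to reproduce the thread growth described here. It prints the common pool parallelism and the names of the threads that actually ran the blocking calls.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.Collectors;

// Hypothetical stand-alone sketch; none of these names come from the real application.
class ParallelStreamBlockingSketch {

    // Records the name of every thread that executed the mapping function.
    static final Set<String> workerNames = ConcurrentHashMap.newKeySet();

    // Stand-in for callRestServiceToGetSomeData: blocks briefly, then returns a value.
    static String simulatedBlockingCall(String id) {
        workerNames.add(Thread.currentThread().getName());
        try {
            Thread.sleep(100); // assumed I/O latency, purely illustrative
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "data-for-" + id;
    }

    // Same shape as the snippet above: parallelStream, blocking map, collect.
    static List<String> fetchAll(List<String> ids) {
        return ids.parallelStream()
                .map(ParallelStreamBlockingSketch::simulatedBlockingCall)
                .collect(Collectors.toUnmodifiableList());
    }

    public static void main(String[] args) {
        List<String> ids = List.of("a", "b", "c", "d", "e", "f", "g", "h");
        System.out.println("common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println(fetchAll(ids));
        System.out.println("threads used: " + workerNames);
    }
}
```

Running it on both JDK versions and comparing the printed thread names is one way to check whether the work lands where you expect.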

This image shows thread use before upgrading from jdk16 (LHS), upgrading to jdk17 (the huge spike in the middle), then removing the call to parallelStream() still on jdk17 (RHS):

[Image: Threads]

What change in Java 17 (openjdk-17.0.1_linux-x64_bin.tar.gz) has caused this?

Robert Bain
  • Current version is 17.0.1+12 - have you tried to upgrade to a more recent version (see https://adoptium.net/index.html)? – stdunbar Dec 16 '21 at 19:53
  • 7
    Thank you! I haven't tried that but I have had a look at the release notes and there's nothing that stands out. Unfortunately I can only reproduce this in production. A call to `parallelStream()` for blocking I/O particularly in the context of a servlet container is a no-no, removing it fixed my issue. I can't justify bumping the jdk and removing the fix to satisfy my curiosity. Hopefully someone close to the jdk will be able to weigh in. – Robert Bain Dec 16 '21 at 20:04
  • Also check what other tasks may be using the same fork join pool, it does not look like the thread name you show is derived from the common pool name. – DuncG Dec 16 '21 at 22:47
  • 1
    @DuncG this is something I thought about only an hour or so ago. We have 2 RDBMS datasources, one of them was not returning connections to the pool. Active connections ramping up in direct proportion to the thread use. Same pooling impl, different RDBMS vendors. I started to wonder if perhaps something in the JDBC driver or somewhere in that chain is using the same pool. – Robert Bain Dec 16 '21 at 22:51
  • 2
    It sounds like you are considering that this might not be directly caused by a change in Java 17. That is a good strategy. (Correlation does not necessarily mean causation.) – Stephen C Dec 17 '21 at 01:05
  • 33
    Using blocking I/O with parallel streams never worked well. Seems you’ve been lucky in the past. – Holger Dec 17 '21 at 09:09
  • @StephenC the only change that went in was literally the java bump, a couple of `add-opens` and a bump of the openjdk docker image. I don't imagine this is an OS level change (happy to stand corrected). My current thinking is we shouldn't have been using `parallelStream()` the way we were, there have been jdk changes and we've been caught with our trousers down. I'd love to know what change went in, if any, in that area. – Robert Bain Dec 17 '21 at 22:13
  • @RobertBain Upgrade minor version to minor version and you can narrow it down as though you're debugging commits - does that yield anything useful? – Ermiya Eskandary Dec 26 '21 at 19:37
  • 1
    @RobertBain we have the same issue. Only JDK bump and parallel streams blow up. I agree that doing I/O there is bad (we do the same) but I really want to understand the root cause - what was changed in JDK 16 or 17 (we upgraded from 15). – Krzysztof Wolny Feb 09 '22 at 10:38
  • 9
    @KrzysztofWolny an example program capable of reproducing the problem would be helpful. – Holger Feb 09 '22 at 13:10
  • @Holger it's as simple as in the original post: parallel streams with some long-running I/O operations. I have exactly the same observations of threads as it was posted: no. busy workers growing, some of them are returned (small drops on the chart) but most of them is stuck in WAITING/TIMED_WAITING state. – Krzysztof Wolny Feb 10 '22 at 08:14
  • @KrzysztofWolny my thoughts based on what I was seeing were that the thread pool used by the blocking calls was now being shared, perhaps by the postgres jdbc driver and the blocking calls were preventing db access. Are you using jdbc? – Robert Bain Feb 10 '22 at 22:23
  • @RobertBain nope, in my case we do REST calls in parallel. But this is irrelevant to what you do, as it used to work. My goal is to find why it was working and why now it is not... – Krzysztof Wolny Feb 14 '22 at 14:19
  • 5
    @KrzysztofWolny - *"it's as simple as in the original post:"* - Well please write it as [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) and post it. I don't think there will be an answer to this question unless someone does that work. – Stephen C Jul 15 '22 at 09:45
  • Did you give it a try with java 18 also? In case is something in jdk, maybe that would be solved. – rapaio Jul 29 '22 at 22:23
  • 1
    There have been quite a few problems listed on the internet causing debug testing with parked threads and java.base because of difference between 16 and 17 lodge a BUG report on that jdk. Possibly also with the JDK's a NUMA problem too. – Samuel Marchant Sep 25 '22 at 14:06

1 Answer


We have all been told that creating a new thread is a heavy operation, but it seemed fine to me after I ran a few tests. For example, here is the memory usage when running the simple test below with 10_000 threads. It took about 2 or 3 seconds on my laptop, and JVM memory usage was about 1.5 GB.

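// Assumes assertTrue/assertEquals (presumably JUnit) and abacus-common's Fn are on the classpath.
final int threadNum = 10_000;

final Callable<String> task = () -> {
    // Allocate a sizeable string so that each thread holds some memory.
    String bigString = UUID.randomUUID().toString().repeat(1000);
    assertTrue(bigString.chars().sum() > 0);

    Thread.sleep(1000); // simulate a blocking call; sleep is static, so call it on Thread

    return bigString;
};

// One platform thread per task.
final ExecutorService executorService = Executors.newFixedThreadPool(threadNum);
final List<Future<String>> futures = new ArrayList<>(threadNum);

for (int i = 0; i < threadNum; i++) {
    futures.add(executorService.submit(task));
}

// Wait for every task and add up the lengths of the results.
long ret = futures.stream().map(Fn.futureGet()).mapToInt(String::length).sum();
System.out.println(ret);
assertEquals(UUID.randomUUID().toString().length() * threadNum * 1000, ret);
executorService.shutdown();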

[Image: JVM memory usage during the 10_000-thread test]

I think most applications will rarely create or use 10_000 threads. When I changed the thread count to 1000, it again took 2 or 3 seconds, and memory usage was about 300 MB.

[Image: JVM memory usage during the 1000-thread test]

Is it possible, or a good idea, to use a stream API to run blocking I/O calls in parallel? I think so. Here is a sample using my tool, abacus-common:

// Run above task by Stream.
ret = IntStreamEx.range(0, threadNum)
        .parallel(threadNum)
        .mapToObj(it -> Try.call(task))
        .sequential()
        .mapToInt(String::length)
        .sum();

// Or another task
StreamEx.of(list)
        .parallel(64) // Specify the concurrent thread number. It could be from 1 up to thousands.
        .map(this::callRestServiceToGetSomeData)
        .collect(Collectors.toUnmodifiableList());
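
If you would rather not add a library, roughly the same idea can be sketched with the JDK alone: run the blocking calls on a dedicated fixed-size pool through `CompletableFuture`, so they never occupy the common ForkJoinPool. This is only a sketch under assumptions: the class name `DedicatedPoolSketch`, the helper `mapOnDedicatedPool`, and the simulated call are hypothetical, and a real service would more likely share one long-lived pool than create one per call.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical JDK-only sketch; not part of abacus-common.
class DedicatedPoolSketch {

    // Applies blockingCall to every input on a pool of `threads` threads,
    // returning the results in the same order as the inputs.
    static <T, R> List<R> mapOnDedicatedPool(List<T> inputs,
                                             Function<T, R> blockingCall,
                                             int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit everything first so that the calls overlap.
            List<CompletableFuture<R>> futures = inputs.stream()
                    .map(in -> CompletableFuture.supplyAsync(() -> blockingCall.apply(in), pool))
                    .collect(Collectors.toList());
            // Then wait for each result in encounter order.
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toUnmodifiableList());
        } finally {
            pool.shutdown();
        }
    }

    // Stand-in for a REST call; the latency is an assumption.
    static String simulatedCall(String id) {
        try {
            Thread.sleep(100);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return "data-for-" + id;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("a", "b", "c", "d");
        System.out.println(mapOnDedicatedPool(ids, DedicatedPoolSketch::simulatedCall, 4));
    }
}
```

The pool size caps how many blocking calls are in flight at once, which plays the same role as the `parallel(64)` argument above.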

Or use virtual threads, introduced as a preview feature in Java 19 and finalized in Java 21:

try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    StreamEx.of(list)
            .parallel(executor)
            .map(this::callRestServiceToGetSomeData)
            .collect(Collectors.toUnmodifiableList());
}

I know this is not a direct answer to the question, but it may resolve the original problem that prompted it.

user_3380739