2

The Issue

I have a lot of threads in the state "TIMED_WAITING" and I don't know why and how to debug it.

Debugging

I'm using spring-actuator + Grafana + Prometheus to monitor a lot of JVM + Threads params

  1. My application starts with this "snapshot" of threads:
blocked: 0
new: 0
runnable: 17
terminated: 0
time-waiting: 21
waiting: 12
  1. After an Apache Benchmark test like this: ab -s 300 -n 100000 -c 50 "http://my-address/v1/my-collection"

  2. I have this result:

enter image description here

  1. And my "snapshot" of my threads AFTER the test is like this:
blocked: 0
new: 0
runnable: 21
terminated: 0
time-waiting: 73
waiting: 12
  1. Now, if I try to run the Apache Benchmark again, with the SAME config, I'll have:
This is ApacheBench, Version 2.3 <$Revision: 1901567 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking my-address (be patient)

Completed 10000 requests
Completed 20000 requests
Completed 30000 requests
Completed 40000 requests
apr_socket_recv: Operation timed out (110)
Total of 45339 requests completed

The assumption

So, I think it shows that I'm not able to handle the concurrency generated by the stress test, because I don't have enough threads available in my thread pool. Which is the default configuration (200 and etc, look at the references. There's a link pointing to the default configuration that I'm using).

Thread Dumping

Using jstack

I did a thread dump with: jstack PID > /home/app/dump.txt. And here is my result (https://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMjIvMDgvOS9kdW1wLnR4dC0tMjAtNTctMTA=&):

enter image description here

Using arthas

Also, using arthas-boot I have this:

199    ThreadPoolTaskScheduler-1              main               5            TIMED_WAITIN 0.0          0.000        0:0.009      false        false
201    ThreadPoolTaskScheduler-1              main               5            TIMED_WAITIN 0.0          0.000        0:0.008      false        false
203    ThreadPoolTaskScheduler-1              main               5            TIMED_WAITIN 0.0          0.000        0:0.005      false        false
207    ThreadPoolTaskScheduler-1              main               5            TIMED_WAITIN 0.0          0.000        0:0.005      false        false
211    ThreadPoolTaskScheduler-1              main               5            TIMED_WAITIN 0.0          0.000        0:0.005      false        false
213    ThreadPoolTaskScheduler-1              main               5            TIMED_WAITIN 0.0          0.000        0:0.005      false        false
216    ThreadPoolTaskScheduler-1              main               5            TIMED_WAITIN 0.0          0.000        0:0.004      false        false
220    http-nio-8080-exec-1                   main               5            TIMED_WAITIN 0.0          0.000        0:7.732      false        true
221    http-nio-8080-exec-2                   main               5            TIMED_WAITIN 0.0          0.000        0:7.662      false        true
222    http-nio-8080-exec-3                   main               5            TIMED_WAITIN 0.0          0.000        0:7.650      false        true
223    http-nio-8080-exec-4                   main               5            TIMED_WAITIN 0.0          0.000        0:7.611      false        true
224    http-nio-8080-exec-5                   main               5            TIMED_WAITIN 0.0          0.000        0:7.652      false        true
225    http-nio-8080-exec-6                   main               5            TIMED_WAITIN 0.0          0.000        0:7.669      false        true
226    http-nio-8080-exec-7                   main               5            TIMED_WAITIN 0.0          0.000        0:7.574      false        true
227    http-nio-8080-exec-8                   main               5            TIMED_WAITIN 0.0          0.000        0:7.615      false        true
228    http-nio-8080-exec-9                   main               5            TIMED_WAITIN 0.0          0.000        0:7.649      false        true
229    http-nio-8080-exec-10                  main               5            TIMED_WAITIN 0.0          0.000        0:7.798      false        true
230    http-nio-8080-Poller                   main               5            RUNNABLE     0.0          0.000        0:6.093      false        true
231    http-nio-8080-Acceptor                 main               5            RUNNABLE     0.0          0.000        0:4.080      false        true
232    DestroyJavaVM                          main               5            RUNNABLE     0.0          0.000        0:0.725      false        false
497    http-nio-8080-exec-11                  main               5            TIMED_WAITIN 0.0          0.000        0:7.283      false        true
498    http-nio-8080-exec-12                  main               5            TIMED_WAITIN 0.0          0.000        0:7.240      false        true
499    http-nio-8080-exec-13                  main               5            TIMED_WAITIN 0.0          0.000        0:7.256      false        true
500    http-nio-8080-exec-14                  main               5            TIMED_WAITIN 0.0          0.000        0:7.195      false        true
501    http-nio-8080-exec-15                  main               5            TIMED_WAITIN 0.0          0.000        0:7.292      false        true
502    http-nio-8080-exec-16                  main               5            TIMED_WAITIN 0.0          0.000        0:7.167      false        true
503    http-nio-8080-exec-17                  main               5            TIMED_WAITIN 0.0          0.000        0:7.204      false        true
504    http-nio-8080-exec-18                  main               5            TIMED_WAITIN 0.0          0.000        0:7.176      false        true
505    http-nio-8080-exec-19                  main               5            TIMED_WAITIN 0.0          0.000        0:7.222      false        true
506    http-nio-8080-exec-20                  main               5            TIMED_WAITIN 0.0          0.000        0:7.140      false        true
512    http-nio-8080-exec-21                  main               5            TIMED_WAITIN 0.0          0.000        0:4.739      false        true
513    http-nio-8080-exec-22                  main               5            TIMED_WAITIN 0.0          0.000        0:4.779      false        true
20     Common-Cleaner                         InnocuousThreadGro 8            TIMED_WAITIN 0.0          0.000        0:0.008      false        true
851    Keep-Alive-Timer                       InnocuousThreadGro 8            TIMED_WAITIN 0.0          0.000        0:0.000      false        true

Getting the THREAD ID 499, we have:

[arthas@31437]$ thread 499
"http-nio-8080-exec-13" Id=499 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2bf2651c
    at java.base@17.0.4/jdk.internal.misc.Unsafe.park(Native Method)
    -  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2bf2651c
    at java.base@17.0.4/java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:252)
    at java.base@17.0.4/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:1672)
    at java.base@17.0.4/java.util.concurrent.LinkedBlockingQueue.poll(LinkedBlockingQueue.java:460)
    at app//org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:128)
    at app//org.apache.tomcat.util.threads.TaskQueue.poll(TaskQueue.java:33)
    at app//org.apache.tomcat.util.threads.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1113)
    at app//org.apache.tomcat.util.threads.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1176)
    at app//org.apache.tomcat.util.threads.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:659)
    at app//org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.base@17.0.4/java.lang.Thread.run(Thread.java:833)

My environment

Using spring-boot-starter-web:2.7.2, with this JVM:

❯ java --version
openjdk 17.0.4 2022-07-19 LTS
OpenJDK Runtime Environment Corretto-17.0.4.8.1 (build 17.0.4+8-LTS)
OpenJDK 64-Bit Server VM Corretto-17.0.4.8.1 (build 17.0.4+8-LTS, mixed mode, sharing)

With this startup setup:

Initial Memory (xms): 248 MB.
Max Memory (xmx): 3,968 MB.
java.lang:type=GarbageCollector,name=ZGC Cycles
java.lang:type=GarbageCollector,name=ZGC Pauses

Could you help me to debug my issue? What am I doing wrong?

References

  1. Interpreting a tomcat thread dump and diagnosing a deadlock
  2. Trying to identify source of abandoned connections in tomcat
  3. Multiple TIMED_WAITING threads seen on tomcat 7 thread dump
  4. Spring Boot - Limit on number of connections created
  5. http://www.masterspringboot.com/configuration/web-server/configuring-tomcat-connection-pool-on-spring-boot/
rios0rios0
  • 735
  • 7
  • 20

1 Answers1

0

That is expected behaviour when you are using ab (apache benchmark) - ab supports HTTP/1.0 only, that in turn means that server must close HTTP-connections, however in TCP the side which initiates shutdown gets socket in TIME_WAIT state - you may check that by issuing netstat -na command on server and count how many sockets are in TIME_WAIT state.

You may set tcp_tw_reuse (/proc/sys/net/ipv4/tcp_tw_reuse) to 1, that might help, but that does not make sense - you are using old http-client.

Andrey B. Panfilov
  • 4,324
  • 2
  • 12
  • 18
  • what tool for load testing would you recommend? According to my research, changing the "tcp_tw_reuse" parameter in a container doesn't make sense. This parameter only takes effect in the host machine. I'm trying to use the "h2load" but it's a subject for another post and another issue. – rios0rios0 Aug 11 '22 at 17:44