
I have a Java application that starts 2035 threads during startup. Around 1250 of them wait 3 seconds for a socket timeout, while the other 785 establish their socket connections and complete their jobs quickly.

CPU usage does not go above 50% for the first half hour after I start the application. My understanding is that if all the threads were competing for the CPU, usage would rise above 100%, but that is not happening, and the application takes almost half an hour longer to finish its tasks when these 1250 threads are included.

The top command shows the number of threads as 196 during that half hour. I am using a Linux VM with 6 cores per socket, 6 cores in total.

If I remove the 1250 threads that wait for the socket timeout, the application runs very fast: within 3 minutes the other 785 threads establish their socket connections and complete their jobs.

Can anyone help me understand what is slowing the application down, given that CPU usage is low?

Sreekanth R
  • Java thread != OS thread – Michael May 11 '20 at 20:41
  • CPU is not the only resource threads can contend for. – Dave Costa May 11 '20 at 20:51
  • Wait! Are all of these threads executing the same code? Are they _all_ trying to connect to remote sockets? And does it just so happen that 1250 of them time out trying to talk to sockets that don't answer while 785 of them succeed? If that's the case, then your question boils down to this: "Why doesn't my program use much CPU time when it spends most of its time _waiting_ for I/O?" Waiting for I/O is not an activity that uses CPU time. – Solomon Slow May 11 '20 at 23:32
  • Yes @SolomonSlow, 785 threads succeed in getting a response and 1250 threads stay blocked for 3 seconds until the socket timeout happens. If waiting for I/O does not use CPU time, could you please tell me what can actually cause the slowdown of my application? I can see that CPU usage is not reaching 100%. – Sreekanth R May 12 '20 at 06:49
  • Each thread has its own stack memory and has to be scheduled by the operating system to run on one of the 6 cores, and if I understand correctly they are all trying to open connections to a server that also needs a similar number of threads? That sounds very inefficient to me and may easily hit operating system limits. A perfect application should have at most as many threads as there are CPU cores available and use asynchronous tasks, maybe with frameworks like these: https://www.baeldung.com/java-asynchronous-programming or the Akka Actor framework or Java Flow Reactive Streams – JohannesB May 12 '20 at 08:11
  • @JohannesB There are 2035 devices connected in a private network. My application threads query information using the socketSend() method. Out of the 2035 devices, only 785 respond quickly; the remaining devices do not respond, and their threads stay blocked until the socket timeout (3 seconds) happens. – Sreekanth R May 12 '20 at 10:31
  • Is it an accepted fact that they won't respond and you have to work around that, or is that part of the question? Do you specify any connection and response timeouts? You may also want to investigate the different (low-level) approaches of (a)synchronous (non-)blocking IO, e.g. as discussed here: https://stackoverflow.com/questions/25099640/non-blocking-io-vs-async-io-and-implementation-in-java to decide whether your current approach is the best fit for the job (and also because any framework has to be implemented on top of those low-level APIs) – JohannesB May 12 '20 at 10:47
  • @JohannesB, Agreed, the OP's approach is not scalable, but you said, "_A perfect application should have at most the same amount of threads as...CPU cores._" That's true for threads that were created for the explicit purpose of exploiting multiple CPU cores. But exploiting multiple CPUs is not the only reason for creating threads. There are other reasons why a programmer would want to have two or more independent activities happen concurrently in the same process. – Solomon Slow May 12 '20 at 12:56
  • If you do not want to rewrite your application but want to investigate and optimize the current implementation you may want to start with (a couple of) thread dumps and analyze them, maybe because of the large number of threads with a (free) tool like: https://fastthread.io/ or https://spotify.github.io/threaddump-analyzer/ or https://www.ibm.com/support/pages/ibm-thread-and-monitor-dump-analyzer-java-tmda or maybe try running it with a CPU Profiler (e.g.: (J)visualvm), http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#Java and Java Flight Recorder (JFR) – JohannesB May 12 '20 at 13:48
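
To illustrate the point made in the comments that waiting for I/O is not an activity that uses CPU time, here is a minimal Java sketch. It is not the asker's code: the class `BlockedThreadsDemo`, the method `run`, and the `Result` type are hypothetical names, and a local server socket that never answers stands in for the unresponsive devices (the real application uses its own `socketSend()` method, whose implementation is not shown in the question). Each worker thread connects, sets a read timeout with `Socket.setSoTimeout`, and blocks in `read()` until the timeout fires. The sketch then compares wall-clock time with the CPU time the workers actually consumed.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

public class BlockedThreadsDemo {

    /** Outcome of one run: timeouts seen, wall-clock time, summed worker CPU time. */
    public static final class Result {
        public final int timedOut;
        public final long elapsedMillis;
        public final long cpuMillis;

        Result(int timedOut, long elapsedMillis, long cpuMillis) {
            this.timedOut = timedOut;
            this.elapsedMillis = elapsedMillis;
            this.cpuMillis = cpuMillis;
        }
    }

    /** Starts threadCount threads that each wait on a peer that never replies. */
    public static Result run(int threadCount, int timeoutMillis) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = bean.isCurrentThreadCpuTimeSupported();
        AtomicInteger timedOut = new AtomicInteger();
        AtomicLong cpuNanos = new AtomicLong();
        long start = System.nanoTime();

        // Stand-in for a silent device: the socket listens but the program never
        // accepts or writes, so clients can connect yet never receive any data.
        try (ServerSocket silent =
                 new ServerSocket(0, threadCount, InetAddress.getLoopbackAddress())) {
            InetSocketAddress target =
                new InetSocketAddress(silent.getInetAddress(), silent.getLocalPort());
            List<Thread> workers = new ArrayList<>();
            for (int i = 0; i < threadCount; i++) {
                Thread t = new Thread(() -> {
                    try (Socket s = new Socket()) {
                        s.connect(target, timeoutMillis);
                        s.setSoTimeout(timeoutMillis); // read timeout
                        s.getInputStream().read();     // blocks until the timeout
                    } catch (SocketTimeoutException e) {
                        timedOut.incrementAndGet();    // connect or read timed out
                    } catch (IOException e) {
                        // Other I/O failures are not counted in this sketch.
                    }
                    if (cpuSupported) {
                        long used = bean.getCurrentThreadCpuTime(); // -1 if disabled
                        if (used > 0) {
                            cpuNanos.addAndGet(used);
                        }
                    }
                });
                workers.add(t);
                t.start();
            }
            for (Thread t : workers) {
                t.join();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(timedOut.get(), elapsedMillis, cpuNanos.get() / 1_000_000);
    }

    public static void main(String[] args) {
        Result r = run(50, 3000); // 50 threads, 3-second timeout as in the question
        System.out.println("timed out: " + r.timedOut);
        System.out.println("wall-clock ms: " + r.elapsedMillis);
        System.out.println("summed worker CPU ms: " + r.cpuMillis);
    }
}
```

In a run like this the wall-clock time should be at least the timeout, while the summed CPU time of the workers should be much smaller, because a blocked thread is not scheduled. If the timeouts in a real application end up happening one after another instead of in parallel, the wall-clock total grows with every wait while CPU usage stays low; the thread dumps suggested above would show where the threads are actually parked.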

0 Answers