I have an Ubuntu 16.04.5 server which runs multiple Java applications as root. The applications regularly (about every 30 minutes to 1 hour) crash with an OutOfMemoryError: unable to create new native thread. I noticed that the applications don't crash individually; instead, several of them crash at the same time.

I don't know what causes this and I'm having trouble finding out what I need to change to fix the issue.

I followed some articles about the error and went through multiple possible causes, but they don't seem to apply to my situation:

Fix thread creation rate

The applications regularly create threads, but many threads die as well. This means the concurrent thread count never rises above about 10k. I checked whether I have an issue with runaway thread creation by generating thread dumps and counting threads, but the number of threads never exceeds the previously mentioned 10k.
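As a cross-check to counting threads in dumps, the JVM can report its own thread counters through the standard `java.lang.management` API. The following is only a minimal sketch; the class name `ThreadCountProbe` is made up for illustration. The peak value is useful because a short spike between two thread dumps would otherwise go unnoticed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadCountProbe {
    /** Returns {live, peak, totalStarted} thread counts for the current JVM. */
    public static long[] snapshot() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        return new long[] {
            threads.getThreadCount(),             // live threads right now (daemon and non-daemon)
            threads.getPeakThreadCount(),         // highest live count since JVM start or last reset
            threads.getTotalStartedThreadCount()  // every thread started since JVM start
        };
    }

    public static void main(String[] args) {
        long[] s = snapshot();
        System.out.printf("live=%d peak=%d totalStarted=%d%n", s[0], s[1], s[2]);
    }
}
```

Logging this periodically from inside each application would show whether the live count really stays near 10k right up to the moment of the error.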

Increase the thread limits of the OS

When I run ulimit -u it returns 1546669.

[image: ulimit -u output]

This should be enough, right?
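Note that `ulimit -u` in an interactive shell reports that shell's limit, which is not necessarily the limit the Java processes were started with. On Linux, a process can read its own effective limits from `/proc/self/limits`, and a few kernel-wide settings also cap thread creation. The sketch below prints them from inside the JVM; the class and method names are hypothetical, and which of these values (if any) is the limiting factor here is not known.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

public class LimitProbe {
    /** Extracts the soft limit from a /proc/<pid>/limits row, e.g. "Max processes  63432  63432  processes". */
    public static Optional<String> softLimit(List<String> limitsLines, String rowName) {
        for (String line : limitsLines) {
            if (line.startsWith(rowName)) {
                String[] fields = line.substring(rowName.length()).trim().split("\\s+");
                if (fields.length > 0 && !fields[0].isEmpty()) {
                    return Optional.of(fields[0]); // first column after the row name is the soft limit
                }
            }
        }
        return Optional.empty();
    }

    /** Reads a file line by line; returns an empty list if it is missing or unreadable (e.g. not on Linux). */
    private static List<String> readOrEmpty(String file) {
        try {
            return Files.readAllLines(Paths.get(file), StandardCharsets.UTF_8);
        } catch (IOException | RuntimeException e) {
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        // Per-process limit as seen by this JVM, not by an unrelated shell.
        System.out.println("Max processes (soft): "
                + softLimit(readOrEmpty("/proc/self/limits"), "Max processes").orElse("n/a"));

        // Kernel-wide settings that can also cap the number of threads.
        String[] sysctls = {
            "/proc/sys/kernel/threads-max",
            "/proc/sys/kernel/pid_max",
            "/proc/sys/vm/max_map_count"
        };
        for (String f : sysctls) {
            List<String> lines = readOrEmpty(f);
            System.out.println(f + ": " + (lines.isEmpty() ? "n/a" : lines.get(0).trim()));
        }
    }
}
```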

Allocate more RAM to the machine

I use about 7GB of an available 16GB RAM. This is my htop view:

[image: htop]

Additional info

Java version:

[image: java version]

The full stack trace of the error:

java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:717)
    at de.domisum.lib.auxilium.util.java.ThreadUtil.createAndStartThread(ThreadUtil.java:126)
    at de.domisum.lib.auxilium.util.java.ThreadUtil.createAndStartThread(ThreadUtil.java:114)
    at de.domisum.lib.auxilium.run.RunNotifyOnTimeout.run(RunNotifyOnTimeout.java:32)
    at de.domisum.lib.auxilium.util.ticker.Ticker.tickWithTimeout(Ticker.java:119)
    at de.domisum.lib.auxilium.util.ticker.Ticker.run(Ticker.java:108)
    at java.lang.Thread.run(Thread.java:748)

A thread dump from an application that suffered from the error: thread dump

Garbage collector logs: gc log1 gc log2 gc log3

domisum
  • Just wondering: are you sure that using 10K threads is really necessary or beneficial to your workload? 10K is really a lot for a machine that has only 16 GB RAM. – GhostCat Jun 23 '19 at 17:18
  • Most of the threads are sleeping or doing network I/O at any given moment. – domisum Jun 23 '19 at 17:44
  • `ulimit -u` tests the maximum number of processes **of the shell and its children**. Are your Java commands running within a shell or within a `systemctl` service? Does each of your apps have 10K threads or all of them together? What is the value of `/proc/sys/kernel/threads-max`? – RealSkeptic Jun 23 '19 at 18:06
  • /proc/sys/kernel/threads-max is 3093339. I have one application with 10K threads, one with 500 threads and two with 30 threads. I use /etc/rc.local to call a master startup script, which in turn runs screen with the startup scripts of the java applications. You can see this in my htop screenshot. – domisum Jun 23 '19 at 18:26
  • Why are you creating so many threads? 10K threads in a single application is excessive in my opinion. Depending on the problem you're trying to solve, it would likely be better to avoid needing to create so many, and instead use thread pools/executors. – Mark Rotteveel Jun 27 '19 at 16:15
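
The thread-pool suggestion in the comments could look roughly like the following. The stack trace shows `Ticker.tickWithTimeout` reaching `ThreadUtil.createAndStartThread`, so a new thread is apparently started for timeout handling. This sketch instead reuses a fixed pool of workers and enforces the timeout with `Future.get`. `PooledTimeoutRunner` and `runWithTimeout` are hypothetical names, not part of the library in the stack trace, and the pool size is an assumption to be tuned.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class PooledTimeoutRunner implements AutoCloseable {
    private final ExecutorService pool;

    public PooledTimeoutRunner(int maxThreads) {
        // Worker threads are created once and reused, so the thread count stays bounded.
        this.pool = Executors.newFixedThreadPool(maxThreads);
    }

    /** Runs the task on a pooled thread; returns false if it did not finish within the timeout. */
    public boolean runWithTimeout(Runnable task, long timeout, TimeUnit unit) {
        Future<?> future = pool.submit(task);
        try {
            future.get(timeout, unit);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the worker if the task is still running
            return false;
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```

A caller would create one runner at startup and pass each tick to `runWithTimeout`. A task only stops on timeout if it responds to interruption.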

1 Answer

-1

The issue went away after migrating to Java 11.

domisum
  • Well yes, but this doesn't really explain the problem. Or even why it went away. Unfortunately, "migrate to Java 11" is unlikely to solve this for most people. The real problem (IMO) is an application architecture that requires so many threads. The real solution is to fix *that*. – Stephen C Jan 21 '20 at 01:44
  • You are wrong. I reworked my approach to using threads so I was only using about 300 and the issue was still there, albeit less pronounced. I got OOME about every four hours after that. I migrated to Java 11 about a week ago, and I haven't had "unable to create new native thread" a single time. I agree that my original approach wasn't elegant, but it should NOT have led to unpredictable errors. – domisum Jan 21 '20 at 20:52
  • This is not what your answer says. Your answer implies to a reasonable reader that you just migrated to Java 11 and that fixed the problem. That's not true ... *based on what you just said*. – Stephen C Jan 21 '20 at 22:59
  • Read my comment again, slowly and carefully. After reworking the thread usage the problem was STILL THERE. I then switched from Java 8 to Java 11 and the problem was NO LONGER THERE. So migrating to Java 11 FIXED THE PROBLEM. Not very hard to understand. – domisum Jan 23 '20 at 00:55
  • Nope. The real fix for the problem was this: "I reworked my approach to using threads so I was only using about 300". Seriously, migrating did not fix the problem. It was something else. Possibly something that you were not aware that you did. (Like maybe redeploying or restarting something ... incidentally ... at the same time as you did the migration.) – Stephen C Jan 23 '20 at 01:02
  • If you update your answer to cover all of the things that you did to fix the problem, I will remove my downvote. But I do not accept it was JUST the upgrade that fixed it. No matter what you think. And, more importantly, I don't think that advising people to just upgrade is going to help >>them<<. And since the real point of answers is to help >>other people<< ..., – Stephen C Jan 23 '20 at 01:04
  • If you do still honestly believe that upgrading is the solution, provide some supporting evidence; e.g. a link to a JDK issue or a release note entry. An answer without an explanation or evidence should not give anyone any confidence that it is correct. (Like if I suggested that you can fix the problem by standing on one leg and singing "God Save The Queen". And my only evidence was "it worked for me". OK ... bad example. But you see what I am getting at. How do we know that your fix wasn't just a coincidence?) – Stephen C Jan 23 '20 at 01:13
  • I tried everything to fix this problem (migrated to another server with a different OS, dedicated instead of virtual; reduced thread count; reduced thread creation rate; and much more...), but nothing kept the error from crashing my applications. However, after I uninstalled Java 8 and installed Java 11 the error no longer appeared. I won't look for a bug fix note in the patch notes because I don't have unlimited time on my hands. Again: IT WORKED FOR ME, IT MIGHT WORK FOR OTHERS. TAKE IT OR LEAVE IT. (Back when I was trying to fix this I would have been happy about a suggestion like this) – domisum Jan 23 '20 at 01:26
  • Nobody has unlimited time on their hands. OK. You don't have time to fix your answer. It is still a poor answer. – Stephen C Jan 23 '20 at 01:44