In the following, I am using Play framework 2.4.0 with Scala 2.11.7.

I am stress-testing a simple Play app with Gatling, injecting 5000 users over 60 seconds. Within a few seconds, the Play server returns the following:

"Failed to accept a connection." and "java.io.IOException: Too many open files in system".

Here is the associated stacktrace:

22:52:48.943 [application-akka.actor.default-dispatcher-12] INFO  play.api.Play$ - Application started (Dev)
22:53:08.939 [New I/O server boss #17] WARN  o.j.n.c.s.nio.AbstractNioSelector - Failed to accept a connection.
java.io.IOException: Too many open files in system
        at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method) ~[na:1.8.0_45]
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:422) ~[na:1.8.0_45]
        at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:250) ~[na:1.8.0_45]
        at org.jboss.netty.channel.socket.nio.NioServerBoss.process(NioServerBoss.java:100) [netty-3.10.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337) [netty-3.10.3.Final.jar:na]
        at org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) [netty-3.10.3.Final.jar:na]
        at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [netty-3.10.3.Final.jar:na]
        at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [netty-3.10.3.Final.jar:na]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_45]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_45]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]

I suppose this is due to the ulimit of the system (could someone confirm that?), and if so, my question is the following:

How is this kind of error managed in a production environment? Is it by setting a high value with ulimit -n <high_value>?

  • possible duplicate of [Java Too Many Open Files](http://stackoverflow.com/questions/4289447/java-too-many-open-files) – childofsoong Jul 22 '15 at 21:31
  • This error is avoided by increasing the limits on the number of open files for certain users or globally. This is an OS-level question and the answer depends on your OS type and version. Suggest determining that and then search for a specific procedure for your platform. For CentOS/Fedora, a decent procedure is at http://pro.benjaminste.in/post/318453669/increase-the-number-of-file-descriptors-on-centos. –  Jul 22 '15 at 21:35
  • Thank you for your answers. Is increasing the limit the only/best solution for production environments? Moreover, as stated in my other comment, are there any drawbacks for the file system, the OS, or other running applications, in terms of latency, performance, or anything else? Or is "ulimit -n with a high value" simply the standard solution, without drawbacks? What if the problem persists for the Play application even with the highest ulimit for the OS? Thank you! – Jul 23 '15 at 18:40

1 Answer

The most foolproof way to check is to run:

cat /proc/PID/limits

You will see:

Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
...
Max open files            1024                 1024                 files
...
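
If you are not sure which PID belongs to the Play server, one quick way to find it is sketched below (this assumes jps from the JDK is on your PATH; pgrep -f would work just as well):

# list running JVMs with their main class to find the Play server's PID
jps -l
# then check the descriptor limit for that PID
cat /proc/<pid>/limits | grep "Max open files"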

To see the limits of your current shell, you can always run:

ulimit -a

And you will get:

...
open files                      (-n) 1024
...

Changing it is best done system-wide via /etc/security/limits.conf, but you can use ulimit -n to change it for the current shell only.
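
As a sketch, the limits.conf entries could look like this (the user name playapp is only a placeholder for whichever account runs the Play server, and 16384 is only an example value):

# /etc/security/limits.conf: hypothetical entries for the account running the Play app
playapp    soft    nofile    16384
playapp    hard    nofile    16384

For the current shell only, the equivalent is:

ulimit -n 16384

Note that a non-root user can only raise the soft limit up to the hard limit, and entries in limits.conf usually take effect only after the user logs in again (or the service is restarted).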

In terms of how to deal with this situation, there is simply no alternative to having enough file descriptors. Set the limit high, and if you still hit it, there is either a leak or you work for Facebook. In production, I believe the general recommendation is 16k.

nkadwa
  • Thank you for the answer! However, I was not asking for a solution to the ulimit problem, but rather how this kind of error is managed in production. In other words, is increasing this limit on open files a good solution? Are there any drawbacks for the file system, latency, or performance of other running applications, or anything else? Or is ulimit -n with a high value already the standard solution? What if the problem persists for the Play application even with the highest ulimit for this OS? Thank you! – Jul 23 '15 at 18:34
  • Okay. There's simply no alternative to having enough file descriptors. Set it high, and if you still hit it, there is a leak or you work for Facebook. In production, I believe the general recommendation is 16k. – nkadwa Jul 23 '15 at 20:21
  • Thank you! I will accept that answer. By the way, is there any way to detect such a possible leak in a Play application? – Jul 24 '15 at 15:53
  • Run JMeter against the API or ab (apache bench) for a while and see if you can get it to crash. – nkadwa Jul 24 '15 at 18:49
  • I have already done that with Gatling, and the app crashes far before 5000 users, but I do not know how to identify what is creating so many open files. – Jul 24 '15 at 19:11
  • The most likely scenario is that you're not closing files or sockets that you're opening (one way to watch the open descriptor count is sketched after these comments). The open source code you use is the absolute last place you should look. – nkadwa Jul 24 '15 at 19:13
  • I didn't get the Facebook pun. – Jus12 Jan 19 '20 at 02:31
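
A rough way to check for a descriptor leak while the Gatling run is in progress, assuming lsof is available and <pid> is the Play server's PID (both are assumptions about your setup):

# count the open descriptors every 2 seconds; a steadily climbing number under constant load suggests a leak
watch -n 2 'ls /proc/<pid>/fd | wc -l'

# lsof shows what those descriptors actually are (regular files, TCP sockets, pipes, ...)
lsof -p <pid>

If the count keeps climbing while the load stays constant, look for files, streams, or sockets that the application code opens but never closes.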