
I have a machine running a Spring (Spring Boot 1.5.2.RELEASE) application. Recently I've been receiving a lot of warnings in my log file:

.w.s.m.s.DefaultHandlerExceptionResolver : Failed to read HTTP message: org.springframework.http.converter.HttpMessageNotReadableException: Could not read document: null; nested exception is java.net.SocketTimeoutException

I've checked with tcpdump, and a lot of the requests have no body (empty/null) or have an incorrect body, e.g. \00\00\00\00\00\00\00speed":"23.3","user_id":106312}

The biggest problem is that after some time I start receiving exceptions in my app:

org.apache.tomcat.util.net.NioEndpoint   : Socket accept failed

java.io.IOException: Too many open files
    at sun.nio.ch.ServerSocketChannelImpl.accept0(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:241)
    at org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:443)
    at java.lang.Thread.run(Thread.java:745)

ulimit -n says the open-file limit is set to 65536, so I think that's big enough.

I'm assuming that receiving a large number of invalid requests causes the IOException, but why? And what should I do to avoid and fix it?
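For anyone hitting the same symptoms: a quick way to see whether the process itself is leaking descriptors (the per-process limit can be much lower than `ulimit -n` reports for the shell) is to inspect `/proc`. A minimal sketch; the `pgrep` pattern is an assumption, substitute your own jar name:

```shell
# Find the application's PID (adjust the pattern to your jar's name).
PID=$(pgrep -f 'MYAPP.jar')

# The per-process limit can differ from the shell's `ulimit -n`:
grep 'Max open files' "/proc/$PID/limits"

# How many descriptors are open right now:
ls "/proc/$PID/fd" | wc -l

# What kinds they are (sockets, pipes, files), grouped by type:
for fd in /proc/$PID/fd/*; do readlink "$fd"; done \
  | sed 's/\[[0-9]*\]//' | sort | uniq -c | sort -rn | head
```

Running the count periodically (e.g. from cron) shows whether the number grows steadily toward the limit, which is the signature of a descriptor leak.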

user3626048
  • So, do you keep client connections alive for some reason? – Antoniossss Nov 02 '18 at 22:19
  • https://stackoverflow.com/questions/19636676/error-in-tomcat-too-many-open-files – Antoniossss Nov 02 '18 at 22:21
  • `ulimit` is a system-wide setting, so it limits file handles for the system overall. Try checking the per-process limit; that's in the link above. – Antoniossss Nov 02 '18 at 22:31
  • @Antoniossss but I don't have any `Stream` in my application; I only have REST endpoints to receive data from clients – user3626048 Nov 02 '18 at 22:31
  • Check that limit. Maybe you are leaking resource handles somewhere (not in the REST endpoint acceptor). – Antoniossss Nov 02 '18 at 22:33
  • Could be worth upgrading Spring Boot to the latest 1.5.x release, if possible, to verify whether this was resolved in one of the many bugfixes – Darren Forsythe Nov 02 '18 at 22:43
  • @Antoniossss I have checked with `cat /proc/PID/limits` and there is `Max open files 4096`, but `ls -l /proc/PID/fd | wc -l` says the current count is `124`, so it is way below the limit. I will monitor whether this number changes over time – user3626048 Nov 02 '18 at 22:45
  • Yes, try that. The exception says you are hitting that limit (obviously), but now you know that you don't have the FFFF limit you assumed in the first place ;) Double check that you are not reading files, keeping sockets open, or whatever else might count as a file on Linux. – Antoniossss Nov 02 '18 at 22:55
  • It's quite hard to investigate, but do you see anything strange in the access log at the same time? Which strange request causes the error? – Huy Nguyen Nov 03 '18 at 03:57
  • I'm not so sure about the IOException, but this sounds similar to a DDoS attack. The API and its clients have a contract on headers and request types for communicating with each other. If clients deviate, we can treat them as rogue and should either patch them or block their IPs so they stop sending invalid requests. On the other side, we can try reducing the socket connection time to see if that helps. – Anvesh Vejandla Nov 04 '18 at 03:28
  • @AnveshVejandla it's not DDoS; the server runs on an internal network. The invalid requests are caused by malfunctioning devices or a very poor GSM signal (the devices are connected via GSM modems). I can't block them. – user3626048 Nov 04 '18 at 17:56
  • @Antoniossss today I have 170 open files, so it increased a little; maybe after a few days or even weeks it hits the limit, and that's how I get these errors. But I'm not sure how to verify which sockets/files are open and why (or why they aren't closing properly) – user3626048 Nov 04 '18 at 17:58
  • @huy in which access log should I check? – user3626048 Nov 04 '18 at 17:59
  • Are you sure there is no file access in your business logic or something? – Antoniossss Nov 04 '18 at 18:32
  • @user3626048 are you able to validate requests, or what steps have you taken to prevent invalid ones? Maybe there is something in the headers you can use to reject invalid requests. With poor signal quality, requests might not be well formed. – Anvesh Vejandla Nov 04 '18 at 18:36
  • Please include **all** IO-related source code in your question. Your socket not opening because there are too many open files is just a symptom; the actual root of the problem can be anywhere. For example: something as simple as `Files.list()` ***will*** leave an open file descriptor unless you explicitly `close` it. – walen Nov 05 '18 at 11:43
  • Try checking the output of `ps -eo pid,ppid,cmd,%mem,%cpu --sort=-%mem | head` to identify whether any process is consuming the resources. (We had a similar issue and were able to zero in on it using this, so it might help you as well.) – Bond - Java Bond Nov 06 '18 at 05:02
  • @Antoniossss `ls -l /proc/PID/fd` shows that about 95% of the entries are like `socket:[2759091]` or `pipe:[16453]` etc. (only the numbers differ) or `/dev/random`; the remaining 5% are the JVM's .jar files or my two log files (handled by the `Spring` logger). I don't have any code where files are opened manually – user3626048 Nov 06 '18 at 12:09
  • @LuisG. I don't have any code where files are opened manually, only two log files handled by the `Spring` logger – user3626048 Nov 06 '18 at 12:10
  • @Bond-JavaBond I've tried that, but the result is only my `java -jar MYAPP.jar`, `/usr/sbin/collectd -C /etc/` and a few `postgres`-related processes (but with small memory and CPU usage) – user3626048 Nov 06 '18 at 12:12
  • Can't really help you if you don't show us your code. For all we know you could be doing `while(true) socket.open();` ^^ It might even _not_ be your code's problem but something else in your system using up all the resources, but how would we know if we cannot see the code? – walen Nov 06 '18 at 12:17
  • Maybe you are using sockets directly (as client or server) and those are not closed somewhere. It's kind of hard to believe that your application server is leaking resources; nevertheless, that's not impossible. – Antoniossss Nov 06 '18 at 12:23
  • @Antoniossss on my server I have only a RESTful API and WebSockets (but the WebSocket layer logs debug info every few minutes and shows no pending connections). There are no direct socket connections. I found this https://stackoverflow.com/questions/36076737/springboot-tomcatembeddedservletcontainer-keepalivetimeout-not-working; do you think it might be related to my issue? – user3626048 Nov 07 '18 at 08:38
  • Since there is nothing else to get a grip on, give it a try. – Antoniossss Nov 07 '18 at 09:03
  • Did you solve it? If not, have you tried going to `/etc/security/limits.conf` and increasing the number of files that can be opened at one time? Also, are you reading from properties files? You could be opening the same file several times for some reason. [Linux conf](https://linux.die.net/man/5/limits.conf). – Alain Cruz Nov 08 '18 at 23:35
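Pulling the comment threads together: the SocketTimeoutException warnings suggest clients on flaky GSM links hold Tomcat connections open until the read times out, and each such connection pins a file descriptor. One knob worth trying, in the spirit of the keepAliveTimeout question linked in the comments, is shortening the connection timeout and capping connections. A hedged sketch for a Spring Boot 1.5-era `application.properties`; the specific values are assumptions to tune for your traffic:

```properties
# Close idle/slow connections sooner so their descriptors are released
# (milliseconds).
server.connection-timeout=5000
# Cap concurrent connections so a flood of broken clients cannot
# exhaust descriptors before the OS limit is reached.
server.tomcat.max-connections=2000
```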

1 Answer


I guess you use WebSockets with Spring Boot. A machine has at most 65535 ports, so you cannot exceed that number of connections per interface. You should check whether you are closing your sockets properly.
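To check the answer's suggestion, one low-level way to spot sockets the application never closes is to count connections stuck in CLOSE_WAIT: the state a socket sits in after the peer has closed but before the application calls `close()`. A sketch reading `/proc` directly, so it works even without `lsof` or `ss` installed:

```shell
# In /proc/net/tcp the fourth column is the connection state;
# hex 08 is CLOSE_WAIT. A count that grows steadily over time means
# the application is not closing sockets its peers already closed.
awk '$4 == "08"' /proc/net/tcp | wc -l
```

If the count keeps climbing, the leak is on the application side (a handler that never releases the connection), not in the clients.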

Djory Krache