62

When using HttpURLConnection does the InputStream need to be closed if we do not 'get' and use it?

i.e. is this safe?

HttpURLConnection conn = (HttpURLConnection) uri.getURI().toURL().openConnection();
conn.connect();
// check for content type I don't care about
if ("image/gif".equals(conn.getContentType())) return;
// get stream and read from it
InputStream is = conn.getInputStream();
try {
    // read from is
} finally {
    is.close();
}

Secondly, is it safe to close an InputStream before all of its content has been fully read?

Is there a risk of leaving the underlying socket in ESTABLISHED or even CLOSE_WAIT state?

Joel

7 Answers

47

According to http://docs.oracle.com/javase/6/docs/technotes/guides/net/http-keepalive.html and the OpenJDK source code, this is what happens when keepAlive == true:

If the client calls HttpURLConnection.getInputStream().close(), a later call to HttpURLConnection.disconnect() will NOT close the Socket, i.e. the Socket is cached for reuse.

If the client does not call close(), calling disconnect() will close the InputStream and close the Socket.

So in order to reuse the Socket, just call InputStream.close(). Do not call HttpURLConnection.disconnect().

RamenChef
Anderson Mao
35

is it safe to close an InputStream before all of its content has been read

You need to read all of the data in the input stream before you close it, so that the underlying TCP connection gets cached. I have read that this should no longer be required in recent Java versions, but reading the whole response was always mandated for connection reuse.
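
To make the "read everything, then close" advice concrete, here is a minimal sketch of a helper that discards whatever is left in a stream and then closes it. The names StreamDrainer and drainAndClose are hypothetical (they are not part of the JDK), and whether draining is still needed depends on the Java version, as noted above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not a JDK class: reads a stream to the end and closes it.
final class StreamDrainer {
    private StreamDrainer() {}

    /**
     * Reads and discards all remaining bytes, then closes the stream.
     * Returns the number of bytes discarded; a null stream counts as 0.
     */
    static long drainAndClose(InputStream in) {
        if (in == null) {
            return 0L;
        }
        byte[] buffer = new byte[8192];
        long discarded = 0;
        // try-with-resources closes the stream even if read() throws
        try (InputStream stream = in) {
            int n;
            while ((n = stream.read(buffer)) != -1) {
                discarded += n;
            }
        } catch (IOException e) {
            // Wrapped so callers are not forced to handle a checked exception
            throw new UncheckedIOException(e);
        }
        return discarded;
    }
}
```

It could be called as StreamDrainer.drainAndClose(conn.getInputStream()) once the rest of the body is no longer needed. For very large responses, draining may cost more than opening a fresh connection, so treat it as a trade-off rather than a rule.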

Check this post: keep-alive in java6

Andrew
Cratylus
  • Very interesting. This question is actually background for a problem I have where I'm seeing LOTS of CLOSE_WAIT sockets to the same IP, but because of caching (I don't call URLConnection.disconnect() explicitly) I expect there to be only one, which should be reused. – Joel Jan 22 '11 at 11:41
  • 4
    @Joel: Calling `HttpUrlConnection.disconnect()` closes the underlying TCP socket. Closing the input stream pools the underlying TCP socket for later reuse. The only caveat is that the whole response (OR the whole error response) should be read from the input stream in order for the TCP connection to be cached. This has always been advised, regardless of whether you actually need all of the data from the stream. Check the post in my answer – Cratylus Jan 22 '11 at 11:50
  • What if you don't read ANY data, as per my first example - I'm guessing you still need to close the IS, but will it still not be cached if no data is read but it's still closed. – Joel Jan 22 '11 at 12:02
  • 2
    Good article, thanks. A couple of things I'm still not clear on. 1) How long is a cached connection kept for before it is discarded? I could not see a "discard after 60s inactivity" type setting. 2) It wasn't clear to me what state the connection would be left in after calling close but not before reading all the content - it says it will not be available for re-use/caching, which is fine - but will the underlying socket actually be closed? – Joel Jan 22 '11 at 12:17
  • 1
    @Joel: Your question is related to the HTTP protocol. The connection MUST remain alive for the timeframe specified by the server in the HTTP response (the server sends, in an HTTP header, the max number of requests this connection can be used for or the max time period to keep the connection open). The HTTP client must honor this, and this is also the behavior of HttpURLConnection. If the server sends no such info in the response, the connection is closed really soon (I think after around a few seconds of inactivity) so as not to waste resources. – Cratylus Jan 22 '11 at 12:31
  • Additionally, you need to catch IOException and close getErrorStream() if it returns non-null as stated in that "keep-alive in java6" link. – Brett Kail Feb 02 '16 at 23:02
  • Good article, thanks! But one thing I'm not clear. Since closing the input stream does not close the socket(tcp connection), can I ignore the close? For example, after I read the response, I just dont do anything, neither IS.close nor HttpUrlConnection.disconnect – aaron.chu Nov 10 '20 at 13:45
22

Here is some information regarding the keep-alive cache. All of this information pertains to Java 6, but is probably also accurate for many prior and later versions.

From what I can tell, the code boils down to:

  1. If the remote server sends a "Keep-Alive" header with a "timeout" value that can be parsed as a positive integer, that number of seconds is used for the timeout.
  2. If the remote server sends a "Keep-Alive" header but it doesn't have a "timeout" value that can be parsed as a positive integer and "usingProxy" is true, then the timeout is 60 seconds.
  3. In all other cases, the timeout is 5 seconds.
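
The three rules above can be sketched as a small function. This is only an illustration of the rules as described here, not the JDK's actual code; the class name, the method name, and the regular expression used to pull out the "timeout" value are all assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the timeout rules listed above; not taken from the JDK.
final class KeepAliveTimeoutSketch {
    // Assumed header shape: comma-separated pairs such as "timeout=15, max=100"
    private static final Pattern TIMEOUT =
            Pattern.compile("(?i)(?:^|,)\\s*timeout\\s*=\\s*(\\d+)");

    private KeepAliveTimeoutSketch() {}

    /** Returns the keep-alive timeout in seconds for a given Keep-Alive header value. */
    static int timeoutSeconds(String keepAliveHeader, boolean usingProxy) {
        if (keepAliveHeader != null) {
            Matcher m = TIMEOUT.matcher(keepAliveHeader);
            if (m.find()) {
                try {
                    int seconds = Integer.parseInt(m.group(1));
                    if (seconds > 0) {
                        return seconds; // rule 1: parseable positive timeout
                    }
                } catch (NumberFormatException ignored) {
                    // value too large for an int: treat as unparseable
                }
            }
            if (usingProxy) {
                return 60; // rule 2: header present, no usable timeout, via proxy
            }
        }
        return 5; // rule 3: everything else
    }
}
```

For example, a header value of "timeout=15, max=100" yields 15, while a missing header yields the 5-second default regardless of the proxy flag.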

This logic is split between two places: around line 725 of sun.net.www.http.HttpClient (in the "parseHTTPHeader" method), and around line 120 of sun.net.www.http.KeepAliveCache (in the "put" method).


So, there are two ways to control the timeout period:

  1. Control the remote server and configure it to send a Keep-Alive header with the proper timeout field
  2. Modify the JDK source code and build your own.

One would think that it would be possible to change the apparently arbitrary five-second default without recompiling internal JDK classes, but it isn't. A bug was filed in 2005 requesting this ability, but Sun refused to provide it.

Samuel Edwin Ward
7

If you really want to make sure that the connection is closed, you should call conn.disconnect().

The open connections you observed are because of the HTTP 1.1 connection keep-alive feature (also known as HTTP Persistent Connections). If the server supports HTTP 1.1 and does not send a Connection: close in the response header, Java does not immediately close the underlying TCP connection when you close the input stream. Instead it keeps it open and tries to reuse it for the next HTTP request to the same server.

If you don't want this behaviour at all you can set the system property http.keepAlive to false:

System.setProperty("http.keepAlive","false");
Robert
  • 1
    Thanks. Assuming the connection is not in use do you know for how long it is cached before being closed, and is there any way to control this timeout period? – Joel Jan 22 '11 at 13:47
2

You also have to close error stream if the HTTP request fails (anything but 200):

try {
  ...
}
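catch (IOException e) {
  // getErrorStream() returns null when the server sent no error body
  InputStream err = connection.getErrorStream();
  if (err != null) {
    err.close();
  }
}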

If you don't do it, all requests that don't return 200 (e.g. timeout) will leak one socket.

douglasf89
Vlad Lifliand
  • 1
    Not quite sure about that - that last source code (JDK 8u74) reads `public InputStream getErrorStream() { return null; }` – FelixJongleur42 Apr 15 '16 at 10:55
  • what about the `finally` block. you can use the finally to close the stream instead of `catch` – HAXM Oct 30 '18 at 09:54
  • The ErrorStream is just a buffer plus uses the InputStream. Closing the InputStream is sufficient. – bebbo Nov 24 '22 at 07:10
2

When using HttpURLConnection does the InputStream need to be closed if we do not 'get' and use it?

Yes, it always needs to be closed.

i.e. is this safe?

Not 100%, you run the risk of getting an NPE. Safer is:

InputStream is = null;
try {
    is = conn.getInputStream();
    // read from is
} finally {
    if (is != null) {
        is.close();
    }
}
Arjan Tijms
  • 1
    The second question was with reference to the underlying socket state; I've deliberately posted an incomplete snippet with regard to full runtime code safety. I really want to know if there's a danger of a socket being left in CLOSE_WAIT or ESTABLISHED by closing the stream before all content is read. – Joel Jan 22 '11 at 11:39
  • 1
    Or `IOUtils.closeQuietly(is)` – Kirby Aug 18 '14 at 17:43
  • Currently, IOUtils.closeQuietly @Deprecated – zeugor May 06 '20 at 10:22
1

Since Java 7 the recommended way is

try (InputStream is = conn.getInputStream()) {
    // read from is
    // ...
}

as for all other classes implementing Closeable. close() is called at the end of the try {...} block.

Closing the input stream also means you are done with reading. Otherwise the connection hangs around until the finalizer closes the stream.

Same applies to the output stream, if you are sending data.

There is no need to get and close the ErrorStream. Even though it is itself an InputStream, it just uses the connection's InputStream in combination with a buffer. Closing the InputStream is sufficient.

bebbo