1

I want to detect the end of a data stream in Java sockets. When I run the code below, it hangs and never terminates (it gets stuck after printing the value 10).

I also want the program to download binary files, but the last byte is different every time, so I don't know how to stop the while loop programmatically.

String host = "example.com";
String path = "/";
Socket connection = new Socket(host, 80);

PrintWriter out = new PrintWriter(connection.getOutputStream());
out.write("GET "+ path +" HTTP/1.1\r\nHost: "+ host +"\r\n\r\n");
out.flush();

int dataBuffer;
while ((dataBuffer = connection.getInputStream().read()) != -1)
  System.out.println(dataBuffer);

out.close();

Thanks for any hints.

Raedwald
user961912

2 Answers

7

Actually your code is not correct.

In HTTP 1.0, the server closes the connection after each response, so the client can detect the end of the input.

In HTTP 1.1 with persistent connections, the underlying TCP connection remains open, so the client has to detect the end of the input in one of the following two ways:

1) The HTTP server sends a Content-Length header indicating the size of the response body. The client can use it to know when the response has been fully read.

2) The response is sent with chunked transfer encoding (Transfer-Encoding: chunked), meaning that it arrives in chunks, each prefixed with its size. Using this information, the client can reassemble the response from the chunks it receives from the server; a chunk of size zero marks the end of the body.
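As a rough illustration of approach 1, the sketch below reads the header lines, picks out Content-Length, and then reads exactly that many body bytes. The class and method names (`ContentLengthReader`, `readLine`, `readBody`) are made up for this example, and it assumes the response actually carries a Content-Length header; it is not a complete HTTP client.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class ContentLengthReader {

    // Reads one line terminated by LF (an optional CR is dropped) and returns it
    // without the line ending. Reading byte by byte keeps body bytes in the stream.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;
            if (b != '\r') line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("stream ended while reading a line");
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Skips the status line and headers, then reads exactly Content-Length bytes.
    static byte[] readBody(InputStream in) throws IOException {
        int contentLength = -1;
        String line;
        while (!(line = readLine(in)).isEmpty()) { // empty line ends the headers
            int colon = line.indexOf(':');
            if (colon > 0 && line.substring(0, colon).trim().equalsIgnoreCase("Content-Length")) {
                contentLength = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        if (contentLength < 0) {
            throw new IOException("no Content-Length header (assumption of this sketch)");
        }
        byte[] body = new byte[contentLength];
        int off = 0;
        while (off < contentLength) { // read() may return fewer bytes than requested
            int n = in.read(body, off, contentLength - off);
            if (n == -1) throw new EOFException("connection closed before the full body arrived");
            off += n;
        }
        return body;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for connection.getInputStream()
        byte[] raw = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] body = readBody(new ByteArrayInputStream(raw));
        System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    }
}
```

With a real socket you would pass `connection.getInputStream()` instead of the in-memory stream. Because the loop stops after the declared number of bytes, it never blocks waiting for the server to close a keep-alive connection.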

You should use an HTTP client library, since implementing a generic HTTP client is not trivial (not at all, I might add).
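To give an idea of what the library route looks like, here is a minimal sketch using the JDK's built-in `java.net.HttpURLConnection`. That choice is an assumption made for this example (the answer does not name a specific library), and `LibraryFetch` and `fetch` are hypothetical names. The library handles Content-Length and chunked encoding internally, so the stream it returns simply reports end of input with -1.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration only.
class LibraryFetch {

    // Downloads the response body of a GET request as raw bytes.
    static byte[] fetch(String url) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) URI.create(url).toURL().openConnection();
        conn.setRequestMethod("GET");
        try (InputStream in = conn.getInputStream()) {
            ByteArrayOutputStream body = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            // The library decodes the framing, so -1 here means "end of body"
            while ((n = in.read(buf)) != -1) {
                body.write(buf, 0, n);
            }
            return body.toByteArray();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] body = fetch("http://example.com/");
        System.out.println(body.length + " bytes received");
    }
}
```

Since the body is collected as bytes rather than characters, the same method works for binary downloads.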

Specifically, the code you posted should follow one of the approaches above.
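For the chunked case (approach 2), a decoder could look roughly like the sketch below. It assumes the headers have already been consumed and that the stream is positioned at the first chunk-size line; `ChunkedBodyReader` and its methods are hypothetical names, and chunk extensions and trailer headers are skipped rather than interpreted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
class ChunkedBodyReader {

    // Reads one line terminated by LF (an optional CR is dropped), without the line ending.
    static String readLine(InputStream in) throws IOException {
        ByteArrayOutputStream line = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == '\n') break;
            if (b != '\r') line.write(b);
        }
        if (b == -1 && line.size() == 0) {
            throw new EOFException("stream ended while reading a line");
        }
        return new String(line.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    // Decodes a chunked body: hex size line, chunk data, CRLF, repeated until size 0.
    static byte[] readChunkedBody(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        DataInputStream data = new DataInputStream(in); // readFully does not read ahead
        while (true) {
            String sizeLine = readLine(in);
            int semi = sizeLine.indexOf(';'); // drop optional chunk extensions
            if (semi >= 0) sizeLine = sizeLine.substring(0, semi);
            int size = Integer.parseInt(sizeLine.trim(), 16);
            if (size == 0) break; // zero-size chunk marks the end of the body
            byte[] chunk = new byte[size];
            data.readFully(chunk); // throws EOFException if the stream ends early
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        // Skip optional trailer headers up to the terminating empty line
        while (!readLine(in).isEmpty()) {
            // ignored in this sketch
        }
        return body.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = "4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n".getBytes(StandardCharsets.ISO_8859_1);
        byte[] body = readChunkedBody(new ByteArrayInputStream(raw));
        System.out.println(new String(body, StandardCharsets.ISO_8859_1));
    }
}
```

The loop ends on the zero-size chunk, so the client knows the response is complete even though the connection stays open.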

Additionally, you should read the headers line by line, since HTTP headers are line-terminated.

I.e. something like:

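BufferedReader in = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String s = null;
while ((s = in.readLine()) != null) {
    // Read one HTTP header line (look for Content-Length or Transfer-Encoding here)
    if (s.isEmpty()) break; // An empty line means there are no more headers
}
// Note: BufferedReader may already have buffered part of the body at this point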

Sending a Connection: close header, as suggested by khachik, gets the job done (since closing the connection lets the client detect the end of the input), but performance suffers because every request has to open a new connection.

Whether that matters depends, of course, on what you are trying to do.

Cratylus
  • The header is terminated by an empty line. Reading until readLine() returns null has exactly the same problem he started with. He should read until s.length() is zero. – user207421 Sep 29 '11 at 23:45
  • @EJP: Before you downvote, you should read the answer more closely. I said that he must use the HTTP headers to find the content length. The code example was there to emphasize that he can parse HTTP line by line – Cratylus Sep 30 '11 at 06:06
  • @user384706 (a) you don't have any evidence about who downvoted the post; (b) I read your post closely enough to observe a flaw and offer a correction. – user207421 Sep 30 '11 at 12:01
  • @EJP: OK, concerning the downvote you are right. Maybe it was not fair on my part to assume that you did it. Concerning the flaw, I disagree – Cratylus Sep 30 '11 at 15:40
  • Then you are mistaken. You don't know how your own code works. – user207421 Oct 01 '11 at 10:22
  • @EJP: This code reads line by line. The calling code should be able to retrieve each header. The HTTP headers are separated from the body by an empty line, so with `s.isEmpty()` the calling code knows when the headers end. Inside the loop, the calling code should look for `Content-Length` to find the size of the message and use that to read the response (or handle the `Chunked-Encoded` case). On top of that, you can see an example from Google that uses the same logic (http://code.google.com/intl/el-GR/appengine/docs/java/urlfetch/usingjavanet.html). Please explain the error in the code in detail – Cratylus Oct 01 '11 at 10:39
  • @EJP: Otherwise you are helping neither the OP nor me with my error (assuming you commented to help us both, of course). For my part, if I am wrong on this I will be happy to improve my knowledge – Cratylus Oct 01 '11 at 10:39
  • *Now that you have fixed your code according to what I said in my comment,* it is correct. So how exactly is this not helping? And if it didn't have a flaw why did you fix it? – user207421 Oct 01 '11 at 13:16
  • The only "fix" I made was adding the break on an empty line to signify that the HTTP headers have ended. I have given a full description of this and of how to proceed in my answer. I don't think your comment gave the impression that this part was missing (or why it would be needed). You did not even explain that it must be done to find the end of the HTTP headers. It is as if the rest of the answer was not even read – Cratylus Oct 01 '11 at 13:48
  • @user384706 this is all nonsense. I said the headers are terminated by an empty line. Your code fixes agree with what I have said here. QED. – user207421 Oct 07 '11 at 01:34
6
  1. You should use existing libraries for HTTP. See here.
  2. Your code works as expected. The server doesn't close the connection, so read() never returns -1 and dataBuffer never becomes -1. This happens because connections are kept alive by default in HTTP 1.1. Use HTTP 1.0, or put a Connection: close header in your request.

For example:

out.write("GET "+ path +" HTTP/1.1\r\nHost: "+ host +"\r\nConnection: close\r\n\r\n");
out.flush();

int dataBuffer;
while ((dataBuffer = connection.getInputStream().read()) != -1) 
   System.out.print((char)dataBuffer);

out.close();
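For binary downloads the same read-until-EOF idea applies, as long as the data is handled as bytes rather than characters. `InputStream.read()` returns data bytes as values 0 to 255 and returns -1 only at the end of the stream, so a file whose last byte is 0xFF is not mistaken for the end. The sketch below is one possible way to copy everything up to EOF; `StreamCopy` and `copyUntilEof` are hypothetical names, and it relies on the server closing the connection (for example because of Connection: close).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper for illustration only.
class StreamCopy {

    // Copies all bytes from in to out until end of stream; returns the byte count.
    static long copyUntilEof(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        // read(byte[]) returns -1 only once the peer has closed the connection
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for connection.getInputStream(); note the 0xFF bytes
        byte[] data = {0x00, (byte) 0xFF, 0x0A, (byte) 0xFF};
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyUntilEof(new ByteArrayInputStream(data), sink);
        System.out.println(copied + " bytes copied");
    }
}
```

Keep in mind that the stream still starts with the status line and headers, which have to be stripped before the remaining bytes are saved as the file.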
khachik