3
  1. How does the InputStream.read(byte[]) method know if the "End of Stream" has been reached and return "-1" ?

  2. What are all the conditions for returning "-1" ?

  3. How to detect an "End of Stream" (without sending an integer which contains the total number of bytes to read before) ?

Example of use:

InputStream input = socket.getInputStream();
ByteArrayOutputStream baos = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
for(int size = -1; (size = input.read(buffer)) != -1; ) {
    baos.write(buffer, 0, size);
}
user207421
gokan
  • 2
    If it's a TCP socket, because the other side has sent a TCP handshake message to close the connection. – Jesper Jun 08 '15 at 14:56
  • And if it isn't a TCP socket it doesn't deliver -1 at all, because the only other socket Java supports is for UDP, which isn't a stream protocol and doesn't have an end of stream at all. – user207421 Jun 11 '21 at 06:10
  • @Nayuki Why did you remove the [tag:tcp] tag? That's what the question is *about.* – user207421 Jun 11 '21 at 09:28

2 Answers

2

InputStream is an abstract type with many implementations. A FileInputStream, for example, will return -1 if you have reached the end of the file. If it's a TCP socket, it will return -1 if the connection has been closed. It is implementation-dependent how end-of-stream is determined.
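For the TCP case, here is a minimal sketch of a peer closing the connection and the reader then seeing -1. The class name `EndOfStreamDemo` and the helper `roundTrip` are made up for illustration, and the loopback server exists only to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

class EndOfStreamDemo {

    // Same loop as in the question: keeps reading until read() returns -1,
    // which for a TCP socket happens once the peer has closed the connection.
    static byte[] readUntilEof(InputStream input) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int size;
        while ((size = input.read(buffer)) != -1) {
            baos.write(buffer, 0, size);
        }
        return baos.toByteArray();
    }

    // Hypothetical helper: a loopback server sends `message` and then closes
    // its socket; the client reads everything up to end of stream.
    static String roundTrip(String message) throws IOException, InterruptedException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread sender = new Thread(() -> {
                // Leaving this try block closes the peer socket, which is
                // what the client observes as end of stream.
                try (Socket peer = server.accept()) {
                    peer.getOutputStream().write(message.getBytes(StandardCharsets.UTF_8));
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            sender.start();
            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                byte[] received = readUntilEof(client.getInputStream());
                sender.join();
                return new String(received, StandardCharsets.UTF_8);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));
    }
}
```

If the sender kept its socket open instead of closing it, the loop in `readUntilEof` would block on the next `read` rather than return -1.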

Louis Wasserman
  • 191,574
  • 25
  • 345
  • 413
  • 3
    Judging from the title of the question, I think it's the socket implementation that is of interest here. – aioobe Jun 08 '15 at 15:05
-1

It does not.

When you try to read n bytes from a socket, the call may return before n bytes are ready, and the number of bytes actually read is returned. How does read() decide to return? Based on a timeout. The timeout value is referred to as SO_TIMEOUT in AbstractPlainSocketImpl.java. The actual read happens in native code, probably written in C, and SO_TIMEOUT defaults to whatever the native code uses. However, you can set the timeout value with Socket.setSoTimeout(millis).

SocketInputStream.java

        n = socketRead(fd, b, off, length, timeout);
        if (n > 0) {
            return n;
        }
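As a rough way to observe how SO_TIMEOUT interacts with `read()`, the sketch below sets a timeout with `Socket.setSoTimeout` and classifies what a single read does. The class name `ReadTimeoutDemo`, the method `classifyRead`, and the 200 ms value are illustrative assumptions, not part of the JDK source quoted above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ReadTimeoutDemo {

    // Hypothetical helper: opens a loopback connection and reports what one
    // read() call does: "eof" for -1, "data" for bytes, "timeout" for a timeout.
    static String classifyRead(boolean peerCloses) throws IOException {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket peer = server.accept()) {
            if (peerCloses) {
                peer.close(); // the client should now see end of stream
            }
            client.setSoTimeout(200); // SO_TIMEOUT in milliseconds (arbitrary value)
            InputStream in = client.getInputStream();
            try {
                int n = in.read(new byte[16]);
                return n == -1 ? "eof" : "data";
            } catch (SocketTimeoutException e) {
                // An expired timeout surfaces as an exception, not as -1.
                return "timeout";
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(classifyRead(true));
        System.out.println(classifyRead(false));
    }
}
```

In this sketch a closed peer yields -1, while an idle but still open peer yields a `SocketTimeoutException`, so the two cases can be told apart by the caller.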

If you look at the HTTP protocol, the client and server use the Content-Length header to tell each other where a request or response ends and where the next one begins. The ordering of the received bytes is taken care of by the TCP layer.
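The same idea of announcing the length up front can be sketched without HTTP. The example below is not HTTP itself but a simplified length-prefixed framing, assuming a 4-byte length followed by a UTF-8 payload; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixedFraming {

    // Writes a 4-byte big-endian length followed by the payload bytes.
    static void writeMessage(DataOutputStream out, String text) throws IOException {
        byte[] payload = text.getBytes(StandardCharsets.UTF_8);
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Reads the length, then exactly that many bytes. readInt and readFully
    // throw EOFException if the stream ends before the message is complete.
    static String readMessage(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] payload = new byte[length];
        in.readFully(payload);
        return new String(payload, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeMessage(out, "first");
        writeMessage(out, "second");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readMessage(in));
        System.out.println(readMessage(in));
    }
}
```

With a socket, the same two methods could wrap `socket.getOutputStream()` and `socket.getInputStream()`, which lets several messages share one connection without waiting for it to close.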

A socket stream does not have an end-of-stream check like the feof check for files. It's two-way communication, with both reads and writes. However, you can check whether bytes are available to read. TCP connections stay alive until either the client or the server chooses to close.

  • 1
    It does. TCP sockets deliver end of stream when the peer closes the connection. Your first and final paragraphs contradict this. – user207421 Jun 11 '21 at 06:11
  • ... and files do not have EOF markers. And `read()` does not decide on when to return based on a timeout if data or end of stream has already been received. This answer reads like mere guesswork. – user207421 Jun 11 '21 at 06:35
  • **SocketInputStream.java** --> int read(byte b[], int off, int length, int timeout) throws IOException --> `n = socketRead0(fd, b, off, length, timeout); ` **SocketInputStream .java** --> `private native int socketRead0(FileDescriptor fd, byte b[], int off, int len, int timeout) throws IOException;` **SocketInputStream.c** --> `nread = recv(fd, bufP, len, 0);` https://man7.org/linux/man-pages/man2/recv.2.html https://www.gnu.org/software/libc/manual/html_node/EOF-and-Errors.html – Janardhan B. Chinta Jun 11 '21 at 12:40
  • I dug further to see how the EOF is decided. File pointer keeps track of EOF via flags. **struct_FILE.h** --> `#define _IO_EOF_SEEN 0x0010` **struct_FILE.h** --> `#define __feof_unlocked_body(_fp) (((_fp)->_flags & _IO_EOF_SEEN) != 0)` **fileops.c** _IO_file_xsgetn --> `count = _IO_SYSREAD (fp, s, count); if (count == 0) fp->_flags |= _IO_EOF_SEEN;` From this point on it really gets gibberish, macros, vtables etc... . **In summary**, _IO_SYSREAD returns -1 indicates EOF. **libioP.h** --> `#define _IO_SYSREAD(FP, DATA, LEN) JUMP2 (__read, FP, DATA, LEN)` – Janardhan B. Chinta Jun 11 '21 at 13:36
  • I don't know what your first comment is supposed to prove. It doesn't. It's not much good putting illegible code into comments. Fix your answer. It's wrong. – user207421 Jun 12 '21 at 02:14