OutputStream vs BufferedOutputStream

Question

In Java 8, is there any real difference between:

try (OutputStream os = Files.newOutputStream(path)) {
    [...]
}

and

try (OutputStream os = new BufferedOutputStream(Files.newOutputStream(path))) {
    [...]
}

I read this SO question and its answers, but they confused me a lot.

PS: Has anything changed in Java 11?

Holger · Answer 1 · 2019-04-29T07:23:35.117

As explained in this answer, a buffered stream is supposed to reduce the number of system calls. This is only relevant if the application makes a lot of small read or write requests, each resulting in a system call. This is what the linked answer means by “inefficient”.

By using a significantly larger buffer, which can be read or written with a single call, and fulfilling application requests by copying from or to that buffer, you reduce the number of system calls. This improves performance if the saved system calls are more expensive than the copying overhead introduced.

So the reason why it is not always better to use a buffered stream is that not every application makes such small requests. When an application makes reasonably sized requests, the best the buffered stream can do is get out of the way: if it has an empty buffer and the application makes a request of the same size as the buffer or larger, the buffered stream passes the request directly to the underlying stream.

But if the application’s buffer is only slightly smaller, the buffered stream will do its job of buffering, introducing additional copying overhead. As said, you only gain an advantage if you actually save system calls and, depending on the architecture, you might have to say, “…if you actually save a significant number of system calls”. A larger buffer is not an improvement per se.
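The effect described above can be made visible with a rough sketch that counts how many write calls reach the underlying stream. `CountingSink` and `callsReachingSink` are hypothetical names introduced only for this illustration; the sink discards the data and merely stands in for a stream whose every write would be a system call.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class WriteCallDemo {
    /** Hypothetical sink: discards the data and counts each write call it receives. */
    static final class CountingSink extends OutputStream {
        int calls;
        @Override public void write(int b) { calls++; }
        @Override public void write(byte[] b, int off, int len) { calls++; }
    }

    /**
     * Writes `total` bytes in requests of at most `chunk` bytes and returns
     * how many write calls arrived at the sink, including those caused by close().
     */
    static int callsReachingSink(int total, int chunk, boolean buffered) throws IOException {
        CountingSink sink = new CountingSink();
        // Explicit buffer size of 8192 bytes, matching the documented default.
        OutputStream os = buffered ? new BufferedOutputStream(sink, 8192) : sink;
        byte[] data = new byte[chunk];
        try (OutputStream out = os) {
            for (int written = 0; written < total; written += chunk) {
                out.write(data, 0, Math.min(chunk, total - written));
            }
        }
        return sink.calls;
    }

    public static void main(String[] args) throws IOException {
        // Many small requests: buffering collapses them into a few large writes.
        System.out.println("10-byte requests, unbuffered: " + callsReachingSink(100_000, 10, false));
        System.out.println("10-byte requests, buffered:   " + callsReachingSink(100_000, 10, true));
        // Requests as large as the buffer: the buffered stream just passes them through.
        System.out.println("8192-byte requests, unbuffered: " + callsReachingSink(81_920, 8192, false));
        System.out.println("8192-byte requests, buffered:   " + callsReachingSink(81_920, 8192, true));
    }
}
```

With 10-byte requests the unbuffered variant forwards 10,000 calls while the buffered one forwards 13; with 8192-byte requests both forward 10, so the wrapper saves nothing.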

A simple example: say you just want to write 1,000 bytes to a file, like

byte[] data = new byte[1000]; // stand-in for something producing an array of 1,000 bytes
try (OutputStream os = Files.newOutputStream(path)) {
    os.write(data);
}

If you wrap this output stream in a BufferedOutputStream, you get a buffer of the default size of 8192 bytes. Since the buffered stream has no knowledge of how much you are going to write in total, it copies the request data into its buffer, as the request is smaller than the buffer, to be flushed (written) during the close operation. So in the end, you don’t save any system call, but you do pay the copying overhead.
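A small sketch of this deferral, using a `ByteArrayOutputStream` as a stand-in for the file so the amount of data that has actually arrived can be inspected. The class and method names are made up for the illustration, and the 8192-byte buffer size is passed explicitly rather than relying on the default.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class DeferredWriteDemo {
    /**
     * Writes `payload` bytes through a BufferedOutputStream and returns
     * { bytes that reached the target before close, bytes in the target after close }.
     */
    static int[] sizesBeforeAndAfterClose(int payload) throws IOException {
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        int before;
        try (BufferedOutputStream os = new BufferedOutputStream(target, 8192)) {
            os.write(new byte[payload]);
            before = target.size(); // nothing has arrived yet if the data was only buffered
        }
        return new int[] { before, target.size() };
    }

    public static void main(String[] args) throws IOException {
        int[] small = sizesBeforeAndAfterClose(1000);
        System.out.println("1000 bytes: before close = " + small[0] + ", after close = " + small[1]);
        int[] large = sizesBeforeAndAfterClose(8192);
        System.out.println("8192 bytes: before close = " + large[0] + ", after close = " + large[1]);
    }
}
```

For the 1,000-byte payload the target is still empty before the close, which is also why a buffered stream can be the wrong choice when timely writing matters. The 8192-byte payload is passed through immediately.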

So a buffered stream is not always more efficient; the buffering may even degrade performance in some cases. Also, sometimes an application is not interested in the highest performance, but in timely writing to the underlying media. It is easier to wrap an OutputStream to get buffering when needed than to opt out of buffering if the stream is already a BufferedOutputStream.

When you look at the NIO Channel API, introduced in JDK 1.4, you’ll notice that there are no buffered channels. Instead, it doesn’t offer methods for reading or writing a single byte; further, it forces programmers to use a ByteBuffer, guiding them to separate I/O from the processing of the data. This is the preferred way to go.
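As a minimal sketch of what that looks like, the 1,000-byte example could be written through a `FileChannel` roughly as follows. The class name, method name, and choice of open options are assumptions made for the illustration, not something prescribed by the answer.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ChannelWriteDemo {
    /** Writes `data` to `path` through a FileChannel and returns the resulting file size. */
    static long writeViaChannel(Path path, byte[] data) throws IOException {
        // The data is prepared up front; the channel only ever sees whole buffers.
        ByteBuffer buffer = ByteBuffer.wrap(data);
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            // write() may consume only part of the buffer, so loop until it is drained.
            while (buffer.hasRemaining()) {
                channel.write(buffer);
            }
        }
        return Files.size(path);
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("channel-demo", ".bin");
        try {
            System.out.println("bytes written: " + writeViaChannel(path, new byte[1000]));
        } finally {
            Files.delete(path);
        }
    }
}
```

There is no single-byte `write` to misuse here: the processing that fills the buffer and the I/O that drains it are separate steps.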

Hmm. `java.nio` seems extremely complicated. Is there really an advantage to using it instead of classical `java.io`? Some benchmark, anything? – Marco Sulla, Apr 29 '19 at 22:20
NIO is not complicated. It has more features, a lot of things the old `java.io` doesn’t have, so there’s no comparison at all. For the comparable stuff, NIO is often even easier than the old API. For certain tasks involving large data, there can be a significant performance advantage. But regardless of performance, once you need a feature missing in `java.io`, you’ll find that using NIO consistently is easier than jumping back and forth between the APIs. – Holger, Apr 30 '19 at 06:44

score 5 · Accepted Answer · answered Apr 27 '19 at 22:44

The difference is that an unbuffered stream makes a write call to the underlying system every time you give it data to write, even a single byte, whereas a buffered output stream stores the data in a buffer and makes the system call only when the buffer is full or when the stream is flushed or closed. This improves performance by reducing the number of I/O operations.

https://docs.oracle.com/javase/8/docs/api/java/io/BufferedOutputStream.html https://docs.oracle.com/javase/8/docs/api/java/io/OutputStream.html

maxap

Sounds like it's always better to use a `BufferedOutputStream`. Why isn't there a simpler way to create it? – Marco Sulla Apr 28 '19 at 07:04
Furthermore, I don't understand the difference between efficient and inefficient reading in the SO answer I linked – Marco Sulla Apr 28 '19 at 07:11