5

I have read that the BufferedOutputStream class improves efficiency and should be used with FileOutputStream like this:

BufferedOutputStream bout = new BufferedOutputStream(new FileOutputStream("myfile.txt"));

For writing to the same file, the statement below also works:

FileOutputStream fout = new FileOutputStream("myfile.txt");

But the recommended way is to use a buffer for reading/writing operations, and that's why I too prefer to use a buffer.

But my question is: how do I measure the performance of the above two statements? Is there any tool, or something of that kind (I don't know exactly what), that would be useful for analysing their performance?

As I am new to the Java language, I am very curious to know about this.
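One simple way to get a first impression is to time both streams yourself. This is only a rough sketch, not a rigorous benchmark (JIT warm-up and OS caching will skew single runs); the class and method names here are my own, invented for illustration:

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class WriteBenchmark {
    // Writes `count` single bytes through the given stream and returns elapsed nanoseconds.
    static long timeWrite(OutputStream out, int count) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            out.write(i & 0xFF); // one-byte writes: the worst case for an unbuffered stream
        }
        out.close(); // also flushes any buffered bytes
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        int count = 1_000_000;
        Path plain = Files.createTempFile("plain", ".bin");
        Path buffered = Files.createTempFile("buffered", ".bin");

        long plainNanos = timeWrite(new FileOutputStream(plain.toFile()), count);
        long bufferedNanos = timeWrite(
                new BufferedOutputStream(new FileOutputStream(buffered.toFile())), count);

        System.out.printf("unbuffered: %d ms, buffered: %d ms%n",
                plainNanos / 1_000_000, bufferedNanos / 1_000_000);
    }
}
```

For anything more serious than a quick sanity check, a proper benchmarking harness is the better tool, because it handles warm-up and repetition for you.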

Chaitanya Ghule
  • 451
  • 1
  • 5
  • 11

2 Answers

4

Buffering is only helpful if you are doing inefficient reading or writing. For reading, it's helpful for letting you read line by line, even when you could gobble up bytes / chars faster just using read(byte[]) or read(char[]). For writing, it allows you to collect the pieces of what you want to send through I/O in the buffer, and to send them only on flush (see, for example, the autoFlush argument of the PrintWriter constructors).

But if you are just trying to read or write as fast as you can, buffering doesn't improve performance.

For an example of efficient reading from a file:

File f = ...;
FileInputStream in = new FileInputStream(f);
byte[] bytes = new byte[(int) f.length()]; // f.length() needs to be less than 2 gigs :)
in.read(bytes); // a full read isn't guaranteed by the API, but I've found it works in every situation I've tried
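If you want the full read to be guaranteed rather than relying on that behavior, `DataInputStream.readFully` (or an explicit read loop) does it; a minimal sketch, with a helper name of my own choosing:

```java
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

public class ReadWholeFile {
    // Reads every byte of the file into one array.
    // readFully blocks until the array is filled and throws EOFException if it can't be.
    static byte[] readAll(File f) throws IOException {
        byte[] bytes = new byte[(int) f.length()]; // file must be smaller than 2 GB
        try (DataInputStream in = new DataInputStream(new FileInputStream(f))) {
            in.readFully(bytes);
        }
        return bytes;
    }
}
```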

Versus inefficient reading:

File f = ...;
BufferedReader in = new BufferedReader(new FileReader(f));
String line = null;
while ((line = in.readLine()) != null) {
  // If every readLine call read directly from the FS / hard drive,
  // it would slow things down tremendously. That's why having a buffer
  // capture the file contents, and effectively reading from the buffer,
  // is more efficient
}
ControlAltDel
  • 33,923
  • 10
  • 53
  • 80
  • What do you mean by inefficient reading or writing? @ControlAltDel – Chaitanya Ghule Apr 20 '17 at 19:31
  • @user7876966 I have updated my answer with examples of efficient and inefficient reading – ControlAltDel Apr 20 '17 at 19:44
  • Do you mean that reading from a file already available in the FS / HDD is inefficient, and reading from the command line (i.e. user input data) is efficient? Is it depending on this type of reading / writing that we should switch between BufferedReader / BufferedWriter and FileInputStream / FileOutputStream? @ControlAltDel – Chaitanya Ghule Apr 20 '17 at 20:12
  • @user7876966 No, you are not understanding what I've written – ControlAltDel Apr 20 '17 at 20:27
  • [This article](https://orangepalantir.org/topicspace/show/83) shows that the use of buffered streams reduces system calls, so in theory it should be always better to use them. – Marco Sulla Apr 28 '19 at 07:37
  • @MarcoSulla The buffering here is definitely improving performance. But this is because the algorithm uses inefficient I/O writing (DataOutputStream.writeInt) repeatedly. Also, the writing is sporadic: only when a new prime is found. So in this case, it is definitely more efficient to collect these writes together and then write them to disk together. Contrast this with writing the bytes of an image, where you already have all the image's bytes in a byte[]. In that case, writing the byte[] to I/O won't benefit from using a buffered writer – ControlAltDel Apr 28 '19 at 13:18
3

These numbers came from a MacBook Pro laptop using an SSD.

  • BufferedFileStreamArrayBatchRead (809716.60-911577.03 bytes/ms)
  • BufferedFileStreamPerByte (136072.94 bytes/ms)
  • FileInputStreamArrayBatchRead (121817.52-1022494.89 bytes/ms)
  • FileInputStreamByteBufferRead (118287.20-1094091.90 bytes/ms)
  • FileInputStreamDirectByteBufferRead (130701.87-956937.80 bytes/ms)
  • FileInputStreamReadPerByte (1155.47 bytes/ms)
  • RandomAccessFileArrayBatchRead (120670.93-786782.06 bytes/ms)
  • RandomAccessFileReadPerByte (1171.73 bytes/ms)

Where there is a range in the numbers, it varies based on the size of the buffer being used. A larger buffer results in more speed up to a point, typically somewhere around the size of the caches within the hardware and operating system.

As you can see, reading bytes individually is always slow. Batching the reads into chunks is easily the way to go. It can be the difference between 1k per ms and 136k per ms (or more).
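As a sketch of what "batching the reads into chunks" means in code (the helper name and 8 KB chunk size are my own choices, not taken from the benchmark code):

```java
import java.io.IOException;
import java.io.InputStream;

public class ChunkedRead {
    // Drains a stream in fixed-size chunks instead of one byte per read() call.
    // Each in.read(chunk) is one I/O request that can return up to chunkSize bytes.
    static long readInChunks(InputStream in, int chunkSize) throws IOException {
        byte[] chunk = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(chunk)) != -1) {
            total += n; // process chunk[0..n) here
        }
        return total;
    }
}
```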

These numbers are a little old, and they will vary wildly by setup, but they should give you an idea. The code for generating the numbers can be found here; edit Main.java to select the tests that you want to run.

An excellent (and more rigorous) framework for writing benchmarks is JMH. A tutorial for learning how to use JMH can be found here.

Chris K
  • 11,622
  • 1
  • 36
  • 49