Before I ask my question, I am fully aware that leaving an input stream open can cause a memory leak, and therefore doing so is bad practice.

Consider the following preconditions:

  • Only a single file needs to be read
  • The file in question is a text file which contains rows of data
  • This file is quite large: 50MB or more
  • The file is read many, many times during a test run

The reason I am asking is that in my test automation suite, the same file has to be read over and over again to validate certain data fields.

In its current state, the data reader function opens a BufferedReader stream, reads and returns the data, and then closes the stream.

However, due to the file size and the number of times the file is read, I don't know if leaving the stream open would be beneficial. If I'm being honest, I don't know if the file size affects the opening of the stream at all.

So in summary, given the above listed preconditions, will leaving open a BufferedReader input stream improve overall performance? And is a memory leak still possible?

khelwood
Rusty Shackleford
  • 2
    How do you intend to read the file again, from the beginning, with a reader that you've already used to read a previous version of the file until the end? What takes time is to read the 50 MB of data. Not opening and closing a reader. If the file is read-only, read the file once and for all, instead of reading it again and again. – JB Nizet Mar 04 '18 at 12:55
  • From the beginning, with a reader class which has retained the input stream reference. I can't read the file just once, as it's part of a JUnit test automation suite which dynamically reads the file depending upon the test being run – Rusty Shackleford Mar 04 '18 at 12:58
  • 2
    Then the obvious optimization is not to leave a buffered reader open. It's to read the file once, and store its content in memory. Note that, when opening a reader that reads from an InputStream, reading characters from the reader reads bytes from the underlying stream, and thus, once you've read all the characters, there is nothing to read anymore from the underlying input stream. – JB Nizet Mar 04 '18 at 12:59
  • If you want to know how something might affect performance, [benchmark](https://stackoverflow.com/questions/504103/how-do-i-write-a-correct-micro-benchmark-in-java). – Bernhard Barker Mar 04 '18 at 13:02
  • Ok, so the answer to my question is no. Can you make your comment an answer so I can accept it? – Rusty Shackleford Mar 04 '18 at 13:03
  • @Dukeling Sorry, this is for a client and I'm not on the client site at the moment so cannot benchmark, but will do tomorrow. It's just something I've been thinking about over the weekend – Rusty Shackleford Mar 04 '18 at 13:04

2 Answers

If you have enough memory to do this, then you will probably get best performance by reading the entire file into a StringBuilder, turning it into a String, and then repeatedly reading from the String via a StringReader.
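A minimal sketch of that approach, assuming the file is UTF-8 encoded and fits comfortably in the heap. The class name `CachedTextFile` and its methods are hypothetical, not taken from the question's code:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: reads the file once, then serves fresh readers from memory.
final class CachedTextFile {
    private final String content;

    private CachedTextFile(String content) {
        this.content = content;
    }

    // Reads the whole file into a StringBuilder and freezes it as a String.
    // Assumption: the file is UTF-8; adjust the charset to match the real data.
    static CachedTextFile load(Path path) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader in = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            char[] buf = new char[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return new CachedTextFile(sb.toString());
    }

    // Each caller gets an independent reader positioned at the start of the data.
    BufferedReader newReader() {
        return new BufferedReader(new StringReader(content));
    }
}
```

In a test suite, one such instance could be loaded once (for example in a static field) and each test could call `newReader()` instead of reopening the file.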

However, you may need 6 or more times as many bytes of (free) heap space as the size of the file.

  • 2 x to allow for byte -> char expansion
  • 3 x because of the way that a StringBuilder buffer expands as it grows.

You can save space by holding the file in memory as bytes (not chars), and by reading into a byte[] of exactly the right size. But then you need to repeat the bytes -> chars decoding each time you read from the byte[].
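The byte-based variant might look roughly like this. It is a sketch under the assumption that the file is UTF-8 encoded; the class name `CachedBytesFile` is hypothetical. `Files.readAllBytes` returns an array holding exactly the file's bytes, and every reader handed out decodes them again:

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: keeps the raw bytes in memory and decodes on every read.
final class CachedBytesFile {
    private final byte[] bytes;

    private CachedBytesFile(byte[] bytes) {
        this.bytes = bytes;
    }

    // One disk read; the array is as large as the file, with no spare capacity.
    static CachedBytesFile load(Path path) throws IOException {
        return new CachedBytesFile(Files.readAllBytes(path));
    }

    // Wraps the shared array without copying it; the bytes -> chars decoding
    // is repeated for each reader. Assumption: the data is UTF-8.
    BufferedReader newReader() {
        return new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
    }
}
```

The trade-off is the one described above: less heap used, at the cost of decoding the bytes on every pass.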

You should benchmark the alternatives if you need ultimate performance.

Also look at using the java.nio Buffer classes (for example ByteBuffer and CharBuffer) to reduce copying.


Re your idea: keeping the BufferedReader open and using mark and reset would give you a small speedup compared with closing and reopening. But the larger your file is, the smaller the speedup is in relative terms. For a 50MB file, I suspect that the speedup would be insignificant.
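For illustration, a mark/reset version could be sketched as follows, assuming a UTF-8 file smaller than 2 GB. The class name `RewindableFileReader` is hypothetical. Note that the read-ahead limit passed to `mark` has to cover the whole file, so the BufferedReader may end up buffering all of the file's characters in memory anyway:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: keeps one BufferedReader open and rewinds it with reset().
final class RewindableFileReader implements AutoCloseable {
    private final BufferedReader reader;

    RewindableFileReader(Path path) throws IOException {
        long size = Files.size(path);
        if (size >= Integer.MAX_VALUE - 1) {
            throw new IOException("File too large for mark/reset: " + path);
        }
        // Assumption: UTF-8 data, where the char count never exceeds the byte count.
        reader = Files.newBufferedReader(path, StandardCharsets.UTF_8);
        // The limit must be larger than the number of chars read before reset(),
        // so use the file size in bytes plus one.
        reader.mark((int) size + 1);
    }

    // Returns the shared reader, repositioned at the marked start of the file.
    BufferedReader rewind() throws IOException {
        reader.reset();
        return reader;
    }

    @Override
    public void close() throws IOException {
        reader.close();
    }
}
```

Callers must not close the reader returned by `rewind()`; only the wrapper is closed, once, at the end of the test run.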

Stephen C

Yes, not closing a stream could improve performance in theory, as the object will not trigger garbage collection, assuming you're not de-referencing the BufferedReader. Also, the underlying resources won't need to be sync'd. See similar answer: Performance hit opening and closing filehandler?

However, not closing your BufferedReader will result in a memory leak and you'll see the heap increase.

I suggest, as others have in comments and answers, just reading the file into memory and using that. A 50MB file isn't that much, and reading from a String once it is in memory will be much faster than re-reading the file.

Ray