
I am using a BufferedReader (br) and a BufferedWriter (bw) to read a very large file, do calculations, and write the results to an output file. The output file will have the same number of lines as the input file.

What I am currently doing is reading the input file line by line (while br.readLine() != null), doing the calculation for the line I just read, writing the result of that single calculation to the output file (bw.write), and then writing a newline (bw.newLine()). This loop repeats until it reaches the end of the file.
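For reference, here is a minimal sketch of that loop, assuming hypothetical file names (input.txt, output.txt) and a placeholder calculate method standing in for the real work:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class LineProcessor {

    // Hypothetical stand-in for the per-line calculation described above.
    static String calculate(String line) {
        return line;
    }

    public static void main(String[] args) throws IOException {
        try (BufferedReader br = new BufferedReader(new FileReader("input.txt"));
             BufferedWriter bw = new BufferedWriter(new FileWriter("output.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(calculate(line)); // one result line per input line
                bw.newLine();
            }
        }
    }
}
```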

This works, but it takes 1 second to process 3500 input lines. I have been told that this is too slow, as the code will be tested with MUCH LARGER files. What is the best practice I should use (for both reading and writing)? Can I keep my results as chunks in a buffer until some specific limit and then write them to the actual file?

EDIT: I think the reading/calculation part is fine, but what about the writing: is there a good way to keep parts in a buffer and then write them to the output file, instead of writing on every iteration? A sketch of what I mean follows.
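To illustrate the chunking idea from the EDIT: a sketch, assuming the same placeholder names as above, that accumulates results in a StringBuilder and hands them to the writer in batches. Note that BufferedWriter already buffers internally (8192 chars by default), so explicit chunking mainly saves per-call overhead and may not change the timing much:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class ChunkedWriter {

    // Hypothetical stand-in for the real calculation.
    static String calculate(String line) {
        return line;
    }

    public static void main(String[] args) throws IOException {
        final int CHUNK_LINES = 10_000; // arbitrary batch size, tune by measuring
        StringBuilder chunk = new StringBuilder();
        int count = 0;

        try (BufferedReader br = new BufferedReader(new FileReader("input.txt"));
             BufferedWriter bw = new BufferedWriter(new FileWriter("output.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                chunk.append(calculate(line)).append(System.lineSeparator());
                if (++count % CHUNK_LINES == 0) {
                    bw.write(chunk.toString()); // hand a whole batch to the writer
                    chunk.setLength(0);         // reuse the builder
                }
            }
            bw.write(chunk.toString()); // write whatever is left over
        }
    }
}
```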

Yano
    A couple questions: 1) Are you sure the time is spent reading and writing the data, rather than doing the calculations? E.g. if you just copy the data from one file to another without doing any other work, does it still take as long? 2) How are you timing this and does it scale linearly? Maybe you're including the time it takes your program to start, which is a constant cost not related to the size of the file. I'd suggest trying a file twice as long to see if it takes twice the time or not. – user94559 Jul 15 '16 at 19:45
  • That's the best practice. There's a question about it around here somewhere... – 4castle Jul 15 '16 at 19:45
  • Possible duplicate of [How to read a large text file line by line using Java?](http://stackoverflow.com/questions/5868369/how-to-read-a-large-text-file-line-by-line-using-java) – 4castle Jul 15 '16 at 19:48
  • @smarx it is the total time for the whole program. I just set two timers and print the time difference to learn how much it is taking. A double file is also taking almost double the time. – Yano Jul 15 '16 at 19:52
  • @4castle how about writing after keeping chunks in a buffer or something like this? – Yano Jul 15 '16 at 19:58
  • The calculations are not trivial.. they involve Java collections and maps, but I optimized the calculation part carefully. – Yano Jul 15 '16 at 20:00
  • Even though the calculations are optimized, they still keep the read process from going as fast as possible; this time adds up. Can you read the data, let another thread do the calcs, and another write the results back? – ChiefTwoPencils Jul 15 '16 at 20:05
  • @ChiefTwoPencils No I am not allowed to use multi-threading. And yes sure, the calculations take time and memory. I am just wondering how I can optimize the I/O part. Especially the output. – Yano Jul 15 '16 at 20:07
  • Have you tried different buffer sizes for your BufferedReader and BufferedWriter, and what was the outcome? Without seeing some code, I don't see what else can be optimized – Lolo Jul 16 '16 at 10:06
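Picking up Lolo's suggestion, a sketch showing how explicit buffer sizes can be passed to both constructors; with the calculation removed, it also doubles as the pure-copy baseline smarx asked about in the first comment. The 64 KB size is an arbitrary value to benchmark against the default, not a recommendation:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class BufferSizeDemo {
    public static void main(String[] args) throws IOException {
        // Both constructors accept an explicit buffer size in chars
        // (the default is 8192); 1 << 16 is just one value to try.
        try (BufferedReader br = new BufferedReader(new FileReader("input.txt"), 1 << 16);
             BufferedWriter bw = new BufferedWriter(new FileWriter("output.txt"), 1 << 16)) {
            String line;
            while ((line = br.readLine()) != null) {
                bw.write(line); // plain copy: no calculation, to isolate I/O time
                bw.newLine();
            }
        }
    }
}
```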

0 Answers