
I am reading a set of files (about 1.8 million files, roughly 20 KB each) into Strings for further processing, using the following:

byte[] encoded = Files.readAllBytes(Paths.get(path));
return encoding.decode(ByteBuffer.wrap(encoded)).toString();

I am running into a problem that I don't understand. The first time I run this process it takes about 10-15 seconds. On subsequent executions it takes about 1.5 seconds. If I try again a few hours later, the same thing happens: the first run is slow and the runs after it are very fast. This suggests some kind of caching (or something else), but I can't find an explanation for this behavior.

Any help would be appreciated. Thank you

DT7
  • I think this is what the OS is doing. When you load something, it remembers that that part of memory was loaded from that file, and if neither the file nor the memory has changed in the meantime, it reuses the already-loaded data... – libik Oct 11 '13 at 16:47
  • This sounds like it's Java's JIT: in general, Java will interpret bytecode until it determines that the code is worth optimizing, at which point it will compile the bytecode to optimized machine code. It isn't an unusual occurrence that Java programs will be slow initially and speed up: it's _normal._ – Louis Wasserman Oct 11 '13 at 17:22
  • There's nothing strange here: the OS caches file data, the hard drive may also go into a power-saving mode after being idle for some time, and there are various other reasons for such behavior. On the Java side, classes are loaded the first time they are used, which has a similar effect. It would be surprising if it were different. – Holger Oct 11 '13 at 17:33
  • Thank you for the replies. I wonder how I should do proper performance testing. I was trying different libraries, and different ways to read files into strings to see which would be the best/fastest. The "more realistic" execution time is the "first one", but I don't have a good way to get to that state. Do you have any suggestions? Thank you for your help. I do appreciate it. – Maria Muslea Oct 11 '13 at 18:46
  • Look at these questions: http://stackoverflow.com/questions/11610180/how-to-measure-file-read-speed-without-caching and http://stackoverflow.com/questions/14731701/how-to-measure-disk-speed-in-java-for-benchmarking – Timofey Gorshkov Oct 12 '13 at 09:11
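
One way to see which of the effects mentioned in the comments dominates is to time several passes over the same files inside a single JVM run. The following is only a minimal sketch: the class name ReadTimer, the method timePass, the directory argument, and the choice of UTF-8 and three passes are all assumptions made for illustration, not part of the original question. The readFile method reuses the two lines from the question unchanged.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical timing harness; names are illustrative, not from the question.
class ReadTimer {

    // Same approach as in the question: read all bytes, then decode them.
    static String readFile(Path path, Charset encoding) throws IOException {
        byte[] encoded = Files.readAllBytes(path);
        return encoding.decode(ByteBuffer.wrap(encoded)).toString();
    }

    // Reads every file once and returns the elapsed wall-clock time in nanoseconds.
    static long timePass(List<Path> files, Charset encoding) throws IOException {
        long totalChars = 0;
        long start = System.nanoTime();
        for (Path p : files) {
            totalChars += readFile(p, encoding).length();
        }
        long elapsed = System.nanoTime() - start;
        // Consume the result so the decoding work is not dead code.
        if (totalChars < 0) {
            throw new IllegalStateException("unexpected negative character count");
        }
        return elapsed;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the files sit directly in one directory, given as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        List<Path> files;
        try (Stream<Path> entries = Files.list(dir)) {
            files = entries.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        // Assumption: three passes are enough to see the cold/warm difference.
        for (int pass = 1; pass <= 3; pass++) {
            long ns = timePass(files, StandardCharsets.UTF_8);
            System.out.printf("pass %d: %d files in %.1f ms%n", pass, files.size(), ns / 1e6);
        }
    }
}
```

Within one run, pass 1 includes class loading, JIT warm-up, and any cold disk reads, while later passes show the warmed-up state. If pass 1 of a second run started right after the first is already fast, that points to the OS file cache rather than the JVM, since JIT and class-loading state do not survive between JVM runs. Getting a truly cold measurement on demand generally needs an OS-specific way to clear the file cache, which the questions linked above discuss.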
