
My problem: I'm trying to write a function that goes through a directory and returns the time at which the most recently modified file was changed. The directory holds roughly 2,500 files of about 0.5 MB each. I don't care when each individual file was modified; I just need the single most recent modification timestamp. For example:

File 1 - 1392567840
File 2 - 1392567841
File 2403 - 1392567849
File 7 - 1392567850

In this case, File 7 has the most recent modification timestamp. What I have found, however, is that File.lastModified() is very slow. This was also reported in another Stack Overflow post, though there in the context of file copies.

Prior research: the earlier question "File.lastModified() painfully slow!"

I searched this forum and read suggestions to use multiple threads for the last-modified lookups. However, I was wondering whether there is a simpler way, since I only need the most recently modified value. I think that makes it a much simpler problem, but I'm not sure whether I can avoid the individual lookups :(

Some code for investigation:

import java.io.File;

// Class name is illustrative; the original snippet showed only the two methods.
public class LastModifiedBenchmark {

    public static void main(String[] args) {
        int cnt = 10;
        long total = 0; // accumulated elapsed time in nanoseconds
        for (int i = 0; i < cnt; i++) {
            long start = System.nanoTime();
            System.out.println(GetLastModifiedTimeOfFiles("./data/"));
            long end = System.nanoTime();
            total += (end - start);
        }
        System.out.println("The average time it took was: " + (total / (double) cnt) / 1_000_000 + " ms to complete!");
    }

    public static long GetLastModifiedTimeOfFiles(String path) {
        File folder = new File(path);
        File[] listOfFiles = folder.listFiles();
        if (listOfFiles == null) { // path is not a directory or could not be read
            return 0;
        }
        // This variant only calls getName(); calling lastModified() here
        // instead gives the second measurement.
        StringBuilder sb = new StringBuilder();
        for (File file : listOfFiles) {
            sb.append(file.getName()).append(" ");
        }
        System.out.println(sb.length() + " chars");

        long time = 0; // placeholder: no timestamp is computed in this variant
        return time;
    }
}

Results:

Average of 22 ms when using .getName()

Average of 631 ms when using .lastModified()

So 1) Why would .getName() perform ~30x faster than .lastModified()?

2) Is there some alternative approach as I just need the single most recently modified file timestamp?

3) Is there a way I could just get an array returned of all the modified timestamps for the files in the directory as opposed to individual lookups?

Once again, I know there is a similar post. I decided it was too old (from 2010) to rely on, so I hope asking again is not a problem. My apologies if I am in violation of anything.

Thank you for all your help!

user2891729
  • Since you just need to read the last modified date, you can probably read it concurrently. File operations have some overhead and that might improve the performance. You could create a `Runnable` instance with a `run()` method which reads `lastModified()` (and compares, reads, anything you can do concurrently). Then you read each file in a loop and run the instance in a thread. (You can create an `ExecutorService e = Executors.newCachedThreadPool();` and then run for each file `e.execute( runnableInstance );`.) – helderdarocha Feb 16 '14 at 17:14
  • I do not think you can do something about it. If you need queries for file system then you need a special file system with query support or you have to store file metadata in DB. – jbaliuka Feb 16 '14 at 18:18
  • Out of curiosity: Since you are accessing the file system and there is a bunch of native code in between, what operating system and what file system did you use for your micro benchmark? – Hendrik Jul 31 '14 at 16:02
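
The concurrent lookup suggested in the first comment could be sketched roughly as follows. This is a minimal, untested-for-speed sketch, not a measured fix: the class name `LatestModified`, the method `latestModified`, and the `threads` parameter are all hypothetical, and it uses a fixed-size pool with `Callable` tasks rather than the `newCachedThreadPool()` and `Runnable` combination mentioned in the comment, so that each task can return its timestamp. Whether it is faster at all likely depends on the operating system and file system.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class, named for illustration only.
public class LatestModified {

    // Returns the newest lastModified() value (milliseconds since the epoch)
    // among the direct entries of `path`, or 0 if the directory is empty
    // or cannot be listed.
    public static long latestModified(String path, int threads) {
        File[] files = new File(path).listFiles();
        if (files == null || files.length == 0) {
            return 0L;
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit one lookup per file so the calls can overlap.
            List<Future<Long>> results = new ArrayList<>();
            for (File f : files) {
                Callable<Long> task = f::lastModified;
                results.add(pool.submit(task));
            }
            // Collect the results and keep only the maximum.
            long latest = 0L;
            for (Future<Long> r : results) {
                latest = Math.max(latest, r.get());
            }
            return latest;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while reading timestamps", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a timestamp lookup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

A call such as `LatestModified.latestModified("./data/", 8)` could replace the body of the benchmark method for a comparison; the pool size of 8 is an arbitrary starting point, and the individual `lastModified()` lookups still happen, just in parallel.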
