
I have code which deletes all of the files in a directory except for the n most recently modified ones. The code gets a list of File objects from a directory, sorts them using a comparator which looks at File.lastModified(), then deletes the appropriate sublist.

When we upgraded to Java 7, the program started throwing "java.lang.IllegalArgumentException: Comparison method violates its general contract!". I suspect that this is because a file is modified (normal behavior) before sorting is completed, so the comparator returns inconsistent values as it checks each file's last modified time.

My question is, how would you solve this problem and delete the right files?

One suggestion I read was to store each file and its last modified time in a map before sorting, so that comparisons look up the last modified time from the map. However, if a file changes mid-sort, the map isn't updated, so wouldn't you end up deleting the wrong file?

Another idea I thought of was using Java NIO's file watching (WatchService) to keep a sorted list and re-sort whenever a file changes. But that seems fairly complicated.

I also thought of a brute-force method: wrapping the sort in a try-catch statement and retrying the whole sort if it runs into the comparison method violation.

Lastly, I could just set the java.util.Arrays.useLegacyMergeSort property and go back to the Java 6 behavior of silently ignoring the inconsistency.

Wei
  • You've got a natural race condition: what do you expect to happen if your code decides to delete a file, and then it gets modified *just* before you call `delete`? If that's a problem, no amount of sorting is going to help. – Jon Skeet Dec 06 '13 at 18:32

2 Answers


Taken from my answer to the more generic question "Best way to list files in Java, sorted by Date Modified?"

Java 8+

private static List<Path> listFilesOldestFirst(final String directoryPath) throws IOException {
    try (final Stream<Path> fileStream = Files.list(Paths.get(directoryPath))) {
        return fileStream
            .map(Path::toFile)
            .collect(Collectors.toMap(Function.identity(), File::lastModified))
            .entrySet()
            .stream()
            .sorted(Map.Entry.comparingByValue())
//            .sorted(Collections.reverseOrder(Map.Entry.comparingByValue()))  // replace the previous line with this line if you would prefer files listed newest first
            .map(Map.Entry::getKey)
            .map(File::toPath)  // remove this line if you would rather work with a List<File> instead of List<Path>
            .collect(Collectors.toList());
    }
}
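
To tie this back to the question's goal of keeping only the n newest files, here is a minimal sketch of picking the deletion candidates from an oldest-first list such as the one returned above. The class name `RetentionSketch`, the method `selectForDeletion`, and the parameter `keep` are hypothetical names chosen for illustration, not part of the original answer.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class RetentionSketch {
    private RetentionSketch() {}

    /**
     * Given items sorted oldest first, returns the ones to delete so that
     * only the newest `keep` items remain. Hypothetical helper for illustration.
     */
    static <T> List<T> selectForDeletion(final List<T> oldestFirst, final int keep) {
        if (keep < 0) {
            throw new IllegalArgumentException("keep must be >= 0");
        }
        // Everything before this index is older than the `keep` newest items.
        final int deleteCount = Math.max(0, oldestFirst.size() - keep);
        // Copy the sublist so later changes to the source list do not affect the result.
        return Collections.unmodifiableList(
                new ArrayList<T>(oldestFirst.subList(0, deleteCount)));
    }
}
```

Each returned path could then be passed to `Files.deleteIfExists`. Note that the selection reflects the timestamps as they were when the list was built, so a file modified afterwards may still be selected.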

Java 7

private static List<File> listFilesOldestFirst(final String directoryPath) throws IOException {
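    final File[] listed = new File(directoryPath).listFiles();
    if (listed == null) {  // listFiles() returns null if the path is not a directory or an I/O error occurs
        throw new IOException("Cannot list files in " + directoryPath);
    }
    final List<File> files = Arrays.asList(listed);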
    final Map<File, Long> constantLastModifiedTimes = new HashMap<File,Long>();
    for (final File f : files) {
        constantLastModifiedTimes.put(f, f.lastModified());
    }
    Collections.sort(files, new Comparator<File>() {
        @Override
        public int compare(final File f1, final File f2) {
            return constantLastModifiedTimes.get(f1).compareTo(constantLastModifiedTimes.get(f2));
        }
    });
    return files;
}
Matthew Madson

My question is, how would you solve this problem and delete the right files?

I suggest you take the timestamps of all the files and cache them. Sort using these cached times. This way they will be sorted using a consistent timing.

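File[] files = ...
final Map<File, Long> modified = new HashMap<File, Long>();
for (File file : files)
    modified.put(file, file.lastModified());  // read each timestamp exactly once
Arrays.sort(files, new Comparator<File>() {
    @Override
    public int compare(File a, File b) {
        return modified.get(a).compareTo(modified.get(b));  // compares cached values only
    }
});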
Peter Lawrey
  • I think TreeSet with a comparator based on LastModifiedTime will be a better fit for this problem. http://docs.oracle.com/javase/7/docs/api/java/util/TreeSet.html – Salil Dec 06 '13 at 21:04
  • I thought about that (first idea in my question). This would prevent the error from occurring but could delete the wrong file if it's modified during the sort. – Wei Dec 06 '13 at 21:40
  • @Wei You always run the risk that you might want to modify a file after you delete it. In your case, you can check the files again just before you delete them. – Peter Lawrey Dec 07 '13 at 10:22
  • @Salil TreeSet doesn't allow duplicates, and some files may be modified in the same millisecond. – Peter Lawrey Dec 07 '13 at 10:22
  • Good example on how to do this is provided here: http://stackoverflow.com/a/4248059/314089 – icyerasor Jan 28 '15 at 19:00
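
The "cache the timestamps, then check again just before deleting" idea from the answer and comments above might be sketched as follows. This is only an illustration under stated assumptions: the class `SnapshotPruner`, the method `prune`, and the parameter `keep` are hypothetical names, and the sketch assumes that only regular files (not subdirectories) should be considered. It narrows the race window but cannot close it, since a file can still be modified between the re-check and the call to `delete`.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class SnapshotPruner {
    private SnapshotPruner() {}

    /**
     * Deletes all but the `keep` newest regular files in `directory`, skipping
     * any file whose timestamp changed after the snapshot was taken.
     * Returns the files that were actually deleted.
     */
    static List<File> prune(final File directory, final int keep) {
        final File[] listed = directory.listFiles();
        if (listed == null) {  // not a directory, or an I/O error occurred
            return Collections.emptyList();
        }
        // Snapshot: read every timestamp exactly once so the comparator stays consistent.
        final Map<File, Long> snapshot = new HashMap<File, Long>();
        final List<File> files = new ArrayList<File>();
        for (final File f : listed) {
            if (f.isFile()) {  // assumption: subdirectories are ignored
                files.add(f);
                snapshot.put(f, f.lastModified());
            }
        }
        // Sort oldest first using only the cached values.
        Collections.sort(files, new Comparator<File>() {
            @Override
            public int compare(final File a, final File b) {
                return snapshot.get(a).compareTo(snapshot.get(b));
            }
        });
        final List<File> deleted = new ArrayList<File>();
        final int deleteCount = Math.max(0, files.size() - keep);
        for (final File f : files.subList(0, deleteCount)) {
            // Re-check: if the file was touched since the snapshot, leave it alone.
            if (f.lastModified() == snapshot.get(f).longValue() && f.delete()) {
                deleted.add(f);
            }
        }
        return deleted;
    }
}
```

A file that is skipped because it changed would simply be picked up, with its new timestamp, the next time the cleanup runs.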