I am searching for an efficient way to iterate over thousands of files in one or more directories.
The only way to iterate over files in a directory seems to be the File.list*() functions. These functions effectively load the entire list of files into some sort of collection and then let the user iterate over it. This seems impractical in terms of time and memory consumption. I tried looking at commons-io and other similar tools, but they all ultimately call File.list*() somewhere inside. JDK7's walkFileTree() came close, but I don't have control over when to pick the next element.
I have over 150,000 files in a directory, and after many -Xms/-Xmx trial runs I got rid of the out-of-memory issues. But the time it takes to fill the array hasn't changed.
I wish to make some sort of Iterable class that uses opendir()/closedir()-like functions to lazily load file names as required. Is there a way to do this?
Update:
Java 7 NIO.2 supports file iteration via java.nio.file.DirectoryStream. It is an Iterable class. As for JDK6 and below, the only option is the File.list*() methods.
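
For reference, here is a minimal sketch of how DirectoryStream can be used on Java 7 or later. The class name LazyDirectoryListing, the method visitEntries and its limit parameter are made up for illustration. The sketch assumes the stream reads entries incrementally as the loop advances, which is how it is meant to behave for large directories, though the exact buffering is up to the platform.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the JDK.
class LazyDirectoryListing {

    /**
     * Visits at most `limit` entries of `dir` and returns how many were visited.
     * Stopping early means the remaining entries are never requested from the stream.
     */
    static int visitEntries(Path dir, int limit) throws IOException {
        int visited = 0;
        // try-with-resources closes the underlying directory handle,
        // roughly the counterpart of closedir().
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                if (visited >= limit) {
                    break;
                }
                System.out.println(entry.getFileName());
                visited++;
            }
        }
        return visited;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current directory when no argument is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        visitEntries(dir, 10);
    }
}
```

If more control is needed over when the next element is fetched, stream.iterator() can be called once and driven by hand with hasNext()/next(), as long as the stream stays open while it is in use.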