Short answer: best solution is the one that you have dismissed:
- Use `File.listFiles()` (or equivalent) to iterate over the files in each directory.
- Use recursion for nested directories.
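A minimal sketch of that approach might look like the following. The "upload" step here is just a placeholder that collects paths; a real implementation would send each file to the remote server at that point:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class DirWalker {
    // Recursively collect all regular files under 'dir'.
    static void walk(File dir, List<File> out) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a directory, or an I/O error occurred
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                walk(entry, out);   // recurse into nested directories
            } else {
                out.add(entry);     // placeholder for the actual upload
            }
        }
    }

    public static void main(String[] args) {
        List<File> files = new ArrayList<>();
        walk(new File(args.length > 0 ? args[0] : "."), files);
        System.out.println(files.size() + " files found");
    }
}
```

Note that `listFiles()` returns `null` (rather than throwing) when the argument is not a readable directory, so the null check is not optional.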
Let's start with the performance issue. When you are uploading a large number of individual files to cloud storage, the performance bottleneck is likely to be the network and the remote server:
Unless you have an extraordinarily good network link, a single TCP stream won't transfer data anywhere near as fast as it can be read from disk (or written at the other end).
Each time you transfer a file, there is likely to be a per-file overhead for starting the new file. The remote server has to create the new file, which entails adding a directory entry, an inode to hold the metadata, etc.
Even on the sending side, the OS and disk overheads of reading directories and metadata are likely to dominate the Java overheads.
(But don't just trust what I say ... measure it!)
The chances are that the above overheads will be orders of magnitude greater than you can get by tweaking the Java-side file traversal.
But ignoring the above, I don't think that using the Java 8 `Stream` paradigm would help anyway. AFAIK, there are no special high-performance "adapters" for applying streams to directory entries, so you would most likely end up with a `Stream` wrapper for the result of `listFiles()` calls. And that would not improve performance.
(You might get some benefit from parallel streams, but I don't think you will get enough control over the parallelism.)
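For comparison, a stream-based traversal would look something like this sketch using `java.nio.file.Files.walk` (which, under the hood, still reads directory entries sequentially, so there is no inherent performance win over recursive `listFiles()`):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

public class StreamWalk {
    // Count regular files below 'root' using the Java 8 stream API.
    static long countFiles(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        // .parallel()  // possible, but gives you little control
                        //              // over the degree of parallelism
                        .count();
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(countFiles(root) + " files");
    }
}
```

The `try`-with-resources around `Files.walk` matters: the returned stream holds open directory handles until it is closed.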
Furthermore, you would need to deal with the fact that if your Java 8 `Stream` produces `InputStream` or similar handles, then you need to make sure that those handles are properly closed. You can't just close them all at the end, or rely on the GC to finalize them. If you do either of those, you risk running out of file descriptors.