For high performance computing applications with parallel I/O on Lustre file systems, does file-per-process output give the upper limit on performance?

I had always used HDF5, assuming it was some sort of high performance library, until I realized how terrible its parallel I/O performance was compared to file-per-process for my specific application. Sure, file-per-process is not as beautiful, and may require some (cheap) postprocessing to get the output into a useful format, but after wasting so much time trying to optimize HDF5 and getting terrible performance in the end, I am wondering why anyone would use such a library for parallel I/O in high performance computing.

What is wrong with file-per-process output, and why is it common to discourage it? For bandwidth, is there any way to beat it?
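To be concrete, by "file-per-process" I mean something like the following minimal sketch (assuming MPI and plain stdio output; the filename pattern and buffer size are only illustrative):

```c
/* File-per-process sketch: every rank writes its own file, no coordination. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank fills its own buffer (2^20 doubles = 8 MiB here, illustrative). */
    size_t n = 1 << 20;
    double *buf = malloc(n * sizeof(double));
    for (size_t i = 0; i < n; ++i)
        buf[i] = (double)rank;

    /* ...and writes it to its own file, named after the rank. */
    char fname[64];
    snprintf(fname, sizeof fname, "output_rank%06d.bin", rank);
    FILE *f = fopen(fname, "wb");
    fwrite(buf, sizeof(double), n, f);
    fclose(f);

    free(buf);
    MPI_Finalize();
    return 0;
}
```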
IDK what you call "file-per-process", but performance depends on the target file system and on the I/O library the application uses. You should not use the default NFS file system in your home directory, as it does not scale and slows down all other users. The node-local temporary file system should always scale better than the parallel one, since each machine stores data locally (typically in memory) in that case (but this is not safe). – Jérôme Richard Oct 17 '22 at 08:47
1 Answer
Optimizing I/O is very code dependent, and even use-case dependent with the same code.
Generally speaking, the file-per-process approach is discouraged when the number of files created becomes large, since it can introduce latency on the MDS (metadata server). The other problem is the inode quota: on a supercomputer, a user cannot create an unlimited number of files. Still, below a certain number of processes it can achieve higher I/O throughput than parallel I/O to a shared file.
I recommend you take a look at this documentation: PRACE parallel I/O, sections 2 and 3.
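For comparison, here is a minimal sketch of the "parallel" alternative: a single shared file written collectively through HDF5's MPI-IO driver. The property-list calls are the standard parallel-HDF5 API; the dataset name, sizes and 1-D layout are only illustrative:

```c
/* Collective write of one shared HDF5 file: each rank writes its hyperslab. */
#include <mpi.h>
#include <hdf5.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    hsize_t local_n = 1 << 20;                 /* elements per rank, illustrative */
    double *buf = malloc(local_n * sizeof(double));
    for (hsize_t i = 0; i < local_n; ++i) buf[i] = (double)rank;

    /* Open one shared file through the MPI-IO driver. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fapl_mpio(fapl, MPI_COMM_WORLD, MPI_INFO_NULL);
    hid_t file = H5Fcreate("shared.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* One global 1-D dataset; each rank selects its contiguous slab. */
    hsize_t global_n = local_n * (hsize_t)nprocs;
    hid_t filespace = H5Screate_simple(1, &global_n, NULL);
    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_DOUBLE, filespace,
                            H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    hsize_t offset = local_n * (hsize_t)rank;
    H5Sselect_hyperslab(filespace, H5S_SELECT_SET, &offset, NULL, &local_n, NULL);
    hid_t memspace = H5Screate_simple(1, &local_n, NULL);

    /* Collective transfer is usually the first knob to try when tuning. */
    hid_t dxpl = H5Pcreate(H5P_DATASET_XFER);
    H5Pset_dxpl_mpio(dxpl, H5FD_MPIO_COLLECTIVE);
    H5Dwrite(dset, H5T_NATIVE_DOUBLE, memspace, filespace, dxpl, buf);

    H5Pclose(dxpl); H5Sclose(memspace); H5Sclose(filespace);
    H5Dclose(dset); H5Pclose(fapl); H5Fclose(file);
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Note that on Lustre the shared-file approach usually also needs the output directory striped over several OSTs (for example `lfs setstripe -c -1 <dir>`), since the default stripe count is often 1 and all ranks would then funnel through a single OST, making HDF5 look much slower than it has to be.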

Squirrel