I'm trying to write several long distributed arrays to a single file using MPI-I/O (OpenMPI implementation) with a shared file pointer. I get the following error messages:
lseek:Invalid argument
WRITE FAILED
I prepared a simplified code snippet that reproduces the issue:
long long globalUpperBnd = 2200000000; // total bytes per write, more than the maximum value of int
long long average = globalUpperBnd / commSize;
// the last rank also takes the remainder
long long length = (commRank == commSize - 1) ? globalUpperBnd - (average * commRank) : average;
char *buf = new char[length];
... // fill the buffer
MPI_File file;
MPI_File_open(comm, "test.bin", MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &file);
MPI_File_set_view(file, 0, MPI_BYTE, MPI_BYTE, "native", MPI_INFO_NULL);
// first collective write: succeeds (the count argument is an int, so length must fit in int)
MPI_File_write_ordered(file, buf, length, MPI_BYTE, MPI_STATUS_IGNORE);
// second collective write: this is where I get the error messages above
MPI_File_write_ordered(file, buf, length, MPI_BYTE, MPI_STATUS_IGNORE);
MPI_File_close(&file);
delete[] buf;
It looks like MPI_Offset is just an int, and the second call to MPI_File_write_ordered causes MPI_Offset to overflow, so the offset becomes negative.
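To illustrate the overflow I suspect, here is a minimal sketch in plain C++ (no MPI involved). It assumes the shared file pointer is held somewhere in a 32-bit signed integer, which is only my guess and not something I have confirmed in the OpenMPI sources. The helper names wrap_to_int32, exceeds_int32 and check_two_writes are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not an MPI call): the value a byte offset would take
// if it were stored in a 32-bit signed integer with two's-complement wraparound.
std::int64_t wrap_to_int32(std::int64_t offset) {
    const std::int64_t range = std::int64_t{1} << 32;  // 2^32
    std::int64_t r = offset % range;
    if (r < 0) {
        r += range;  // bring the remainder into [0, 2^32)
    }
    // values above INT32_MAX map onto the negative half of the range
    return (r > std::numeric_limits<std::int32_t>::max()) ? r - range : r;
}

// True if the offset cannot be represented in a 32-bit signed integer.
bool exceeds_int32(std::int64_t offset) {
    return offset > std::numeric_limits<std::int32_t>::max() ||
           offset < std::numeric_limits<std::int32_t>::min();
}

// Walks through the state of the shared file pointer before the second write.
void check_two_writes() {
    const std::int64_t bytesPerCall = 2200000000LL;  // globalUpperBnd from the snippet
    const std::int64_t afterFirstWrite = bytesPerCall;  // where the second write starts

    // The starting offset of the second write no longer fits in 32 bits...
    assert(exceeds_int32(afterFirstWrite));
    // ...and under the 32-bit assumption it would be seen as a negative offset.
    assert(wrap_to_int32(afterFirstWrite) == -2094967296LL);
}
```

Under that assumption, the second write would start at offset -2094967296, which would be consistent with lseek reporting "Invalid argument".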
Interestingly, the same amount of data can be written successfully by multiplying globalUpperBnd by 2 and calling MPI_File_write_ordered only once. So it looks like MPI_File_write_ordered somehow avoids the offset overflow within a single call.
I use a 64-bit build of the OpenMPI library.
Is there any workaround for this case?