TLDR
A for loop hangs when I create files in parallel (see the code below). Why? Also, what is a safe/efficient way to write to multiple binary files when the file pointer and offset are determined by the iteration variable?
Context and questions:
What I would like my code to do is the following:
(1) All processes read a single binary file containing a matrix of doubles -> already achieved this using MPI_File_read_at()
(2) For each 'column' of input data, perform calculations using the numbers in each 'row', and save the data for each column into its own binary output file ("File0.bin" -> column 0)
(3) To let the user specify an arbitrary number of processes, I use simple indexing to treat the matrix as one long (rows)X(cols) vector and split that vector across the processes, so each process gets (rows)X(cols)/tot_proc entries to handle. Under this approach the columns are not divided neatly among the processes, so each process must open whichever file(s) its entries fall into and, using the proper offsets, write to the correct section of the correct file (see the sketch after this list). At the moment it does not matter that the resulting files will be written in fragments.
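To make the indexing concrete, here is a minimal sketch (untested; the function and variable names are mine, not from the real code) of how I picture mapping one flat index k in [0, rows*cols) to an output file and a byte offset within it, assuming each group of cols consecutive entries belongs to one file, so that file i holds cols doubles as in the toy program below:

#include <mpi.h>

//Sketch only: map flat index k to (file, byte offset), assuming each
//group of 'cols' consecutive entries goes to one output file
void flat_to_file(int k, int cols, int& file_id, MPI_Offset& byte_off)
{
    file_id  = k / cols;                                //-> file number ("Myout<file_id>.bin" in the toy program)
    byte_off = (MPI_Offset)(k % cols) * sizeof(double); //position inside that file
}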
As I work toward that goal, I have written a short program that creates binary files in a loop, but the loop hangs on the very last iteration (13 files divided over 4 processes). The number of files to create equals (rows).
Question 1: Why does this code hang at the very end of the loop? In my toy example with 4 processes, id_proc 1-3 each have 3 files to create, while id_proc 0 (the root process) has 4. The loop hangs when the root process tries to create its 4th file. Note: I'm compiling this on a laptop running Ubuntu, using mpic++.
Question 2: Eventually I will add a second for loop just like the one below, except that in this loop each process must write to the appropriate sections of the binary files that have already been created. I plan to use MPI_File_write_at() for this, but I have also read that the files should be statically sized using MPI_File_set_size(), and that every process should then have its own view of the file via MPI_File_set_view(). So my question is: in order for this to work, should I do the following?
(Loop 1) MPI_File_open(..., MPI_MODE_WRONLY | MPI_MODE_CREATE, ...), MPI_File_set_size(), MPI_File_close()
(Loop 2) MPI_File_open(..., MPI_MODE_WRONLY, ...), MPI_File_set_view(), MPI_File_write_at(), MPI_File_close()
Loop 2 seems like it will be slowed by having to open and close files on each iteration, but I do not know in advance how much input data the user will provide or how many processes will be requested. For example, process N might need to write to the end of file 1, the middle of file 2, and the end of file 8. In principle, all of that can be handled with offsets. What I don't know is whether MPI allows this level of flexibility; a sketch of the sequence I have in mind follows.
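Here is a minimal sketch of the two-loop sequence I'm considering (untested; it assumes each rank may open its own files independently, hence MPI_COMM_SELF, and it reuses loop_min, loop_max, cols, and output from the program below; for simplicity each rank writes a whole file at offset 0, whereas the real code would compute per-entry offsets):

MPI_File fh;
stringstream ss;
string name;

//Loop 1: create and pre-size the files this rank is responsible for
for(int i = loop_min; i < loop_max; i++)
{
    ss << i;
    name = "Myout" + ss.str() + ".bin";
    ss.clear();
    ss.str(string());
    MPI_File_open(MPI_COMM_SELF, name.c_str(), MPI_MODE_WRONLY | MPI_MODE_CREATE, MPI_INFO_NULL, &fh);
    MPI_File_set_size(fh, (MPI_Offset)cols*sizeof(double));
    MPI_File_close(&fh);
}

//Loop 2: reopen each file and write at an explicit byte offset; with the
//default file view, MPI_File_write_at() takes offsets in bytes from the
//start of the file, so no MPI_File_set_view() call appears necessary here
for(int i = loop_min; i < loop_max; i++)
{
    ss << i;
    name = "Myout" + ss.str() + ".bin";
    ss.clear();
    ss.str(string());
    MPI_File_open(MPI_COMM_SELF, name.c_str(), MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at(fh, 0, &output[i*cols], cols, MPI_DOUBLE, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);
}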
Code attempting to create multiple files in parallel:
#include <iostream>
#include <cstdlib>
#include <stdio.h>
#include <vector>
#include <fstream>
#include <string>
#include <sstream>
#include <cmath>
#include <sys/types.h>
#include <sys/stat.h>
#include <mpi.h>
using namespace std;
int main(int argc, char** argv)
{
    //Variable declarations
    string oname;
    stringstream temp;
    int rows = 13, cols = 7, sz_dbl = sizeof(double);
    //each binary file will eventually have cols*sz_dbl = 7*sz_dbl bytes
    int id_proc, tot_proc, loop_min, loop_max;
    vector<double> output(rows*cols,1.0);//data to write

    //MPI routines
    MPI_Init(&argc,&argv);//initialize MPI
    MPI_Comm_rank(MPI_COMM_WORLD,&id_proc);//get "this" process's id#/rank
    MPI_Comm_size(MPI_COMM_WORLD,&tot_proc);//get the number of processes

    //MPI loop variable assignments: block distribution of the rows files,
    //with one extra file for each of the first (rows % tot_proc) ranks
    loop_min = id_proc*(rows/tot_proc) + min(rows % tot_proc, id_proc);
    loop_max = loop_min + rows/tot_proc + (rows % tot_proc > id_proc);
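    //worked example: rows = 13, tot_proc = 4 -> rank 0 gets i in [0,4)
    //(4 files), while ranks 1-3 get [4,7), [7,10), [10,13) (3 files each),
    //so the ranks execute different numbers of loop iterations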
    //File handle
    MPI_File outfile;

    //Create binary files in parallel
    for(int i = loop_min; i < loop_max; i++)
    {
        temp << i;
        oname = "Myout" + temp.str() + ".bin";
        MPI_File_open(MPI_COMM_WORLD, oname.c_str(), MPI_MODE_WRONLY | MPI_MODE_CREATE, MPI_INFO_NULL, &outfile);
        temp.clear();
        temp.str(string());
        MPI_File_close(&outfile);
    }
    MPI_Barrier(MPI_COMM_WORLD);//with or without this, same error
    MPI_Finalize();//MPI - end mpi run
    return 0;
}
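For reference, I compile and run the program like this (the source file name is illustrative):

mpic++ parfiles.cpp -o parfiles    # compile with the MPI C++ wrapper
mpirun -np 4 ./parfiles            # run with 4 processes, as in the toy example above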
Tutorial/information pages I've read so far:
http://beige.ucs.indiana.edu/B673/node180.html
http://beige.ucs.indiana.edu/B673/node181.html
http://mpi-forum.org/docs/mpi-2.2/mpi22-report/node305.htm
https://www.open-mpi.org/doc/v1.4/man3/MPI_File_open.3.php
http://www.mcs.anl.gov/research/projects/mpi/mpi-standard/mpi-report-2.0/node215.htm