I am using OpenMP and MPI to parallelize some matrix operations in C. Some of the functions that operate on the matrix are written in Fortran. These Fortran functions require a buffer array to be passed in, which is used only internally by the function. Currently I allocate a buffer in each parallel section, as in the code below.
int i = 0;
int n = 1024; // Actually this is read from command line
double **a = createNbyNMat(n);
#pragma omp parallel
{
    // Per-thread scratch buffer, used only inside the Fortran routine
    double *buf = malloc(sizeof(double) * n);
    #pragma omp for
    for (i = 0; i < n; i++)
    {
        fortranFunc1_(a[i], &n, buf);
    }
    free(buf);
}
// Serial code and moving data around in the matrix a using MPI
#pragma omp parallel
{
    // The same scratch buffer is allocated again for this region
    double *buf = malloc(sizeof(double) * n);
    #pragma omp for
    for (i = 0; i < n; i++)
    {
        fortranFunc2_(a[i], &n, buf);
    }
    free(buf);
}
// and repeat a few more times.
I know I can avoid reallocating the buffers with something like the code below, but I am curious whether there is an easier way, or some built-in OpenMP functionality, for handling this. Ideally the code would compile without a lot of compiler directives, whether or not OpenMP is available on the system we are compiling for.
double **buf;
buf = malloc(sizeof(double*) * num_openmp_threads);
int i = 0;
for (i = 0; i < num_openmp_threads; ++i)
{
    buf[i] = malloc(sizeof(double) * n);
}
// skip ahead
#pragma omp for
for (i = 0; i < n; i++)
{
    fortranFunc1_(a[i], &n, buf[current_thread_num]);
}
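
For concreteness, here is a minimal, self-contained sketch of that per-thread buffer approach. Everything in it beyond the snippets above is an assumption: scratch_func is a hypothetical stand-in for the Fortran routines, the helper names are made up for illustration, and the _OPENMP guard with serial fallbacks is just one possible way to keep the file compiling when OpenMP is not available. It also assumes the thread count is not raised after the pool has been allocated.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the rest of the file needs no further #ifdefs. */
static int omp_get_max_threads(void) { return 1; }
static int omp_get_thread_num(void)  { return 0; }
#endif

/* Hypothetical stand-in for fortranFunc1_/fortranFunc2_:
   doubles every element of the row, using buf as scratch space. */
static void scratch_func(double *row, const int *n, double *buf)
{
    for (int j = 0; j < *n; j++) buf[j] = 2.0 * row[j];
    for (int j = 0; j < *n; j++) row[j] = buf[j];
}

/* Allocate one n-element scratch buffer per thread (hypothetical helper).
   Returns NULL if any allocation fails. */
static double **alloc_thread_bufs(int nthreads, int n)
{
    double **bufs = malloc(sizeof *bufs * (size_t)nthreads);
    if (bufs == NULL) return NULL;
    for (int t = 0; t < nthreads; t++) {
        bufs[t] = malloc(sizeof **bufs * (size_t)n);
        if (bufs[t] == NULL) {
            /* Roll back the buffers allocated so far. */
            while (t-- > 0) free(bufs[t]);
            free(bufs);
            return NULL;
        }
    }
    return bufs;
}

/* Release the pool created by alloc_thread_bufs. */
static void free_thread_bufs(double **bufs, int nthreads)
{
    for (int t = 0; t < nthreads; t++) free(bufs[t]);
    free(bufs);
}

/* One parallel pass over the rows of a. Each thread picks its own
   buffer by thread number, so nothing is allocated inside the region.
   Assumes bufs holds at least as many buffers as threads in the team. */
static void run_pass(double **a, int n, double **bufs)
{
    assert(bufs != NULL);
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        scratch_func(a[i], &n, bufs[omp_get_thread_num()]);
    }
}
```

The pool would be created once with alloc_thread_bufs(omp_get_max_threads(), n), reused by every run_pass call between the serial and MPI steps, and released once at the end with free_thread_bufs. Without OpenMP the pragma is ignored and the fallbacks reduce this to a single buffer used serially.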