I've been working on a C++ code for quantum chemistry, atomic and molecular tasks, which involves a lot of work with arrays (1D, 2D, 3D, etc.), and I have an entire class called array
to handle this. Of course, from the very beginning the most fundamental member functions are those that dynamically allocate memory for these arrays, resize them, or delete them. One of them contains a loop like this:
// allocate the array of row pointers
data = new double **[row]();

// split the per-row allocations across the threads of the enclosing parallel region
#pragma omp for schedule(static) nowait
for(unsigned int i = 0; i < row; ++i)
{
    data[i] = new double *[column]();
}
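For context, a simplified, self-contained sketch of what this allocation routine boils down to would be the following (depth is just a placeholder name for the innermost dimension; in the real class row, column, etc. are member variables and the code is spread across member functions):

double ***allocate3d(unsigned int row, unsigned int column, unsigned int depth)
{
    double ***data = new double **[row]();   // row pointers, zero-initialised

    #pragma omp parallel
    {
        #pragma omp for schedule(static) nowait
        for(unsigned int i = 0; i < row; ++i)
        {
            data[i] = new double *[column]();         // column pointers for row i
            for(unsigned int j = 0; j < column; ++j)
                data[i][j] = new double[depth]();     // innermost doubles, zeroed
        }
    }   // implicit barrier here, at the end of the parallel region

    return data;
}

In this sketch, each iteration of the parallelized outer loop ends up doing column + 1 calls to new.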
Now what I am doing is speeding up these routines with OpenMP. For most of the routines I've been using the schedule(static) and nowait
clauses to divide my loops into chunks of roughly step/threads iterations, since those chunks take almost the same time to be handled by their threads.
But for loops like the one above, with several calls to the new
operator, I have the (bad) feeling that the chunks of these loops do not take the same time to execute on their threads, in the sense that I should consider applying schedule(dynamic, chunk_size)
instead.
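Concretely, I mean changing the loop above to something like this (the chunk size of 4 is just an arbitrary example value that I would still have to tune):

#pragma omp for schedule(dynamic, 4) nowait   // rows are handed out to threads 4 at a time, on demand
for(unsigned int i = 0; i < row; ++i)
{
    data[i] = new double *[column]();
}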
Do you agree? Dynamic allocation is not such a simple task and can be expensive, so chunks of dynamic allocations could differ in their execution times.
Actually, I am also not sure whether I am making some kind of mistake concerning heap fragmentation or things like that. Advice would be welcome.
PS: I am using the nowait
clause to try to minimize the bottleneck of the implicit barriers at the end of the worksharing loops.
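Schematically, the pattern I have in mind is two worksharing loops back to back in the same parallel region, with nowait on the first one (again, depth is just a placeholder):

#pragma omp parallel
{
    // first worksharing loop: allocate the column pointers of each row
    #pragma omp for schedule(static) nowait
    for(unsigned int i = 0; i < row; ++i)
        data[i] = new double *[column]();

    // because of nowait, threads do not wait for each other here; with the
    // same static schedule each thread is assigned the same rows again, so it
    // only dereferences data[i] entries it has already allocated itself
    #pragma omp for schedule(static)
    for(unsigned int i = 0; i < row; ++i)
        for(unsigned int j = 0; j < column; ++j)
            data[i][j] = new double[depth]();
}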