I wish to OpenMP reduce a large array into a smaller dynamic array. For example, where
large[] = {1, 1, 1, 2, 2, 2, 3, 3, 3};
// OpenMP reduce length-3 sublists of large, to produce:
small[] = {3, 6, 9};
My requirements are similar to this question, but I have important additional constraints:
- must support OpenMP 3.1 (so I cannot use OpenMP 4.5's array reductions, as this answer does)
- cannot have a thread-private copy of
small
(like this answer), sincesmall
can be just as large aslarge
, and hence thread-private copies may cause a stack overflow large
array must be iterated in order, for good caching (since it may be very large)
Some other details:
- the summed elements of
large
are not necessarily adjacent - this reduction happens within a function. The
small
(output) array is pre-allocated by the user, and passed by reference. Ideally, the function should not allocate a new local copy ofsmall
- both
small
andlarge
have lengths of a power of2
(e.g.2
,4
,8
,16
...). Every element ofsmall
is reduced from the same number oflarge
elements (another power of 2).
Here is example serial psuedocode, for clarity:
void myfunc(double* small, double* large, int lLen, // other args) {
for (int i=0; i<lLen; i++) {
int sInd = // determined by other args
small[sInd] += large[i];
}
}
Here is an example implementation, disqualified since it makes use of OpenMP 4.5's array reduction. It furthermore undesirably makes use of a local copy of small
(as required to use reduction
).
void myfunc(double* small, int sLen, double* large, int lLen, // other args) {
double smallLocal[sLen];
int i, sInd;
#pragma omp parallel private (i) for
for (i=0; i<sLen; s++)
smallLocal[i] = 0;
#pragma omp parallel private (i, sInd) reduction (+:smallLocal) for
for (i=0; i<largeLen; i++) {
sInd = // determined by other args
smallLocal[sInd] += large[i];
}
#pragma omp parallel private (i) for
for (i=0; i<sLen; s++)
small[i] = smallLocal[i];
}
I'm hoping I can achieve the same thing in OpenMP 3.1, possibly avoiding even the local copy of smallLocal
, by managing the array element reductions myself. How can I do this?