I came across some OpenMP code that used the `collapse` clause, which was new to me. I'm trying to understand what it means, but I don't think I have fully grasped its implications. One definition that I found is:
COLLAPSE: Specifies how many loops in a nested loop should be collapsed into one large iteration space and divided according to the schedule clause. The sequential execution of the iterations in all associated loops determines the order of the iterations in the collapsed iteration space.
I thought I understood what that meant, so I tried the following simple program:
    int i, j;
    #pragma omp parallel for num_threads(2) private(j)
    for (i = 0; i < 4; i++)
        for (j = 0; j <= i; j++)
            printf("%d %d %d\n", i, j, omp_get_thread_num());
Which produced
0 0 0
1 0 0
1 1 0
2 0 0
2 1 0
2 2 1
3 0 1
3 1 1
3 2 1
3 3 1
I then added the `collapse(2)` clause. I expected to have the same result in the first two columns but now have an equal number of `0`'s and `1`'s in the last column.
But I got
0 0 0
1 0 0
2 0 1
3 0 1
So my questions are:
- What is happening in my code?
- Under what circumstances should I use `collapse`?
- Can you provide an example that shows the difference between using `collapse` and not using it?