I am trying to understand why OpenMP works the way it does in the following example.
    #include <omp.h>
    #include <iostream>
    #include <vector>
    #include <stdlib.h>

    void AddVectors (std::vector< double >& v1,
                     std::vector< double >& v2) {
        size_t i;
    #pragma omp parallel for private(i)
        for (i = 0; i < v1.size(); i++) v1[i] += v2[i];
    }

    int main (int argc, char** argv) {
        size_t N1 = atoi(argv[1]);
        std::vector< double > v1(N1,1);
        std::vector< double > v2(N1,2);
        for (size_t i = 0; i < N1; i++) AddVectors(v1,v2);
        return 0;
    }
I first compiled the code above without OpenMP (by omitting -fopenmp from the compiler flags). The execution time for N1 = 10000 was 0.1 s. With OpenMP enabled, the execution time exceeds 1 minute; I stopped the run before it finished.
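For anyone who wants to reproduce the comparison from inside the program rather than with an external timer, here is a minimal sketch. It assumes C++11's std::chrono is available, and the helper name TimeRepeatedAdds is hypothetical; it is not how the numbers above were obtained.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Same kernel as in the question. The pragma is ignored when the file is
// compiled without -fopenmp, so the same source gives the serial baseline.
void AddVectors(std::vector<double>& v1, const std::vector<double>& v2) {
    #pragma omp parallel for
    for (std::size_t i = 0; i < v1.size(); i++) v1[i] += v2[i];
}

// Hypothetical helper (not part of the original program): calls AddVectors
// `repeats` times and returns the elapsed wall-clock time in seconds.
double TimeRepeatedAdds(std::vector<double>& v1,
                        const std::vector<double>& v2,
                        std::size_t repeats) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t r = 0; r < repeats; r++) AddVectors(v1, v2);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling TimeRepeatedAdds with two vectors of length 10000 and repeats = 10000 mirrors the run described above, once built with -fopenmp and once without.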
I am compiling the code as below:
    g++ -std=c++0x -O3 -funroll-loops -march=core2 -fomit-frame-pointer -Wall -fno-strict-aliasing -o main.o -c main.cpp
    g++ main.o -o main
Not all of these flags are necessary here, but I use them in the project I'm trying to parallelize, so I left them in. To enable OpenMP, I add -fopenmp when compiling.
Does anybody know what's going wrong? Thank you!