1

I am trying to parallelize an operation using pthreads. The process looks something like:

double* doSomething( .... )  {   
 double* foo;   
 foo = new double[220];  

 for(i = 0; i<20; i++)  
 {  
  //do something with the elements in foo located between 10*i and 10*(i+2)  
 }  

 return foo; 
}

The stuff happening inside the for-loop can be done in any order, so I want to organize this using threads.

For instance, I could use a number of threads such that each thread goes through parts of the for-loop, but works on different parts of the array. To avoid trouble when working on overlapping parts, i need to lock some memory.

How can I make a mutex (or something else) that locks only part of the array?

Paal
  • 11
  • 1
  • 2
  • Did you really mean for the data sections to be overlapping? [0,20), [10,30), [20,40), ..., [190,210)? Or did you mean non-overlapping: for(i=0; i<20; ++i) { doSomething(foo, i*20, (i+1)*20);} which gives [0,20), [20,40), ..., [200,220). – Jesse Chisholm Feb 26 '14 at 00:38

3 Answers3

1

If you are using latest gcc you can try parallel versions of standard algorithms. See the libstdc++ parallel mode.

Maxim Egorushkin
  • 131,725
  • 17
  • 180
  • 271
1

If you just want to make sure that a section of the array is worked once...

Make a global variable:

int _iNextSection;  

Whenever a thread gets ready to operate on a section, the thread gets the next available section this way:

iMySection = __sync_fetch_and_add(&_iNextSection, 1);

__sync_fetch_and_add() returns the current value of _iNextSection and then increments _iNextSection. __sync_fetch_and_add() is atomic, which means __sync_fetch_and_add() is guaranteed to complete before another thread can do it. No locking, no blocking, simple, fast.

johnnycrash
  • 5,184
  • 5
  • 34
  • 58
0

If the loop looks exactly like you wrote, I would use an array of 21 mutexes and block in each thread on ith an (i + 1)th mutex on the beginning of the loop.

So something like:

...
for (i = 0; i < 20; i++) {
 mutex[i].lock();
 mutex[i+1].lock();
 ...
 mutex[i+1].unlock();
 mutex[i].unlock();
}

The logic is that only two neighboring loop executions can access same data (if the limits are [i * 10, (i + 2) * 10)), so you only need to worry about them.

Klark
  • 8,162
  • 3
  • 37
  • 61