In C++, I want to create an algorithm with the following structure:
- A sequential part
- A parallel part A
- A sequential part
- A parallel part B
- A sequential part
Using pthreads, I can think of two ways to solve the problem:
- Create N threads for part A and destroy them once part A is finished, then create N new threads for part B.
- Reuse the same threads for part A and part B, coordinating them with one of the available synchronization mechanisms.
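To make the second option concrete, here is a minimal sketch of one way it could look, assuming `pthread_barrier_t` is available (it is an optional POSIX feature; Linux provides it). The names `kThreads`, `kItems`, `worker`, and `run_pipeline`, as well as the toy workload (doubling and then summing an array), are placeholders for illustration, not part of the actual algorithm.

```cpp
#include <pthread.h>
#include <vector>

constexpr int kThreads = 4;     // hypothetical N
constexpr int kItems   = 1000;  // hypothetical problem size

static pthread_barrier_t barrier;          // shared by main thread + workers
static std::vector<long> data(kItems, 0);  // toy shared data
static long partial[kThreads];             // per-thread results of part B

static void* worker(void* arg) {
    const int id    = *static_cast<int*>(arg);
    const int begin = id * kItems / kThreads;
    const int end   = (id + 1) * kItems / kThreads;

    pthread_barrier_wait(&barrier);  // wait until sequential part 1 is done
    for (int i = begin; i < end; ++i) data[i] *= 2;      // parallel part A
    pthread_barrier_wait(&barrier);  // signal that part A is done

    pthread_barrier_wait(&barrier);  // wait until sequential part 2 is done
    long sum = 0;
    for (int i = begin; i < end; ++i) sum += data[i];    // parallel part B
    partial[id] = sum;
    pthread_barrier_wait(&barrier);  // signal that part B is done
    return nullptr;
}

long run_pipeline() {
    // kThreads workers plus the main thread meet at every barrier.
    pthread_barrier_init(&barrier, nullptr, kThreads + 1);

    pthread_t threads[kThreads];
    int ids[kThreads];
    for (int i = 0; i < kThreads; ++i) {
        ids[i] = i;
        pthread_create(&threads[i], nullptr, worker, &ids[i]);
    }

    for (long& x : data) x = 1;      // sequential part 1
    pthread_barrier_wait(&barrier);  // release workers into part A
    pthread_barrier_wait(&barrier);  // wait for part A to finish

    data[0] += 1;                    // sequential part 2
    pthread_barrier_wait(&barrier);  // release workers into part B
    pthread_barrier_wait(&barrier);  // wait for part B to finish

    long total = 0;                  // sequential part 3
    for (int i = 0; i < kThreads; ++i) total += partial[i];

    for (int i = 0; i < kThreads; ++i) pthread_join(threads[i], nullptr);
    pthread_barrier_destroy(&barrier);
    return total;
}
```

The threads are created once and joined once; each transition between a sequential and a parallel part costs one barrier wait instead of N thread creations. Error checking of the pthread calls is omitted for brevity.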
When performance matters, how much overhead does creating new threads in solution 1 (recreating the threads) add? Should I go with solution 1 or with solution 2 (reusing the threads)?
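Since the cost of creating threads depends on the platform, one way to get a rough number is to measure it directly. The following is a minimal sketch under that assumption: the hypothetical helper `average_create_join_us` times repeated create/join cycles of threads that do no work, so it only captures the create-and-destroy cost and not any effect on the real workload.

```cpp
#include <pthread.h>
#include <chrono>
#include <vector>

// Thread body that returns immediately, so only create/join cost is measured.
static void* noop(void*) { return nullptr; }

// Average time in microseconds to create and join `n_threads` threads,
// taken over `rounds` repetitions. Returns -1.0 if a thread could not be created.
double average_create_join_us(int n_threads, int rounds) {
    using clock = std::chrono::steady_clock;
    std::vector<pthread_t> threads(n_threads);

    const auto start = clock::now();
    for (int r = 0; r < rounds; ++r) {
        int created = 0;
        while (created < n_threads &&
               pthread_create(&threads[created], nullptr, noop, nullptr) == 0) {
            ++created;
        }
        // Join only the threads that were actually created.
        for (int i = 0; i < created; ++i) pthread_join(threads[i], nullptr);
        if (created < n_threads) return -1.0;
    }
    const std::chrono::duration<double, std::micro> elapsed = clock::now() - start;
    return elapsed.count() / rounds;
}
```

Calling, for example, `average_create_join_us(N, 1000)` gives an estimate that can be compared with the running time of parts A and B: if the parallel parts run much longer than this figure, recreating the threads is unlikely to matter, whereas for very short parallel parts reusing the threads becomes more attractive.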