I have a simple worker class which takes care of some execution on a thread.
In the simplest form, one object is created for every single thread:
    class worker
    {
    public:
        worker (
            boost::atomic<int> & threads,
            boost::mutex & mutex,
            boost::condition_variable & condition
        )
        : threads__(threads), mutex__(mutex), condition__(condition)
        {}

        void run (
            // some params
        )
        {
            // ... do the threaded work here

            // finally, decrease number of running threads and notify
            boost::mutex::scoped_lock lock(mutex__);
            threads__--;
            condition__.notify_one();
        }

    private:
        boost::atomic<int> & threads__;
        boost::mutex & mutex__;
        boost::condition_variable & condition__;
    };
I use it within a loop which runs at most 8 concurrent threads and waits to be notified when one has finished, in order to spawn the next one:
    boost::thread_group thread_group;
    boost::mutex mutex;
    boost::condition_variable condition;
    boost::atomic<int> threads(0);

    // Some loop which can be parallelised
    for ( const auto & x : list )
    {
        // wait if thread_count exceeds 8 threads
        boost::mutex::scoped_lock lock(mutex);
        while ( threads >= 8 )
            condition.wait( lock );

        // Create worker object
        worker _wrk_( threads, mutex, condition );
        boost::thread * thread = new boost::thread( &worker::run, &_wrk_, /* other params */ );
        thread_group.add_thread( thread );
        threads++;
    }
This works for most of my scenarios, but now I have a thread object which I need to reuse.
The reason is simple: this thread object contains thrust::device_vector<float> vectors,
which are expensive to allocate again every time the object is destroyed and recreated.
Furthermore, those vectors can be reused, as most of their contents won't change.
Therefore, I am looking for a mechanism which could reuse the objects created within the loop. In fact, I would allocate 8 of those objects (or as many as I have concurrent threads) beforehand, and then use them over and over again. What I am hoping can be done is something like this:
    boost::thread_group thread_group;
    boost::mutex mutex;
    boost::condition_variable condition;
    boost::atomic<int> threads(0);

    // our worker objects to be reused; each one is created separately,
    // because the fill constructor would copy a single shared_ptr 8 times
    std::vector<std::shared_ptr<worker>> workers;
    for ( int i = 0; i < 8; ++i )
        workers.push_back( std::make_shared<worker>( threads, mutex, condition ) );

    // Some loop which can be parallelised
    for ( const auto & x : list )
    {
        // wait if thread_count exceeds 8 threads
        boost::mutex::scoped_lock lock(mutex);
        while ( threads >= 8 )
            condition.wait( lock );

        // get next available thread object from the vector
        auto _wrk_ = std::find_if( workers.begin(), workers.end(), is_available() );

        // if we have less than 8 threads but no available thread object
        if ( _wrk_ == workers.end() ) throw std::runtime_error ("...");

        // Use the first available worker object for this thread
        // (_wrk_ is an iterator to a shared_ptr, so pass the raw worker pointer)
        boost::thread * thread = new boost::thread( &worker::run, _wrk_->get() );
        thread_group.add_thread( thread );
        threads++;
    }
First, I don't know how to implement is_available(), other than as a method of the worker class that reports whether the object is currently in use.
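To make the class-method idea concrete, here is a minimal sketch under stated assumptions, not a definitive implementation. It uses the standard library (std::thread, std::mutex, std::condition_variable) in place of Boost so that it is self-contained. The names reusable_worker, busy_, mark_busy(), runs() and run_jobs() are hypothetical, and a simple counter stands in for the real threaded work.

```cpp
#include <algorithm>
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <thread>
#include <vector>

// Hypothetical reusable worker that tracks its own availability.
class reusable_worker
{
public:
    bool is_available() const { return !busy_.load(); }

    // Called by the dispatching loop while it holds the shared mutex.
    void mark_busy() { busy_.store(true); }

    int runs() const { return runs_; }

    void run(std::mutex & mutex, std::condition_variable & condition, int & running)
    {
        ++runs_;  // stand-in for the real work; only the owning thread touches runs_

        // Release the worker and decrement the counter under the same lock,
        // so "running < pool size" always implies that a worker is free.
        std::lock_guard<std::mutex> lock(mutex);
        busy_.store(false);
        --running;
        condition.notify_one();
    }

private:
    std::atomic<bool> busy_{false};
    int runs_ = 0;
};

// Runs `jobs` tasks on at most `pool_size` concurrent threads, reusing the
// same worker objects, and returns the total number of run() calls.
int run_jobs(std::size_t pool_size, int jobs)
{
    std::mutex mutex;
    std::condition_variable condition;
    int running = 0;  // guarded by mutex

    std::vector<std::unique_ptr<reusable_worker>> workers;
    for (std::size_t i = 0; i < pool_size; ++i)
        workers.push_back(std::make_unique<reusable_worker>());

    std::vector<std::thread> threads;
    for (int j = 0; j < jobs; ++j)
    {
        std::unique_lock<std::mutex> lock(mutex);
        condition.wait(lock, [&] { return running < static_cast<int>(pool_size); });

        auto it = std::find_if(workers.begin(), workers.end(),
            [](const std::unique_ptr<reusable_worker> & w) { return w->is_available(); });
        // Not expected to happen: busy_ and running change together under the mutex.
        if (it == workers.end()) throw std::runtime_error("no available worker");

        (*it)->mark_busy();
        ++running;
        threads.emplace_back(&reusable_worker::run, it->get(),
                             std::ref(mutex), std::ref(condition), std::ref(running));
    }
    for (auto & t : threads) t.join();

    int total = 0;
    for (const auto & w : workers) total += w->runs();
    return total;
}
```

Calling run_jobs(8, 100) would dispatch 100 jobs across 8 reused worker objects. The key design choice in this sketch is that the availability flag is only changed while the shared mutex is held, which keeps the find-then-claim step in the loop free of races.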
Second, this seems needlessly complex to me; I'm sure there has to be some other pattern I can use which is simpler and/or more elegant.