I am trying to use multi-threading for encoding with Random Linear Network Coding (RLNC) to increase performance.
However, my multi-threaded solution is much slower than the current single-threaded version. I suspect that the atomic access on m_completed and the std::mutex guarding insertions into m_result are killing my performance, but I do not know how to confirm this.
Some more information: the function completed() is called in a while loop in the main thread, while(!encoder.completed()){}, which results in a huge number of atomic loads. I cannot find a proper way to do this without the atomic or the mutex lock. You can find the code below.
If someone can spot a mistake or guide me towards a better way of doing this, I would be very grateful. I have spent 1.5 weeks trying to figure out what is wrong, and my only idea so far is the atomic or std::mutex locks.
#include <cstdint>
#include <vector>
#include <mutex>
#include <memory>
#include <atomic>
...
namespace master_thesis
{
namespace encoder
{
class smart_encoder
{
...
void start()
{
...
// In case there is an uneven number of symbols,
// we adjust for it above
else
{
m_pool.enqueue([this, encoder](){
std::vector<std::vector<uint8_t>> total_payload(this->m_coefficients,
std::vector<uint8_t>(encoder->payload_size()));
std::vector<uint8_t> payload(encoder->payload_size());
for (uint32_t j = 0; j < this->m_coefficients; ++j)
{
encoder->write_payload(payload.data());
total_payload[j] = payload; // copy this coded symbol into slot j
}
// lock_guard releases the mutex when the lambda returns,
// even if insert() throws
std::lock_guard<std::mutex> lock(this->m_mutex);
this->m_result.insert(std::end(this->m_result),
std::begin(total_payload),
std::end(total_payload));
++(this->m_completed);
});
}
}
}
bool completed()
{
return m_completed.load() >= (m_threads - 1);
}
std::vector<std::vector<uint8_t>> result()
{
return m_result;
}
private:
uint32_t m_symbols;
uint32_t m_symbol_size;
std::atomic<uint32_t> m_completed;
unsigned int m_threads;
uint32_t m_coefficients;
std::mutex m_mutex;
std::vector<uint8_t> m_data;
std::vector<std::vector<uint8_t>> m_result;
ThreadPool m_pool;
std::vector<std::shared_ptr<rlnc_encoder>> m_encoders;
};
}
}
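
For comparison, here is a minimal sketch of one direction that might avoid both the spinning while loop and the shared m_result, assuming each task can return its own payloads. It uses std::async and std::future from the standard library in place of my ThreadPool class, and the names payload_list, fake_write_payload, encode_chunk and encode_all are hypothetical placeholders, not part of my actual encoder. The main thread blocks in get() instead of polling an atomic counter, and results are merged on a single thread, so no mutex is involved.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <utility>
#include <vector>

using payload_list = std::vector<std::vector<uint8_t>>;

// Hypothetical stand-in for encoder->write_payload(): fills the buffer
// with the worker id so the merged output can be inspected.
void fake_write_payload(std::vector<uint8_t>& payload, uint32_t worker)
{
    for (auto& byte : payload)
        byte = static_cast<uint8_t>(worker);
}

// Each task builds and returns its own payloads. Nothing is shared
// between tasks, so no mutex or atomic counter is needed here.
payload_list encode_chunk(uint32_t worker, uint32_t coefficients,
                          uint32_t payload_size)
{
    payload_list chunk;
    chunk.reserve(coefficients);
    std::vector<uint8_t> payload(payload_size);
    for (uint32_t j = 0; j < coefficients; ++j)
    {
        fake_write_payload(payload, worker);
        chunk.push_back(payload);
    }
    return chunk;
}

// Launches one task per worker and merges the results on the calling thread.
payload_list encode_all(uint32_t workers, uint32_t coefficients,
                        uint32_t payload_size)
{
    assert(workers > 0);
    std::vector<std::future<payload_list>> futures;
    futures.reserve(workers);
    for (uint32_t w = 0; w < workers; ++w)
        futures.push_back(std::async(std::launch::async, encode_chunk,
                                     w, coefficients, payload_size));

    payload_list result;
    result.reserve(static_cast<std::size_t>(workers) * coefficients);
    for (auto& f : futures)
    {
        // get() blocks until the task has finished; no busy-wait loop.
        payload_list chunk = f.get();
        for (auto& p : chunk)
            result.push_back(std::move(p));
    }
    return result;
}
```

Calling encode_all(4, 3, 8) would return 12 payloads of 8 bytes each, ordered by worker. Whether this is actually faster than the single-threaded version would still need to be measured, since the per-payload copies and the thread start-up cost remain.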