I am trying to compare the performance of different lock-free queues, so I want to write a unit test that pushes user-defined, pre-built objects to the queue and pops them from it. I have a couple of questions:

1) How can I create pre-built objects in a simple manner? Would creating an array, as I did below, serve the purpose?

2) I am getting this error: "terminate called after throwing an instance of 'std::system_error' what(): Invalid argument Aborted (core dumped)". What causes it?

Thanks in advance.

#include <cstdlib>
#include <stdio.h>
#include <string>
#include <chrono>
#include <iostream>
#include <ctime>
#include <atomic>
#include <thread>
#include <boost/lockfree/queue.hpp>

using namespace std;

const long NUM_DATA = 10;
const int NUM_PROD_THREAD = 2;
const int NUM_CONSUM_THREAD = 2;
const long NUM_ITEM = 1000000;


class Data
{
public:
    Data(){}
    void dataPrint() {cout << "Hello";}
private:
    long i;
    double j;
};


Data *DataArray = new Data[NUM_DATA];
boost::lockfree::queue<Data*> BoostQueue(1000);

struct Producer
{
    void operator()()
    {
        for(long i=0; i<1000000; i++)
            BoostQueue.push( DataArray );
    }
};


struct Consumer
{
    Data *pData;
    void operator()()
    {
        while (  BoostQueue.pop( pData ) ) ;
    }
};


int main(int argc, char** argv)
{
    std::thread thrd [NUM_PROD_THREAD + NUM_CONSUM_THREAD];

    std::chrono::duration<double> elapsed_seconds;

    auto start = std::chrono::high_resolution_clock::now();
    for ( int i = 0; i < NUM_PROD_THREAD;  i++ )
    {
        thrd[i] = std::thread{ Producer() };
    }

    for ( int i = 0; i < NUM_CONSUM_THREAD; i++ )
    {
        thrd[NUM_PROD_THREAD+i] = std::thread{Consumer()};
    }

    for ( int i = 0; i < NUM_CONSUM_THREAD; i++ )
    {
        thrd[i].join();
    }

    auto end = std::chrono::high_resolution_clock::now();
    elapsed_seconds = end - start;
    std::cout << "Enqueue and Dequeue 1 million item in:" << elapsed_seconds.count() << std::endl;

    for ( int i = 0; i < NUM_PROD_THREAD; i++ )
    {
        thrd[i].join();
    }

    return 0;
}
Amit
user225008
  • I can't see why you're crashing - try adding some trace or using a debugger. You're not using the `Data` array at all... just pushing and popping pointers to it. For benchmarking, perhaps just push successive numbers and get the consumers to add them up, getting the per-thread totals when you `join`. You could also start the numbers at `argc`, just so the compiler can't have any expectations. I like to see evidence of work in my benchmarks, just so I know code does what's intended, and can't have been optimised away or done at compile time. – Tony Delroy Aug 29 '14 at 07:47
  • I am getting this output: Enqueue and Dequeue 1 million item in:0.145872 terminate called after throwing an instance of 'std::system_error' what(): Invalid argument Aborted (core dumped). The error is what worries me. Furthermore, will pushing numbers into the queue have the same effect on performance as pushing user-defined objects? – user225008 Aug 29 '14 at 10:23
  • Oh - the crash must be because you've already joined those same thread ids... you forgot to use `NUM_PROD_THREAD+` on the array index when joining the consumer threads. Your user-defined object array isn't currently being used during the production/consumption profiling, even the array element construction happens statically before the first line in `main()` executes. Pushing an `int` vs. pushing any kind of pointer will take similar time... doing a little adding won't add much compared to even lock-free queue operations. You could separately profile how long it takes to add in 1 thread. – Tony Delroy Aug 29 '14 at 10:35
  • Thank you! I spent the whole day trying to remove this error :D. Just one last question. I think I get what you're saying about pushing ints vs. pointers, but if I want to push pre-built objects into the queue, how can I do that? I want to do it without relying on containers. A simpler approach would be helpful. – user225008 Aug 29 '14 at 10:46
  • You're welcome. Well, if the producer pushes a pointer to an object, then the consumer can get the pointer as you do... you just don't happen to use it and do all the creation outside the timed code, which is why I say you might as well have used `int`s for the purposes of benchmarking. In your real app, just `new` the objects one-by-one before `push`ing pointers to them, then when the consumer pops a pointer to the object it can process the object before calling `delete`. Is that what you mean? – Tony Delroy Aug 29 '14 at 11:08
  • I think it would be better if I point you to the link for the paper that has the benchmark I want to implement: http://stackoverflow.com/questions/2945312/optimistic-lock-free-fifo-queues-impl. In section 4.1, the author states that "we use an array of nodes that are allocated in advance and these are pushed into the queue". Does this mean that the object data is pushed into the queue, or just the pointer to the object? – user225008 Aug 30 '14 at 15:49
  • I had a quick read but couldn't see any discussion of what `data_type` could be, but my expectation is that you'd put small data items (up to the size of a pointer, as that's what can be handled atomically) directly into the structure, and anything larger would need to be handled by pointer. – Tony Delroy Sep 01 '14 at 02:33
  • Is it possible for you to show me how this can be done in the benchmark because I am getting a bit confused. Thank you. – user225008 Sep 01 '14 at 03:50
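
The crash diagnosed in the comments above comes from calling join() twice on the same threads: the first join loop in the question indexes thrd[i] instead of thrd[NUM_PROD_THREAD+i], so the producers are joined again by the last loop. A minimal sketch of that behaviour follows, using only the standard library. The helper names second_join_throws and join_all are made up for illustration and are not part of the original code.

```cpp
#include <cstddef>
#include <system_error>
#include <thread>

// Returns true if joining the same std::thread a second time throws
// std::system_error, which is what the mis-indexed join loop triggers.
bool second_join_throws()
{
    std::thread t([] {});
    t.join();                 // first join succeeds
    try {
        t.join();             // the thread is no longer joinable
    } catch (const std::system_error&) {
        return true;
    }
    return false;
}

// Defensive variant (hypothetical helper): join only threads that are
// still joinable, so an indexing slip cannot join a thread twice.
template <std::size_t N>
void join_all(std::thread (&threads)[N])
{
    for (auto& t : threads)
        if (t.joinable())
            t.join();
}
```

In the question's main(), fixing the index in the first join loop is the actual correction. A joinable() check only guards against the exception and would hide the fact that the consumers were never joined before the timing stopped.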

1 Answer

This just illustrates how to use the Data elements in the benchmark. It does add a cout inside the measured time, which isn't ideal but probably isn't significant either.

class Data
{
public:
    Data(long i) : i_(i), j(0.0) {}
    void dataPrint() {cout << "Hello";}
    long i_;   // public so that Consumer can read it via pData->i_
private:
    double j;
};


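Data* dataArray[NUM_ITEM];
// Runs before main(), outside the timed code. Fill every slot the producers
// push (NUM_ITEM, not NUM_DATA); otherwise null pointers reach the consumers.
const bool dataArrayFilled = [] {
    for (long i = 0; i < NUM_ITEM; ++i) dataArray[i] = new Data(i);
    return true;
}();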

boost::lockfree::queue<Data*> BoostQueue(1000);

struct Producer
{
    void operator()()
    {
        for(long i=0; i<1000000; i++)
            BoostQueue.push( dataArray[i] );
    }
};


struct Consumer
{
    Data *pData;
    long result_;
    void operator()()
    {
        result_ = 0;
        while (  BoostQueue.pop( pData ) )
            result_ += pData->i_;
        std::cout << result_ << '\n';
    }
};


int main(int argc, char** argv)
{
    std::thread thrd [NUM_PROD_THREAD + NUM_CONSUM_THREAD];

    std::chrono::duration<double> elapsed_seconds;

    auto start = std::chrono::high_resolution_clock::now();
    for ( int i = 0; i < NUM_PROD_THREAD;  i++ )
        thrd[i] = std::thread{ Producer() };

    for ( int i = 0; i < NUM_CONSUM_THREAD; i++ )
        thrd[NUM_PROD_THREAD+i] = std::thread{Consumer()};

    for ( int i = 0; i < NUM_CONSUM_THREAD; i++ )
        thrd[NUM_PROD_THREAD+i].join();

    auto end = std::chrono::high_resolution_clock::now();
    elapsed_seconds = end - start;
    std::cout << "Enqueue and Dequeue 1 million item in:"
        << elapsed_seconds.count() << std::endl;

    for ( int i = 0; i < NUM_PROD_THREAD; i++ )
        thrd[i].join();
    for (int i = 0; i < 1000000; ++i)
        delete dataArray[i];
}
Tony Delroy
  • Thank you for the illustration. However, I have noticed two things. First, there is not much difference in execution time between the two implementations (the previous one and this one). Second, I get a segmentation fault when I use result_ += pData->i_ after making i_ public. – user225008 Sep 01 '14 at 05:51
  • @user225008: the idea is to do some really trivial work so it doesn't change the execution time much, but does at least illustrate accessing the object that the queue has passed pointers to so it's indicative of minimal "real" work. For the segfault - you might reduce the loop and data from 1 million to say 10, and print (`cout`) the `pData` values being pushed and popped to ensure it's successfully popping valid pointers. – Tony Delroy Sep 01 '14 at 06:17
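
The checks suggested in the comments (evidence of work, and confirming that valid pointers are popped) can be sketched as below. This is a rough illustration under stated assumptions, not the benchmark from the paper. StandInQueue is a made-up, mutex-based stand-in that only mirrors the bool push(T) / bool pop(T&) shape of boost::lockfree::queue so the sketch builds without Boost. run_checked and producersLeft are likewise hypothetical names. Unlike the loops above, the consumers here keep polling until every producer has finished, so an empty queue at start-up does not end them early.

```cpp
#include <atomic>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct Data { long i_; double j; };

// Hypothetical stand-in with the same push/pop shape as
// boost::lockfree::queue<Data*>; it is NOT lock-free.
class StandInQueue
{
public:
    bool push(Data* p)
    {
        std::lock_guard<std::mutex> lock(m_);
        q_.push(p);
        return true;
    }
    bool pop(Data*& p)
    {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return false;
        p = q_.front();
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::queue<Data*> q_;
};

// Each producer pushes a pointer to every pre-built item once. Returns the
// sum of i_ seen by all consumers so the caller can verify nothing was lost.
long run_checked(std::vector<Data>& items, int producers, int consumers)
{
    StandInQueue queue;
    std::atomic<int> producersLeft(producers);
    std::atomic<long> total(0);
    std::vector<std::thread> threads;

    for (int p = 0; p < producers; ++p)
        threads.emplace_back([&] {
            for (Data& d : items)
                while (!queue.push(&d)) {}   // retry if a push is refused
            --producersLeft;
        });

    for (int c = 0; c < consumers; ++c)
        threads.emplace_back([&] {
            long local = 0;
            Data* pData = nullptr;
            for (;;) {
                // Read the flag BEFORE popping: if producers were already
                // done and the pop still fails, the queue is truly drained.
                bool done = (producersLeft.load() == 0);
                if (queue.pop(pData))
                    local += pData->i_;      // touch the object: "real" work
                else if (done)
                    break;
            }
            total += local;
        });

    for (auto& t : threads) t.join();
    return total.load();
}
```

With items numbered 0 to N-1 and P producers, the returned total should equal P * N * (N - 1) / 2. A mismatch would point to lost or duplicated items rather than a timing problem. To time a real lock-free queue, the stand-in would be replaced by the queue under test and the clock started and stopped around the call.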