1

EDIT: I've fixed the insertion. As Blastfurnace kindly mentioned the insertion invalidated the iterators. The loop is needed I believe to compare performance (see my comment on Blastfurnance's answer). My code is updated. I have completely similar code for the list just with vector replaced by list. However, with the code I find that the list performs better than the vector both for small and large datatypes and even for linear search (if I remove the insertion). According to http://java.dzone.com/articles/c-benchmark-%E2%80%93-stdvector-vs and other sites that should not be the case. Any clues to how that can be?

I am taking a course on programming of mathematical software (exam on monday) and for that I would like to present a graph that compares performance between random insertion of elements into a vector and a list. However, when I'm testing the code I get random slowdowns. For instance I might have 2 iterations where inserting 10 elements at random into a vector of size 500 takes 0.01 seconds and then 3 similar iterations that each take roughly 12 seconds. This is my code:

    void AddRandomPlaceVector(vector<FillSize> &myContainer, int place)  {
    int i = 0;

    vector<FillSize>::iterator iter = myContainer.begin();

    while (iter != myContainer.end())
    {
        if (i == place)
        {
            FillSize myFill;
            iter = myContainer.insert(iter, myFill);
        }
        else
            ++iter;

        ++i;
    }

    //cout << i << endl;
}

double testVector(int containerSize, int iterRand)
{
    cout << endl;
    cout << "Size: " << containerSize << endl << "Random inserts: " << iterRand << endl;

    vector<FillSize> myContainer(containerSize);
    boost::timer::auto_cpu_timer tid;

    for (int i = 0; i != iterRand; i++)
    {
        double randNumber = (int)(myContainer.size()*((double)rand()/RAND_MAX));
        AddRandomPlaceVector(myContainer, randNumber);
    }

    double wallTime = tid.elapsed().wall/1e9;

    cout << "New size: " << myContainer.size();

    return wallTime;
}

int main()
{
    int testSize = 500;
    int measurementIters = 20;
    int numRand = 1000;
    int repetionIters = 100;


    ofstream tidOutput1_sum("VectorTid_8bit_sum.txt");
    ofstream tidOutput2_sum("ListTid_8bit_sum.txt");

    for (int i = 0; i != measurementIters; i++)
    {
        double time = 0;

        for (int j = 0; j != repetionIters; j++) {
            time += testVector((i+1)*testSize, numRand);
        }

        std::ostringstream strs;
        strs << double(time/repetionIters);
        tidOutput1_sum << ((i+1)*testSize) << "," << strs.str() << endl;
    }

    for (int i = 0; i != measurementIters; i++)
    {
        double time = 0;

        for (int j = 0; j != repetionIters; j++) {
            time += testList((i+1)*testSize, numRand);
        }

        std::ostringstream strs;
        strs << double(time/repetionIters);
        tidOutput2_sum << ((i+1)*testSize) << "," << strs.str() << endl;
    }
    return 0;
}

struct FillSize
{
    double fill1;
};

The struct is just for me to easily add more values so I can test for elements with different size. I know that this code is probably not perfect concerning performance-testing, but they would rather have me make a simple example than simply reference to something I found.

I've tested this code on two computers now, both having the same issues. How can that be? And can you help me with a fix so I can graph it and present it Monday? Perhaps adding some seconds of wait time between each iteration will help?

Kind regards, Bjarke

pir
  • 5,513
  • 12
  • 63
  • 101
  • `AddRandomPlaceVector` should be `myContainer.insert(myContainer.begin()+place, myFill);` That for loop is slow and unneeded. Not 12 seconds slow though. Nothing here should take 12 seconds if your vector size is really 500. – David Dec 14 '13 at 12:08
  • Thanks Dave, please see my edit and comment on Blastfurnace's answer. Maybe you can give me a hint on that as well? – pir Dec 14 '13 at 14:35

1 Answers1

1

Your AddRandomPlaceVector function has a serious flaw. Using insert() will invalidate iterators so the for loop is invalid code.

If you know the desired insertion point there's no reason to iterate over the vector at all.

void AddRandomPlaceVector(vector<FillSize> &myContainer, int place)
{
    FillSize myFill;
    myContainer.insert(myContainer.begin() + place, myFill);
}
Community
  • 1
  • 1
Blastfurnace
  • 18,411
  • 56
  • 55
  • 70
  • Ah, of course. I thought of that, but then I would use the random access feature of vectors in my comparison between vectors and lists. As my comparison should illustrate a linear search (check for condition) then that would skew the performance. However, I will look into not invalidating my iterator! – pir Dec 14 '13 at 12:22
  • So I've updated my original post, but found another issue. Maybe you can help me with that as well? :) – pir Dec 14 '13 at 14:35
  • @user1452257: Are you testing an optimized/release build? Does the `list` code iterate over the __entire__ list on insert as well? I have to say that `std::vector` is a poor container choice if random insertions are the dominant operation. – Blastfurnace Dec 14 '13 at 14:48
  • No, I am testing a "debugging" build in CodeBlocks without any effort on optimization. That might be it. The code is exactly the same for the list. I could have the code not iterate over the entire container, but I guess if it is the same operation on the two containers then I won't matter. My thought is that "there just happens to be one element that fits the criteria", but the code should resemble a conditional search algorithm. According to also http://stackoverflow.com/questions/238008/relative-performance-of-stdvector-vs-stdlist-vs-stdslist I thought vector would be faster. – pir Dec 14 '13 at 15:01
  • @user1452257: Oh yeah, profiling an unoptimized build isn't a good idea. The standard library containers and algorithms heavily rely on templates and __really__ benefit from compiler optimizations. – Blastfurnace Dec 14 '13 at 15:04
  • Is there some easy way to make it "optimized"? ;) – pir Dec 14 '13 at 15:10
  • @user1452257: It depends on the compiler. I think CodeBlocks uses `gcc` (or is it `g++`) so the compiler command-line option `-O2` should enable optimizations. Not sure how to specify that in CodeBlocks. You should be able to google up an answer for that. Note that `-O2` is just one commonly used option, there are many more. – Blastfurnace Dec 14 '13 at 15:14
  • Thanks, -O2 worked brilliantly! Can you show me how to change my code to use templates so I can use "any" container and struct for testing? I've tried to contact you through chat, but I don't have enough reputation :) – pir Dec 14 '13 at 16:35
  • @user1452257: Although the basics aren't complicated, _templates_ are a large subject. I'd like to help but I think you'd be better off following a good C++ book. Have you seen [The Definitive C++ Book Guide and List](http://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list)? – Blastfurnace Dec 14 '13 at 16:50