3

Context

I'm using a QLinkedList to store some class I wrote.
The fact is I must iterate a lot over this list.
By a lot I mean the program I write makes infinite calculus (well, you can still stop it manually) and I need to get through that QLinkedList for each iteration.

Problem

The problem is not if I'm iterating to much over this list.

It's that I'm profiling my code and I see that 1/4 of the time is spent on QLinkedList::end() and QLinkedList::begin() functions.

Sample code

My code is the following :

typedef QLinkedList<Particle*> ParticlesList;  // Particle is a custom class

ParticlesList* parts = // assign a QLinkedList

for (ParticlesList::const_iterator itp = parts->begin(); itp != parts->end(); ++itp)
{
    //make some calculus
}

Like I said, this code is called so often that it spends a lot of time on parts->begin() and parts->end().

Question

So, the question is how can I reduce the time spent on the iteration of this list ?

Possible solutions

Here are some solutions I've thought of, please help me choose the best or propose me another one :)

  • Use of classic C array : // sorry for this mistake
Particle** parts = // assing it something
for (int n = 0; n < LENGTH; n++)
{
    //access by index
    //make some calculus
}

This should be quick right ?

  • Maybe use Java style iterator ?
  • Maybe use another container ?
  • Asm ? Just kidding... or maybe ?

Thank you for your future answers !

PS : I have read stackoverflow posts about when to profile so don't worry about that ;)

Edit :

The list is modified

I'm sorry I think I forgot the most important, I'll write the whole function without stripping :

typedef std::vector<Cell*> Neighbours;
typedef QLinkedList<Particle*> ParticlesList;

Neighbours neighbours = m_cell->getNeighbourhood();
Neighbours::const_iterator it;

for (it = neighbours.begin(); it != neighbours.end(); ++it) 
{
    ParticlesList* parts = (*it)->getParticles();
    for (ParticlesList::const_iterator itp = parts->begin(); itp != parts->end(); ++itp)
    {
        double d = distanceTo(*itp); // computes sqrt(x^2 + y^2)
        if(d>=0 && d<=m_maxForceRange)
        {
            particleIsClose(d, *itp); // just changes 
        }
    }
}

And just to make sure I'm complete, this whole code is called in a loop ^^.

So yes the list is modified and it is in a inner loop. So there's no way to precompute the beginning and end of it.

And moreover, the list needs to be constructed at each big iteration (I mean in the topmost loop) by inserting one by one.

Debug mode

Yes indeed I profiled in Debug mode. And I think the remark was judicious because the code went 2x faster in Release. And the problem with lists disappeared.

Thanks to all for your answers and sorry for this ^^

ibizaman
  • 3,053
  • 1
  • 23
  • 34
  • Do you always need to process all the items on the list? Do you do some desicions inside the loop? – vz0 Oct 28 '11 at 03:30
  • 2
    That's not a "classic C++ array", but a classic C array, if anything. In C++ use a `std::vector`. – GManNickG Oct 28 '11 at 04:14
  • Anyway, does the list ever change once you start the loop, or can elements be inserted during the loop? – GManNickG Oct 28 '11 at 04:14
  • 1
    How many items are there in the list? If you have one item, for example, you will be calling `begin()`, 2 `end()` and one `QLinkedList::iterator::operator++` in each iteration of the loop... If the list is actually *long* then I don't really believe that `begin()` and `end()` would be the highest cost (unless the body of the loop is almost empty). Consider using `std::vector` for contiguous memory, compile in release mode with optimizations... the usual – David Rodríguez - dribeas Oct 28 '11 at 07:34
  • 1
    You might want to [see this](http://stackoverflow.com/questions/7916985/what-is-boilerplate-code-hot-code-and-hot-spots/7923574#7923574). – Mike Dunlavey Oct 28 '11 at 17:16

4 Answers4

5

If you are profiling in debug mode, a lot of compilers disable inlineing. The begin() and end() times being high may not be "real". The method call times would be much higher than the equivalent inline operations.

Something else I noticed in the full code, you're doing a sqrt in the inner loop. They can be fairly expensive depending on the hardware architecture. I would consider replacing the following code:

 double d = distanceTo(*itp); // computes sqrt(x^2 + y^2) 
 if(d >= 0 && d <= m_maxForceRange) 

with:

 double d = distanceToSquared(*itp); // computes x^2 + y^2
 if(d >= 0 && d <= m_maxForceRangeSquared)

I've done this in code where I was doing collison detection and it sometimes makes a noticible improvement. The tests are equivalent and saves a lot of calls to sqrt. As always with optimization, measure to verify if it improves the speed.

Clinton
  • 22,361
  • 15
  • 67
  • 163
Wayne Tanner
  • 1,346
  • 11
  • 18
2

Pre-computing the end iterator will help if your compiler isn't smart enough to realise it is const, and is hence computing it each time through the loop. You can do that like below:

const ParticlesList::const_iterator itp_end = parts->end();

for (ParticlesList::const_iterator itp = parts->begin(); itp != itp_end; ++itp)
{
    //make some calculus
}

I can't understand why parts->begin(); is taking so long, it should only be used once. However, if this loop is inside another loop, you could do something like this:

const ParticlesList::const_iterator itp_begin = parts->begin();
const ParticlesList::const_iterator itp_end = parts->end();

for (...)
{
  for (ParticlesList::const_iterator itp = itp_begin; itp != itp_end; ++itp)
  {
      //make some calculus
  }
}

But I can't imagine this will make too much difference (unless your inner list is really short), but it shouldn't hurt much either.

On a further note, a linked list possibly isn't the fastest data structure for your purposes. Linked lists are most useful when you frequently need to insert items into the middle of the list. If the list is built and then fixed, you're probably better off with a std::vector. A std::vector may also be better even if you occasionally only need to add/remove items from the end (not the beginning or middle). If you have to add/remove from the beginning/end (but not middle) consider a std::deque.

Clinton
  • 22,361
  • 15
  • 67
  • 163
  • Good answer. Also, hoisted iterators should be made `const` themselves (that is, `const ParticlesList::const_iterator itp_end = ...`). – Drew Hall Oct 28 '11 at 05:54
1

If you absolutely need raw speed you should measure each possible choice you encounter, and keep the fastest.

Sounds like the list remains unchanged while you iterate over it. I'd try by storing the end of the list on a local variable.

typedef QLinkedList<Particle*> ParticlesList;  // Particle is a custom class

ParticlesList* parts = // assign a QLinkedList
ParticlesList::const_iterator end = parts->end();

for (ParticlesList::const_iterator itp = parts->begin(); itp != end; ++itp)
{
    // make some calculus
}
vz0
  • 32,345
  • 7
  • 44
  • 77
  • 2
    Might as well do it this way: `for (ParticlesList::const_iterator itp = parts->begin(), end = parts->end(); ...`. It's shorter and keeps the variable in the for-loop scope. – GManNickG Oct 28 '11 at 04:15
1

Qt containers are compatible with STL algorithms like std::for_each.

Try something like this:

std::for_each( parts->begin(), parts->end(), MyParticleCalculus );

where MyParticleCalculus is a functor that contains your calculus.

Qt also has its own foreach, but it's apparently just a macro to hide the iterators, so it probably won't give you any performance benefit.

(Edit: I'm recommending std::for_each per Scott Meyer's recommendation in "Effective STL": "Prefer algorithm calls to hand-written loops.")

Gnawme
  • 2,321
  • 1
  • 15
  • 21