
I have a project that works with a large number of objects of the same type.

For now I use List<Person>, but it seems that looping through this list is hard when I have about 1 000 000 items.

In the loop, a method is called on each Person; new items are randomly generated, and some items are deleted.

What can I do to optimize this loop? Should I change the collection type, or move the items to a database?

This is what the loop looks like:

while (_worldIsLiving)
{
    int beginningPopulation = WorldPopulationNumber;

    for (int i = 0; i < beginningPopulation; i++)
    {
        _worldPopulation[i].InvokeSkills(this);
    }

    // Remove dead persons
    for (int i = 0; i < WorldPopulationNumber;)
    {
        if (_worldPopulation[i].IsDead()) _worldPopulation.RemoveAt(i);
        else i++;
    }

    WorldCycles++;
    if (_hasStopCondition && (WorldPopulationNumber >= WorldMaximumPopulation || WorldPopulationNumber == 0))
        Destroy();
}

Inside _worldPopulation[i].InvokeSkills(this), new persons can be generated.

Skill has chanceToBeInvoken and chanceTobeInherited fields.

Markiian Benovskyi
  • The only thing faster to iterate through would be a plain array, but that would likely make only a marginal difference. Any optimisations would typically be in the code in the loop, rather than the container. To answer this question, we need to know more about the code in the loop. – Matthew Watson Oct 10 '16 at 13:01
  • When you say "looping through this list is hard", what do you mean by "hard"? It's just as easy as looping through a list with just a few elements - it just takes longer. – Enigmativity Oct 10 '16 at 13:07
  • Items are stored in-memory. `Person` has a List of skills, and each Skill has a chance to be invoked. The code in the main loop: I invoke a method in the `Person` class that invokes random skills. @MatthewWatson Looping is hard because with a larger number of items one cycle takes about a minute. @Enigmativity – Markiian Benovskyi Oct 10 '16 at 13:14
  • It seems very likely that focusing on the loop rather than the operations performed in the loop is a waste of time. As such, your question completely lacks the information to be answered properly. – spender Oct 10 '16 at 13:15
  • @MarkBenovsky `_worldPopulation.RemoveAt(i)` on a list will be an expensive operation. It involves shunting every subsequent item down by a position, and you'll be doing this multiple times for every "dead" instance. This will multiply up to quite an overhead for a long list, with O(n²) performance (if my CS hat is being worn correctly today). It would likely be much faster to just write out a new list: `_worldPopulation = _worldPopulation.Where(p => !p.IsDead()).ToList()` – spender Oct 10 '16 at 13:46
  • Voted to reopen as OP has provided more info. – spender Oct 10 '16 at 13:50

1 Answer


_worldPopulation.RemoveAt(i) on a list will be an expensive operation.

It involves shunting every subsequent item down by a position. You're calling this multiple times in every iteration of the outer loop, once for every "dead" instance.

This will multiply up to a very significant overhead for a long list, with O(n²) complexity (if my CS hat is being worn correctly today).

It would likely be much faster to just write out a new list:

_worldPopulation = _worldPopulation.Where(p => !p.IsDead()).ToList();

If this still seems expensive, is it important to prune the list at all? (can the list hold all members of the population both alive and dead, or will this strain the available memory?)

You could, for instance:

// Snapshot the living population first so that any Persons added by
// InvokeSkills don't invalidate the enumeration.
var livingPopulace = _worldPopulation.Where(p => !p.IsDead()).ToList();
foreach (var pop in livingPopulace)
{
    pop.InvokeSkills(this);
}
var newPopulationCount = _worldPopulation.Count(p => !p.IsDead());

Although this requires 2 sweeps of the collection, it's still going to be fewer passes over the collection than using RemoveAt, especially if the death-rate per cycle is high. You're back to O(n) complexity, and I suspect that with a large collection this will be significantly more efficient than using RemoveAt.

If neither of these is satisfactory, you might consider using a LinkedList<T> as your container, which supports easy forward iteration and low-cost removal (at the expense of random-access indexing), but there might be other constraints that make this impractical. Nothing in the code you supplied suggests that it wouldn't work. Just don't be tempted by LINQ operators such as ElementAt to get around the random-access limitation, or you'll be back to the same kind of problem.
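
For illustration, one cycle with a LinkedList<Person> might look roughly like this. It's only a sketch: the field name _population is my own invention, and I'm assuming InvokeSkills appends any newborns to the tail of the list with AddLast.

var lastOfCycle = _population.Last;   // persons born this cycle are appended after this node
var node = _population.First;
while (node != null)
{
    // Grab the successor first, and stop after the node that was last when the cycle began.
    var next = node == lastOfCycle ? null : node.Next;

    node.Value.InvokeSkills(this);    // may append newborns at the tail
    if (node.Value.IsDead())
        _population.Remove(node);     // O(1) unlink; no shifting of later elements

    node = next;
}

This folds the skill pass and the removal pass into a single walk, and removing a node never touches the rest of the list.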

spender