-2

I'm unable to determine whether having a growing and shrinking List vs using a big bool Array will be more efficient for my application.

To expand on this comparison and the actual situation, here are examples of each option I believe I have:

Option 1 (List):

public List<int> list = new List<int>();
while (true) { // game loop
    list.Add(Random.Range(0-300));
    list.Add(Random.Range(0-300));
    ...  // maximum of 10 of these can happen

    if (list.Contains(42)) { // roughly 10 - 50 of these checks can be true
        list.Remove(42);
    }
}

Option 2 (Array):

bool[] arr = new bool[300];
while (true) { // game loop
    arr[Random.Range(0-300)] = true;
    arr[Random.Range(0-300)] = true;
    ... // maximum of 10 of these can happen

    for (int i = 0; i < 300; i++) {
        if (arr[i]) { // roughly 10 - 50 of these checks can be true
            arr[i] = false;
        }
    }
}

So essentially my question is:

At what point does too many .Contains checks become more expensive than a for loop over each possible element (based on my ranges)?

IMPORTANT

This is not a List vs Array question. The datatypes are important because of the condition checks. So it is specifically an integer list vs bool array comparison since these two options can give me the same results.

FanManPro
  • 1,076
  • 1
  • 13
  • 31
  • 2
    Your two versions of the code doesn't seem to be doing the same thing. Typo somewhere? – Sweeper Dec 20 '18 at 02:10
  • 1
    https://ericlippert.com/2012/12/17/performance-rant/ – mjwills Dec 20 '18 at 02:23
  • Guess what you should have searched is how is contains implemented? As per MSDN documentation it does a linear search, so O(N), which is kind of similar to searching an array I guess. Nice link @mjwills. – peeyush singh Dec 20 '18 at 02:49
  • Reason I can't just benchmark it is because it is not as easy as just putting the two up against each other. I have to actually build out the application and at that stage I would be too invested in an option. A lot more happens after the ellipses `...`. – FanManPro Dec 20 '18 at 02:54
  • @Sweeper correct there was a typo. I fixed it, sorry. – FanManPro Dec 20 '18 at 02:54
  • @peeyushsingh I believe my typo and phrasing of my question might be a bit confusion. Just note that it is intentionally a list of integers and a array of booleans. – FanManPro Dec 20 '18 at 03:01
  • 2
    The equivalent of `Contains(42)` with the array would just be `if(arr[42])` and to "remove" it would just be `arr[42] = false;`which is obviously faster. Basically you trade lookup time for space which is usually the trade off with performance. Another difference to keep in mind is that the list can contain duplicates, but the array implementation only indicates if a number was randomly selected, but not how many times. – juharr Dec 20 '18 at 04:32

2 Answers2

1

I would say the array implementation would be much faster. In addition to the cost of resizing the array internally when you call List.Add(T) or List.Remove(T), if you check the List implementation code. You will notice the List.Contains(T) and List.Remove(T) both are using IndexOf(T) in which I believe is having looping/iteration through the list internally. In your example, you want to call List.Contains(T) and List.Remove(T) around 10-50 times. It means at best case it will cost you 20 (contains+remove), but in the worst case it will cost you (N * 50) + N where N is the number of items in your list.

With this information, I could conclude if your list growing bigger, the performance will much worse.

If you're looking more into performance, maybe it's worth taking a look at HashSet data structure. It has much better performance in look up and remove operations than a List.

Anang Satria
  • 1,226
  • 1
  • 15
  • 19
0

Here's an interesting writeup on Array vs List for both for, foreach, EnumerableForEach and Sum by Jon Skeet:

https://codeblog.jonskeet.uk/2009/01/29/for-vs-foreach-on-arrays-and-lists/

As per the article, the performance goes like this:

============ int[] ============
For                 1.00
ForHoistLength      2.03
ForEach             1.36
IEnumerableForEach 15.22
Enumerable.Sum     15.73

============ List<int> ============
For                 2.82
ForHoistLength      3.49
ForEach             4.78
IEnumerableForEach 25.71
Enumerable.Sum     26.03

Results can be quantified over like int array for a for loop is 2.8 times faster. If you know the size of an array and its fixed, go with Array, else List.

Here is another link: Performance of Arrays vs. Lists

and also, stay away from Linq for large data and go with for/foreach loops.

Gauravsa
  • 6,330
  • 2
  • 21
  • 30
  • I know that arrays will be better than lists, the comparison is more about an int list vs a bool array where the index in the array acts as the "int" and the value being the condition. – FanManPro Dec 20 '18 at 02:56
  • The results are for summing up the array or list, not for lookup time. Though generally I would expect the array to be a faster than list. – peeyush singh Dec 20 '18 at 03:03
  • It would be nice if you rerun the benchmark today, because a lot changed in the last ten years. – Antonín Lejsek Dec 20 '18 at 04:12
  • 1
    `the comparison is more about an int list vs a bool array` Your original question @FanusduToit , where the array was an int array rather than a bool array, likely confused most of the readers (including Guaravsa and myself). – mjwills Dec 20 '18 at 04:40