
I'm working on a .NET app that uses an object composed of 5 Lists and 1 Hashtable. This object is used within a loop that iterates at least 500 times to run some analysis. On each iteration the object should start empty, so I was wondering: is it more efficient to call Clear on each of the Lists and the Hashtable, or should I just re-initialize the object?
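
To make the comparison concrete, here is a minimal sketch of the two approaches being weighed; the AnalysisData type, its fields, and the RunAnalysis placeholder are illustrative only, not the actual code:

    using System.Collections;
    using System.Collections.Generic;

    class AnalysisData
    {
        // Stand-ins for the five Lists and one Hashtable described above.
        public List<int> Results = new List<int>();
        public List<int> Errors = new List<int>();
        // ...three more Lists in the real object...
        public Hashtable Lookup = new Hashtable();

        public void ClearAll()
        {
            Results.Clear();
            Errors.Clear();
            Lookup.Clear();
        }
    }

    class Program
    {
        static void RunAnalysis(AnalysisData data) { /* analysis omitted */ }

        static void Main()
        {
            // Option 1: reuse a single instance, clearing it on each iteration.
            var reused = new AnalysisData();
            for (int i = 0; i < 500; i++)
            {
                reused.ClearAll();
                RunAnalysis(reused);
            }

            // Option 2: construct a fresh instance on each iteration.
            for (int i = 0; i < 500; i++)
            {
                RunAnalysis(new AnalysisData());
            }
        }
    }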

I know I could write code to benchmark this, but I'm wondering if someone has already been down this path?

Thanks.

webber
  • "I know I could write code to benchmark this" - your circumstances and code are different to everyone else. Do the benchmark. I would add that this appears to be a micro-optimization. Are you actually experiencing performance issues with your code as it stands? – Oded Apr 24 '12 at 12:31
  • It really depends on what you are doing with the object and collections. You can even put a flag and say it's empty, then overwrite contents without clearing. – Mert Akcakaya Apr 24 '12 at 12:32
  • possible duplicate of [What is faster in c#: clear collection or instantiate new](http://stackoverflow.com/questions/10901020/what-is-faster-in-c-clear-collection-or-instantiate-new) – nawfal Jun 02 '14 at 15:01

2 Answers


The cost of creating 3000 empty collections will be tiny. Unless your "analysis" is really trivial, this isn't going to be significant at all. Write the clearest code you can - which is likely to be creating a new set of collections each time rather than reusing them. You should only reuse an object if the logical operation is to reuse it.

Once you've written the code in the most readable way, test whether it performs as well as you need it to. If it doesn't, then you can start micro-optimizing.

I would, however, strongly recommend that you use Dictionary<,> instead of Hashtable.
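
As a rough illustration (the key and value types here are arbitrary examples), Dictionary<,> is strongly typed, so you avoid the boxing and casting that Hashtable imposes on value types and you get compile-time type checking:

    using System;
    using System.Collections;
    using System.Collections.Generic;

    class Example
    {
        static void Main()
        {
            // Hashtable stores keys and values as object: the int is boxed
            // going in and must be cast (and unboxed) coming out.
            Hashtable table = new Hashtable();
            table["answer"] = 42;
            int fromTable = (int)table["answer"];

            // Dictionary<,> is generic: no boxing of the int values, and
            // mistakes in the key or value type fail at compile time.
            var dict = new Dictionary<string, int>();
            dict["answer"] = 42;
            int fromDict = dict["answer"];

            Console.WriteLine("{0} {1}", fromTable, fromDict);
        }
    }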

Jon Skeet
  • Even in a benchmark creating 100,000 empty collections I'm only seeing a difference of 00:00.001 in the timings. Readability first and foremost. Nice answer Jon. – Jamie Dixon Apr 24 '12 at 12:39

While I agree with the other answer that this is a micro-optimization, in the interests of answering the question, I have found that creating a new List is slightly faster than calling Clear. Here's my benchmark code:

    using System;
    using System.Collections.Generic;

    class Program
    {
        static void Main(string[] args)
        {
            var start = DateTime.Now;
            List<string> lst = new List<string>();

            for (int i = 0; i < 3000; ++i)
            {
                // Swap which of the next two lines is commented out to test each approach.
                //lst = new List<string>();
                lst.Clear();

                for (int j = 0; j < 500; ++j)
                {
                    lst.Add(j.ToString());
                }
            }

            Console.WriteLine("{0} ms", (DateTime.Now - start).Ticks / TimeSpan.TicksPerMillisecond);
            Console.ReadLine();
        }
    }

Over five runs, new List averaged 340.8 ms, and Clear averaged 354.8 ms.

However, the results are so close that it's clear that:

  1. The difference is probably meaningless
  2. I probably wasted my time by performing this benchmark
Ian Newson
  • I don't think you wasted your time at all. Thanks for putting forth the effort. – DOK Apr 24 '12 at 13:01
  • One, you are using the DateTime class for this purpose. Use Stopwatch instead for greater accuracy. Two, you're doing another major operation in your testing code (i.e. `List.Add`) which itself takes a lot of time, giving you hardly meaningful values. Three, `new List()` is a different operation compared to `.Clear()` in that the latter preserves capacity (so that later additions become faster). You could achieve the same in the former case with `new List(lst.Capacity)`. +1 for the benchmark still. – nawfal Jun 02 '14 at 15:05
  • @nawfal Thanks for the comments and upvote! Totally agree with the point regarding Stopwatch; nowadays I tend to use that class over DateTime, and would urge others to do the same. I also agree with the point about lst.Add: in hindsight I should have populated the list outside of the region of code being benchmarked. While this obscures the true results in terms of elapsed time, since the same code is used for both test cases, the result, in terms of which is fastest, should still be valid. – Ian Newson Jun 02 '14 at 15:31
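
Following up on that exchange, here is a sketch (not part of the original answer) of how the measurement could be tightened: Stopwatch instead of DateTime, the strings built outside the timed region, and the new-list case given the same capacity so it is compared like-for-like with Clear:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;

    class ClearVsNewBenchmark
    {
        static void Main()
        {
            // Build the strings up front so formatting stays outside the timed region.
            string[] values = new string[500];
            for (int j = 0; j < values.Length; ++j)
                values[j] = j.ToString();

            var lst = new List<string>(values.Length);
            var sw = Stopwatch.StartNew();

            for (int i = 0; i < 3000; ++i)
            {
                // Swap which of the next two lines is commented out to test each approach.
                // Passing lst.Capacity mirrors Clear(), which preserves the existing capacity.
                //lst = new List<string>(lst.Capacity);
                lst.Clear();

                for (int j = 0; j < values.Length; ++j)
                {
                    lst.Add(values[j]);
                }
            }

            sw.Stop();
            Console.WriteLine("{0} ms", sw.ElapsedMilliseconds);
        }
    }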