1

I am looking at some legacy code. The class uses an ArrayList to hold its items. The items are fetched from a database table, and there can be up to 6 million of them. The class exposes a property called 'ListCount' to get the number of items in the ArrayList.

class Settings
{
    ArrayList settingsList;
    public Settings()
    {
      settingsList = GetSettings(); // Get the settings from the DB. Can also return null.
    }

    public int ListCount
    {
        get
        {
           if (settingsList == null )
             return 0;
           else
             return settingsList.Count;
        }
    }
}

ListCount is used to check whether there are any items in the list. I am considering adding an 'Any' method to the class.

public bool Any(Func<vpSettings, bool> predicate)
{
   return settingsList != null && settingsList.Cast<vpSettings>().Any(predicate);
}

The question is: does the framework do some kind of optimization and maintain a count of the items, or does it iterate over the ArrayList to get the count? Would it be advisable to add the 'Any' method as above?

Marc Gravell, in the following question, advises using Any for IEnumerable:

Which method performs better: .Any() vs .Count() > 0?

AlwaysAProgrammer
  • Can you show `GetSettings`? As for the example, `IList` and `IDictionary` derive from `ICollection`, which is one of the benefits of that extra layer of flexibility. An `Array` should have less overhead and be faster. – Greg Apr 29 '14 at 22:23
  • @Greg The GetSettings() method does a bunch of things and calls methods in other classes to ultimately read from the DB. Honestly, the code for that method is all over the place. – AlwaysAProgrammer Apr 29 '14 at 22:26

4 Answers

4

The .NET reference source says that ArrayList.Count returns a cached private variable.

For completeness, the source also lists the implementation of the Any() extension method here. Essentially the extension method does a null check and then tries to get the first element via the IEnumerable's enumerator.
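As a rough illustration of that description, here is a minimal sketch of the same steps. It is not the framework's code: `AnySketch` and `HasAny` are hypothetical names, and the helper takes the non-generic `IEnumerable` (an assumption made so it can be called on an ArrayList directly), whereas the real extension method is generic.

```csharp
using System;
using System.Collections;

public static class AnySketch
{
    // Hypothetical helper that mirrors the behavior described above:
    // a null check, then an attempt to advance to the first element.
    public static bool HasAny(IEnumerable source)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));

        // Getting the enumerator allocates an object; reading Count does not.
        IEnumerator enumerator = source.GetEnumerator();
        try
        {
            // True as soon as one element exists; the rest is never visited.
            return enumerator.MoveNext();
        }
        finally
        {
            // Some enumerators are disposable; release them if so.
            (enumerator as IDisposable)?.Dispose();
        }
    }
}
```

So a check like this never walks all 6 million items, but it still does more work than reading the cached count.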

  • It's one of my favorite things to come out of the Roslyn project - if you haven't seen it before, there is also a way to integrate it with Visual Studio so that you can inspect/step through the code. Very handy for getting quick answers (or figuring out the structure). Here's [more info](http://blogs.msdn.com/b/dotnet/archive/2014/02/24/a-new-look-for-net-reference-source.aspx) – Andrew Varnerin Apr 29 '14 at 22:28
  • @Yogendra The .NET reference source did not itself come out of the Roslyn project; it has been available since 2007. The experience has, however, recently been enhanced in connection with Roslyn. See also http://visualstudiomagazine.com/articles/2014/02/26/dotnet-source-updated-via-roslyn.aspx. – phoog Apr 30 '14 at 17:34
2

ArrayList implements IList (and therefore ICollection), so reading its count should be faster than calling .Any(). The reason is that it exposes the Count property, as opposed to the Count() extension method. The Count() method does a quick type check and then reads the appropriate Count property when one is available.

Which looks similar to:

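// Fast path: a generic collection already knows its size.
ICollection<TSource> collection1 = source as ICollection<TSource>;
if (collection1 != null)
    return collection1.Count;

// Fast path: a non-generic collection (such as ArrayList) does too.
ICollection collection2 = source as ICollection;
if (collection2 != null)
    return collection2.Count;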
Greg
2

Marc Gravell advises using Any() over Count() (the extension method), but not necessarily over Count (the property).

The Count property is always going to be faster, because it's just looking up an int that's stored on the heap. Using LINQ requires a (relatively) expensive object allocation to create the IEnumerator, plus whatever overhead there is in MoveNext (which, if the list is not empty, will needlessly copy the value of the ArrayList's first member to the Current property before returning true).

Now this is all pretty trivial for performance, but the code to do it is more complex, so it should only be used if there is a compelling performance benefit. Since there's actually a trivial performance penalty, we should choose the simpler code. I would therefore implement Any() as return Count > 0;.

However, your example implements the parameterized overload of Any. In that case, your solution of delegating to the parameterized Any extension method seems best. There's no relationship between the parameterized Any extension method and the Count property.
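Putting both points together, one possible shape for the class is sketched below. This is only an illustration under assumptions: the `vpSettings` stand-in with its `Name` property is hypothetical, and the list is passed to the constructor instead of being loaded by the question's `GetSettings()`, purely so the sketch is self-contained.

```csharp
using System;
using System.Collections;
using System.Linq;

// Hypothetical stand-in for the question's vpSettings type.
public class vpSettings
{
    public string Name { get; set; }
}

public class Settings
{
    // May be null, as in the question.
    private readonly ArrayList settingsList;

    // Assumption: the list is injected here rather than read from the DB.
    public Settings(ArrayList items)
    {
        settingsList = items;
    }

    public int ListCount
    {
        get { return settingsList == null ? 0 : settingsList.Count; }
    }

    // Parameterless check: reads the cached count, no enumerator needed.
    public bool Any()
    {
        return ListCount > 0;
    }

    // Filtered check: delegates to LINQ and stops at the first match.
    public bool Any(Func<vpSettings, bool> predicate)
    {
        return settingsList != null
            && settingsList.Cast<vpSettings>().Any(predicate);
    }
}
```

Callers can then write `settings.Any()` for the emptiness check and `settings.Any(s => s.Name == "x")` for a filtered check, without repeating the Cast call.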

phoog
  • In my example the ArrayList is already populated, so the predicate would do in-memory filtering. – AlwaysAProgrammer Apr 29 '14 at 23:03
  • @Yogendra I think I understand. I was a little confused by the fact that your first example concerned the Count property while the second concerned the filtered Any method. Implementing this as you do in your example has the advantage of encapsulating the Cast call, supporting the DRY principle (http://en.wikipedia.org/wiki/Don't_repeat_yourself). – phoog Apr 30 '14 at 17:27
1

ArrayList implements IList, so it does have a Count property. Using that would be faster than Any() if all you care about is checking whether the container is empty.

Marius Bancila