2

I was under the impression that it was quicker to look up items in a Dictionary than a List, the follow code seems to suggest otherwise:

Dictionary : 66 ticks

List: 32 ticks

Im assuming ive screwed up somewhere?

static void Main(string[] args)
    {
        // Speed test.
        Dictionary<string, int> d = new Dictionary<string, int>()
        {
            {"P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS", 2},
            {"P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS", 1},
            {"P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
            {"P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS", -1},
            {"P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
            {"P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
            {"P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
            {"P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS", 2} 
        };

        List<string> l = new List<string>();
            l.Add("P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS");
            l.Add("P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS");
            l.Add("P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS");
            l.Add("P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS");


        Stopwatch sw = new Stopwatch();

        string temp = "P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS";

        bool inDictionary = false;

        sw.Start();
        if (d.ContainsKey(temp))
        {
            sw.Stop();
            inDictionary = true;
        }
        else sw.Reset();

        Console.WriteLine(sw.ElapsedTicks.ToString());
        Console.WriteLine(inDictionary.ToString());


        bool inList = false;

        sw.Reset();
        sw.Start();
        if (l.Contains(temp))
        {
            sw.Stop();
            inList = true;
        }
        else sw.Reset();

        Console.WriteLine(sw.ElapsedTicks.ToString());
        Console.WriteLine(inList.ToString());

        Console.ReadLine();
    }

EDIT

Modification to Matthew Watson's code.

Hans Rudel
  • 3,433
  • 5
  • 39
  • 62
  • 1
    Are you running this test in the debugger or the executable? What about the mode, debug/release? – DGibbs Aug 15 '13 at 08:09
  • Yeah its being run in debugger (debug). .net 4 client profile – Hans Rudel Aug 15 '13 at 08:10
  • 5
    [Bad idea](http://tech.pro/blog/1293/c-performance-benchmark-mistakes-part-one), you should also be running it in release mode. – DGibbs Aug 15 '13 at 08:11
  • If I understand the math correctly, the difference should be most noticeable for big lists / dictionaries. I would suggest that you build a huge list ( ca. 1000 - 10.000 entries) and try again. Also., use release mode. – Christian Sauer Aug 15 '13 at 08:11
  • 4
    One execution and case is way to little to provide valid data. You should run the search 10k times and have it search for different matches. Your matches are all the same length and contain much the same data, this may reduce some of the dictionary benefits. – MrFox Aug 15 '13 at 08:12
  • I suspect that the problem with the dictionary is the poor distribution of the keys. – Alessandro D'Andria Aug 15 '13 at 08:15
  • @DGibbs Thanks for the link. I compiled to release and used ctrl-F5 but dictionary was still x2 slower. Regarding the other comments about running the test on a larger amount of data, completely agree, i was just surprised it didnt kick the lists ass straight away. MrFox, interesting comment, i will have a look into that. – Hans Rudel Aug 15 '13 at 08:16
  • @HansRudel - why are you surprised? List obviously much simple structure - so for single element it must be faster, and it is generally know that for large amount of date dictionary gives you result in O(1) time compared to O(n) for list. So clearly there is a range where list is faster in lookup (likely for case of 0-10 items). – Alexei Levenkov Aug 15 '13 at 08:20
  • 1
    @HansRudel I've just run the test in release and not in the debugger over 1000000 iterations and dictionary is definitely faster: `Dict: 97628.10ns vs List: 10396.84ns` though the difference is not much. I suspect this is due to the relatively small sizes of each collection. – DGibbs Aug 15 '13 at 08:25
  • The dictionary is about 9.5 times slower as you compare the numbers – Jeroen van Langen Aug 15 '13 at 08:29
  • @AlexeiLevenkov because of the following http://www.dotnetperls.com/dictionary-time DGibbs, thanks for the update. I'll need to have a think then regarding how mnay elements im likely to search through and weather list or dictionary is going to be better. – Hans Rudel Aug 15 '13 at 08:43

3 Answers3

4

Here's a proper test:

Dictionary<string, int> d = new Dictionary<string, int>()
{
    {"P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS", 2},
    {"P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS", 1},
    {"P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
    {"P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS", -1},
    {"P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS", 0},
    {"P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS", 0},
    {"P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS", 0},
    {"P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS", 2} 
};

List<string> l = new List<string>
{
    "P1I1-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I2-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I3-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I4-1MS    P2I1-1MS    3I-1MS    4I-1MS", 
    "P1I5-1MS    P2I1-1MS    3I-1MS    4I-1MS",
    "P1I1-1MS    P2I2-1MS    3I-1MS    4I-1MS", 
    "P1I2-1MS    P2I2-1MS    3I-1MS    4I-1MS",
    "P1I3-1MS    P2I2-1MS    3I-1MS    4I-1MS",
    "P1I4-1MS    P2I2-1MS    3I-1MS    4I-1MS",
    "P1I5-1MS    P2I2-1MS    3I-1MS    4I-1MS", 
    "P1I1-1MS    P2I3-1MS    3I-1MS    4I-1MS", 
    "P1I2-1MS    P2I3-1MS    3I-1MS    4I-1MS",
    "P1I3-1MS    P2I3-1MS    3I-1MS    4I-1MS", 
    "P1I4-1MS    P2I3-1MS    3I-1MS    4I-1MS",
    "P1I5-1MS    P2I3-1MS    3I-1MS    4I-1MS", 
    "P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS"
};

int trials = 4;
int iters  = 1000000;

Stopwatch sw = new Stopwatch();

string target = "P1I1-1MS    P2I4-1MS    3I-1MS    4I-1MS";

for (int trial = 0; trial < trials; ++trial)
{
    sw.Restart();

    for (int i = 0; i < iters; ++i)
        d.ContainsKey(target);

    sw.Stop();
    Console.WriteLine("Dictionary took " + sw.Elapsed);
    sw.Restart();

    for (int i = 0; i < iters; ++i)
        l.Contains(target);

    sw.Stop();
    Console.WriteLine("List took " + sw.Elapsed);
}

Run a RELEASE build of this OUTSIDE any debugger.

My results:

Dictionary took 00:00:00.0587588
List took 00:00:00.2018361
Dictionary took 00:00:00.0578586
List took 00:00:00.2003057
Dictionary took 00:00:00.0611053
List took 00:00:00.2033325
Dictionary took 00:00:00.0583175
List took 00:00:00.2056591

Dictionary is clearly faster, even with so few entries.

With more entries, Dictionary will be even faster compared to list.

The lookup time using a Dictionary is O(1), whereas a List takes O(N). That will of course make a HUGE difference for large values of N.

Matthew Watson
  • 104,400
  • 10
  • 158
  • 276
  • Hi, thanks for the code. Ive run it and get similar times to you. Ive just modified it so that in each For-Loop, it generates chooses a new target via target = l[random.Next(l.Count)]; but the times are now ~ the same. Any idea why? See edit to original question for picture. – Hans Rudel Aug 15 '13 at 08:38
  • 1
    @HansRudel Well, firstly you neglected to add braces `{}` so the loops are both just accessing the list (and doing nothing with the dictionary inside the loop). Secondly, if you *do* add the random lookup inside the loop, you will also have added to the loop the amount of time it takes to index a `List<>`, which will skew the results. Furthermore, as soon as you add a random component to a list you destroy any consistency - the targets randomly chosen for one of the loops may be much "worse" than the ones randomly chosen for the other loop. – Matthew Watson Aug 15 '13 at 08:48
  • "you neglected to add braces {}"... D'oh. – Hans Rudel Aug 15 '13 at 09:00
  • Updated the times. Regarding your 2nd point, id have thought that since there is a sufficiently large number of iterations, this problem would be negligible. Anyway, thanks for your help, i appreciate it. – Hans Rudel Aug 15 '13 at 09:05
2

Dictionary is faster than List in assimptotic meaning of the word: O(1) < O(n); it means that there's such a size X from which on Dictionary starts being faster than List. E.g. if Dictionary/List contains, say, 10000 or more items, Dictionary is always faster (if there're fewer items to store, say, 5, List can well be faster). There's another issue with Dictionary: it requires GetHashCode() method; if GetHashCode() is bad implemented Dictionary can be deadly slow.

Dmitry Bychenko
  • 180,369
  • 20
  • 160
  • 215
  • Am i correct in assuming then that "P1I4-1MS P2I2-1MS 3I-1MS 4I-1MS" is a poor key? Each of the objects im dealing with contains x inputs and the list of these inputs = 1 permutation => being unique. So i thought that might be a good way to store which object i had already delt with as i had read about hash conflicts. – Hans Rudel Aug 15 '13 at 08:22
  • 1
    No, default GetHashCode() implementations (in your case - String.GetHashCode()) are quite good; it seems that you test Dictionary/List with too few items. Try, say, add 100000 different keys/items into List and Dictionary... – Dmitry Bychenko Aug 15 '13 at 08:39
2

This is due to the law of large numbers. In short

According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed.

Another constrain is the Big-O notation , which is in fact useless on a small scale. For example you can say that O(1) ~ O(N) ~ O(n!) for a given n that is less than some small number.

Running a good experiment requires meeting some pretty strict conditions like:

  • Make sure algorithms are comparable
  • Have large number of iterations in the algorithm
  • Run on the same hardware
  • Run in release mode with max optimization
  • No Debugger or Performance analysis tools should be attached
  • Conduct many experiments and calculate average + standard deviation
  • ...
Community
  • 1
  • 1
oleksii
  • 35,458
  • 16
  • 93
  • 163