1

I need to add a bunch of items to a data structure and then later access ALL of the items within it in a random order. How can I do this?

To be more specific, I am currently adding URLs to a List<string> object. They are added in a way such that adjacent URLs are likely to be on the same server. When I access the List using a Parallel.ForEach statement, it just returns the items in the order that I added them. Normally this is okay, but when I am making web requests in parallel, this tends to overwhelm some servers and leads to timeouts. What data structure can I use that will return items in a more random way when I run a Parallel.ForEach statement on the object (i.e., not in the order that I added them)?

Blorgbeard
  • 101,031
  • 48
  • 228
  • 272
Doug
  • 5,116
  • 10
  • 33
  • 42
  • 3
    Maybe just [shuffle](http://stackoverflow.com/questions/273313/randomize-a-listt-in-c-sharp) the list once you've added all your items? – Blorgbeard May 13 '13 at 23:33
  • 5
    Is random really what you want? Or would you rather have a list of servers, each with a list of URLs, then run Parallel.ForEach on the servers, but not on the URLs? – Eugen Rieck May 13 '13 at 23:34
  • 1
    I have edited your title. Please see, "[Should questions include “tags” in their titles?](http://meta.stackexchange.com/questions/19190/)", where the consensus is "no, they should not". – John Saunders May 13 '13 at 23:34

2 Answers2

1

ORIGINAL SOLUTION

Fisher–Yates shuffle

public static void Shuffle<T>(this IList<T> list)  
{  
    Random rng = new Random();  
    int n = list.Count;  
    while (n > 1) {  
        n--;  
        int k = rng.Next(n + 1);  
        T value = list[k];  
        list[k] = list[n];  
        list[n] = value;  
    }  
}

List<Product> products = GetProducts();
products.Shuffle();
Community
  • 1
  • 1
Jonas T
  • 2,989
  • 4
  • 32
  • 43
  • 9
    Dude, if you're going to copy code from another answer verbatim, *at least* [link to it](http://stackoverflow.com/a/1262619/369)! – Blorgbeard May 13 '13 at 23:40
1

I think shuffling is a better answer, but an answer to your specific question would be a Hashtable. You would add your string url as the key and null for value. The Keys property will return the strings in the order of where they happened to be placed in the hash table, which will be fairly random since the strings' hashcodes and collision handling will result in the order not well correlated to the sorted order of the string values themselves.

Dictionary and HashSet won't work the same way. Their internal implementation ends up returning items in the order they were added.

Although this is how Hashtable actually works, you'd be counting on an internal implementation detail, which has its potential perils. That's why I prefer just shuffling.

  • So, HashSet is implemented by a LinkedHashSet? That is good to know, and also does not seem to be part of the documentation. – ILMTitan May 14 '13 at 00:18
  • @ILMTitan - Dictionary & HashSet are implemented using closed addressing - hash buckets that point to single linked lists. All the linked lists are in one array, and the elements of that array are served out in the order they're needed. The enumerators enumerate over that array. So the net result is that you get stuff back in the order you added them (assuming you didn't delete any). – hatchet - done with SOverflow May 14 '13 at 00:40