1
class Program
    {
        static Dictionary<string, int> Dictionary = new Dictionary<string, int>();

        static void Main(string[] args)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();

            Thread[] threads = new Thread[500];

            for(int i = 0; i < threads.Length; i++)
            {
                threads[i] = new Thread(InsertAlphabet);
                threads[i].Start();
            }
            for (int i = 0; i < threads.Length; i++)
            {
                threads[i].Join();
            }

            Console.WriteLine(Dictionary.Count);

            Console.WriteLine(stopwatch.ElapsedMilliseconds);

            foreach (KeyValuePair<string,int> kvp in Dictionary)
            {
                Console.WriteLine(kvp.Key + " " + kvp.Value);
            }

            stopwatch.Stop();
            Console.ReadLine();

        }

        private static void InsertAlphabet()
        {
            string[] alphabetArray = { "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z" };
            foreach (var alphabet in alphabetArray)
            {
                Add(alphabet);
            }
        }

        public static void Add(string bar)
        {
            lock (Dictionary)
            {
                if (!Dictionary.ContainsKey(bar))
                {
                    Dictionary.Add(bar, 1);
                }
                else
                {
                    Dictionary[bar] += 1;
                }
            }
        }
    }

I have created this simple console application to make sure that the data inserted into the dictionary is accurate.

The time that I took to insert the alphabets as key and the count as the value was approximately 3 seconds for 500 threads trying to insert at the same time.

Is there a way I can improve the performance of this by having some kind of approximation involved (data doesn't need to be 100% accurate. Allowed accuracy 95%).

Also are there suggestions on how can I improve the increment of count in the dictionary.

Eric J.
  • 147,927
  • 63
  • 340
  • 553
StackOverflowVeryHelpful
  • 2,347
  • 8
  • 34
  • 46
  • 4
    Having 500 threads isn't doing you any favors for performance... – Ron Beyer Jul 16 '15 at 19:36
  • Sure i agree. Was just trying and obviously it depends on the cores that my machine would have. Tried it with one thread and with a ConcurrentDictionary implementation and it takes approx 8 ms – StackOverflowVeryHelpful Jul 16 '15 at 19:41
  • 1
    It doesn't even depend on the number of cores, since you are locking on a single resource I don't think you are getting any benefit of using more than one thread here. – Ron Beyer Jul 16 '15 at 19:45
  • 1
    @RonBeyer: That is mostly true (very true for the presented code, depends if there is much computation or IO in a "real" case). However, I presented a solution that should lock at the key level. – Eric J. Jul 16 '15 at 19:56
  • @EricJ. yes, considering the computation or IO is inside the lock it will block all the other threads. In this case most threads suffer from resource starvation and the best course of action is to let threads behave as atomic work units which `ConcurrentDictionary` is pretty elegant at handling. In your case the number of "optimal" threads is either bounded by the number of keys or the number of processor cores I think. – Ron Beyer Jul 16 '15 at 20:00

1 Answers1

3

I believe you can accomplish this safely using the ConcurrentDictionary overload of AddOrUpdate that takes a delegate to generate the new value.

The delegate receives the current value, if there is one. You can provide an implementation of the delegate that adds the incremental value to the existing value. If there was not already a value present, the parameter supplied to AddOrUpdate will be directly assigned as the value for that key.

Since with this solution ConcurrentDictionary internally locks on the key value being updated until your delegate returns and the internal value is updated, multi-threaded performance should be far better than your current locking on the entire dictionary structure.

Eric J.
  • 147,927
  • 63
  • 340
  • 553
  • 1
    Thanks for this. Replaced the Implementation of Add method in the code in the question with just this Dictionary.AddOrUpdate(bar, 1, (s, i) => i + 1); The performance to insert is still around 3 seconds. Which i dont know if i can improve on. – StackOverflowVeryHelpful Jul 16 '15 at 19:59
  • @StackOverflowVeryHelpful hopefully you've reduced the number of threads. Considering you only have 24 keys, I would suggest the "optimal" number of threads is somewhere between 4 and 24. – Ron Beyer Jul 16 '15 at 20:09
  • 1
    If you only have 24 keys and 500 threads competing to update those keys, you will still have lock contention. The threads process the keys in the same order, so you will tend to have everyone contending for "A", then everyone contending for "B", etc. – Eric J. Jul 16 '15 at 20:22