1

I have the following code in C#: (_StoreQueue is a ConcurrentQueue)

        var S = _StoreQueue.FirstOrDefault(_ => _.TimeStamp == T);
        if (S == null)
        {
            lock (_QueueLock)
            {
                // try again
                S = _StoreQueue.FirstOrDefault(_ => _.TimeStamp == T);
                if (S == null)
                {
                    S = new Store(T);
                    _StoreQueue.Enqueue(S);
                }
            }
        }

The system is collecting data in real time (fairly high frequency, around 300-400 calls / second) and puts it in bins (Store objects) that represent a 5 second interval. These bins are in a queue as they get written and the queue gets emptied as data is processed and written.

So, when data is arriving, a check is done to see if there is a bin for that timestamp (rounded by 5 seconds), if not, one is created.

Since this is quite heavily multi-threaded, the system goes with the following logic:

If there is a bin, it is used to put data. If there is no bin, a lock gets initiated and within that lock, the check is done again to make sure it wasn't created by another thread in the meantime. and if there is still no bin, one gets created.

With this system, the lock is roughly used once every 2k calls

I am trying to see if there is a way to remove the lock, but it is mostly because I'm thinking there has to be a better solution that the double check.

An alternative I have been thinking about is to create empty bins ahead of time and that would entirely remove the need for any locks, but the search for the right bin would become slower as it would have to scan the list pre-built bins to find the proper one.

Thomas
  • 10,933
  • 14
  • 65
  • 136
  • 1
    `System.Collections.Concurrent.ConcurrentQueue` doesn't work? There is already a concurrent queue data structure. – Daniel Mann Jun 27 '19 at 17:46
  • It is a concurrent queue, but the problem is that if 2 threads try to write at the same time, one may be creating a bin while the other one's bin check fails (and it will create a bin too); I just edited the question to state that this is a ConcurrentQueue because that wasn't clear. – Thomas Jun 27 '19 at 17:47
  • You could use a ConcurrentQueue, but not aggregating the data in Store before queuing them. If you let the consumer of the queue data aggregate the data into 5 second data packets, your locking issue would go away. (Assuming the amount of data you collect is not causing undue memory pressure) –  Jun 27 '19 at 17:52
  • 2
    Why are you even using a `ConcurrentQueue` ? a `ConcurrentDictionary` would literally fix your problem and only use 1 line. The key would be the value type of the `Store.TimeStamp` and the value would be the `Store` object – Franck Jun 27 '19 at 17:55
  • the solution from elgonzo would work, but it requires quite a bit of change everywhere. Wouldn't the dictionary cause a similar problem: one thread doesn't find a key, creates the store and, before it's created and added to the dictionary, another thread doesn't find the key and make its own as well? – Thomas Jun 27 '19 at 18:00
  • @Thomas If you use the [GetOrAdd](https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentdictionary-2.getoradd?view=netframework-4.8) it should work just fine – Franck Jun 27 '19 at 18:04
  • @Franck, I didn't know that method, but that will do! thanks a lot! – Thomas Jun 27 '19 at 18:05
  • I second [ConcurrentDictionary.GetOrAdd](https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentdictionary-2.getoradd) method. In case it is mandatory that no `Store` object should be created without been inserted in the dictionary, you can use `Lazy` as values ([Is ConcurrentDictionary.GetOrAdd() guaranteed to invoke valueFactoryMethod only once per key?](https://stackoverflow.com/questions/31637394/is-concurrentdictionary-getoradd-guaranteed-to-invoke-valuefactorymethod-only)). – Theodor Zoulias Jun 27 '19 at 18:07
  • @Thomas i just added a very simple sample answer that covers the basic call – Franck Jun 27 '19 at 18:10
  • Btw @elgonzo has a point because the `ConcurrentQueue` class is [lock free](https://devblogs.microsoft.com/pfxteam/faq-are-all-of-the-new-concurrent-collections-lock-free). If you have dedicated threads for producing data and consuming data, then using a [`BlockingCollection`](https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.blockingcollection-1) (backed up by a `ConcurrentQueue` by default) may be both an efficient and simple to implement and maintain solution. – Theodor Zoulias Jun 27 '19 at 18:28

1 Answers1

3

Using a ConcurrentDictionary can fix the issue you are having. Here i assumed a type double for your TimeStamp property but it can be anything, as long as you make the ConcurrentDictionary key match the type.

class Program
{
    ConcurrentDictionary<double, Store> _StoreQueue = new ConcurrentDictionary<double, Store>();

    static void Main(string[] args)
    {
        var T = 17d;

        // try to add if not exit the store with 17
        _StoreQueue.GetOrAdd(T, new Store(T));
    }
    public class Store
    {
        public double TimeStamp { get; set; }
        public Store(double timeStamp)
        {
            TimeStamp = timeStamp;
        }
    }
}
Franck
  • 4,438
  • 1
  • 28
  • 55
  • One detail worth mentioning is that the `new Store(T)` part (or even a delegate factory method if you use that overflow) can potentially be called twice in a row when two threads are competing. One way to fix it is to use `Lazy` instead of `Store` (or, alternatively use double-check locking), similar to [this thread](https://stackoverflow.com/a/12611341/69809). This, of course, is only a problem if instantiating the object is expensive. – vgru Oct 12 '21 at 19:29