3

I need a data structure with O(log n) retrieval and O(1) lookup by index. So SortedList is the obvious choice.

Now I want to know the best way to make its methods thread-safe? Here is a partial implementation using lock, but I am wondering if there is a more efficient way? For example, if there were an immutable SortedList I could use that, but I can only see SortedDictionary offered as immutable.

public class ConcurrentSortedList<TKey, TValue>
{
    private object _locker = new object();
    private SortedList<TKey, TValue> _list = new SortedList<TKey, TValue>();

    public void Add(TKey key, TValue value)
    {
        lock(_locker)
        {
            _list.Add(key, value);
        }
    }

    public TValue GetByIndex(int index)
    {
        lock(_locker)
        {
            return _list.Values[index];
        }
    }
}

Edit. As pointed out below by @Theodor, getting by index seems useless in a multi-threaded environment, and I agree then that there is no point in trying to make SortedList tread-safe.

James Hirschorn
  • 7,032
  • 5
  • 45
  • 53
  • I think that you right, you need [ImmutableSortedDictionary](https://stackoverflow.com/questions/22546906/concurrent-sortedlist-or-olog-n-concurrent-collection) – kaan Apr 06 '21 at 16:51
  • @kaan But `SortedDictionary` does not have lookup by index. Otherwise, I would have used it. – James Hirschorn Apr 06 '21 at 16:53
  • 2
    Out of curiosity, why do you need the lookup by index? What is your end goal using lookup by index? – Dan Barrett Apr 06 '21 at 17:32
  • @DanBarrett The same reason people need arrays ;) Actually, in my case I just need to extract the maximum element (i.e. the one with the biggest key). – James Hirschorn Apr 06 '21 at 17:35
  • Related to @DanBarrett's question: I see that `SortedList` is not ideal because removing an element is O(n). A modified `SortedDictionary` with the added feature of O(1) lookup of the maximum would be superior. – James Hirschorn Apr 06 '21 at 17:42
  • 1
    Check out SortedSet. It is essentially a red-black tree with O(Log(N)) Add, Remove, and Retrieve operations. In addition, it also keeps track of the Min and Max values. – Dan Barrett Apr 06 '21 at 18:07
  • 1
    There is a reason that no concurrent collection exists that allows getting an item by index. It's because this cannot be part of an atomic operation in a concurrent environment. From the point of view of one thread, the collection may have 1000 elements at one moment, and at the very next moment the collection may be empty. So not only the element with index 0 can be anything, but may not be even there. The concurrent collections are not just wrappers of normal collections. They offer special APIs that guarantee atomicity (like `AddOrUpdate`). – Theodor Zoulias Apr 06 '21 at 18:08
  • @DanBarrett Excellent suggestion, and I can use `ImmutableSortedSet` for thread-safety! – James Hirschorn Apr 06 '21 at 19:49
  • @TheodorZoulias Very interesting. So is the answer to my question no, locking is the only way to get thread-safety? In this case you could put this as the answer to my question. – James Hirschorn Apr 06 '21 at 20:00
  • I think that thread-safety is academic here. There is nothing useful that can be done with a thread-safe wrapper of the `SortedList` IMHO. – Theodor Zoulias Apr 06 '21 at 21:42
  • @TheodorZoulias This is too cryptic to help me understand. I am using this class right now in a multi-threaded context, so I would say it has proved "useful". Do you mean there are other classes which can do it better? – James Hirschorn Apr 07 '21 at 03:03
  • James could you edit the question and add a minimal example of how you use this class? Chances are that there will be correctness issues with your code, or that your code behaves correctly because the pattern of access is not actually concurrent. I may be wrong of course. – Theodor Zoulias Apr 07 '21 at 05:57
  • @TheodorZoulias Did you notice I mentioned this is a partial implementation? The actual class has removal of elements as well. The actual code is much too long, I would need to think up a minimal example. – James Hirschorn Apr 07 '21 at 06:07
  • James you can just post how each thread interacts with the `ConcurrentSortedList` class. We can assume that every unspecified API call delegates to the underlying `SortedList`, with `lock` protection. – Theodor Zoulias Apr 07 '21 at 06:14
  • @TheodorZoulias Not easily. My code subscribes to an external library that triggers concurrently. I did not write the external library. This seems silly though. Are you also saying that nothing useful can be done with a thread-safe array? Or a better example: is `ImmutableSortedSet` useless? Because the *full* implementation can do what a sorted set does. – James Hirschorn Apr 07 '21 at 06:48
  • 1
    James I just can't imagine how the `GetByIndex` method of the `ConcurrentSortedList` class can be useful an a concurrent environment. Assuming that there are other threads that modify concurrently the collection, the `GetByIndex` should not be much different than `GetRandom`. Regarding the `ImmutableSortedSet` class, it's not useless, but it's not concurrent. If you use this class in a concurrent environment, you should synchronize the access to the mutable reference, otherwise you may lose updates. – Theodor Zoulias Apr 07 '21 at 07:11
  • @TheodorZoulias I think I see your point now. But I was actually only getting the last index (maximum element), which is useful. Regarding `ImmutableSortedSet` my understanding is that reads are atomic and don't need synchronizing, but modifications should be synchronized. – James Hirschorn Apr 07 '21 at 08:19
  • 1
    James how do you know the last index of the collection? You can't just assume that the `Count` property you read a moment earlier is still valid. It would be OK if you had an API like `GetLast`, that would both read the `Count` and fetch the `_list.Values[lastIndex]` inside the protected region. But the `GetByIndex` API is meaningless IMHO. – Theodor Zoulias Apr 07 '21 at 08:58
  • Regarding the `ImmutableSortedSet`, nope, you can't just access a non-volatile reference that is modified by multiple threads concurrently, and assume that you got the latest reference. You may get a stale reference instead. Lock-free programming [is for experts](https://stackoverflow.com/questions/2528969/lock-free-multi-threading-is-for-real-threading-experts/2529773#2529773). – Theodor Zoulias Apr 07 '21 at 09:01
  • @TheodorZoulias I have edited the question based on your explanation. As far as `ImmutableSortedSet`, reading is thread-safe according to the official documentation, and as explained here https://stackoverflow.com/a/22547770/1349673 "you're guaranteed to get the most up to date results you possibly could". If you still disagree then perhaps I should create a new question about this. – James Hirschorn Apr 07 '21 at 14:29
  • 1
    James if you think that lock-free multithreading is a good idea, you may want to take a look at [this article](https://docs.microsoft.com/en-us/archive/msdn-magazine/2012/december/csharp-the-csharp-memory-model-in-theory-and-practice) by Igor Ostrovsky. I bet that after reading only the first two paragraphs, the idea of going lock-free will become less enticing. Another good reading is [this](https://stackoverflow.com/a/66490395/11178549) answer by Peter Cordes. You may appreciate the insane amount of knowledge required before being able to judge correctly the consequences of skipping a lock. – Theodor Zoulias Apr 07 '21 at 15:09
  • @TheodorZoulias Thanks for the references. But I am not attempting lock-free, only for the reads. But I will have read of the articles in case it changes my mind :) – James Hirschorn Apr 07 '21 at 15:32
  • 1
    Well, if you are reading mutable shared state without synchronization, you have entered the realm of lock-free multithreading. Personally I am not a fan of this realm, because I want to be in full control of the code I write. So I use `lock`s, or other mechanisms like the `volatile` keyword, the `Volatile` class or the `Interlocked` class, to ensure that the most recent value of my variable is visible to the current thread. It is easy to do, it doesn't require monumental knowledge about memory models, cache coherency and CPU architectures, and allows me to sleep peacefully at nights. – Theodor Zoulias Apr 07 '21 at 15:51
  • But I thought `ImmutableSortedSet` is *immutable* shared state? – James Hirschorn Apr 07 '21 at 15:55
  • 1
    The immutable collections are immutable, but in order to do anything useful with them you must store them in mutable references. You don't change the collection, but you change the reference so that it points to a different collection. Updating references is atomic, but if you don't protect with a `lock` the region that creates the new (updated) collection and replaces the reference, you may lose updates. – Theodor Zoulias Apr 07 '21 at 16:01

0 Answers0