14

Is there a lock-free & thread-safe data structure that implements IList?

Naturally by lock-free I mean an implementation that makes no use of locking primitives in .NET but rather uses interlocked operations / atomic operations to achieve thread safety... There isn't one, apparently under the concurrent data structures...

Has anyone seen one floating around?

I've seen a java one implemented in amino-cbbs, called LockFreeVector but nothing for .NET so far. Any ideas?

damageboy
  • 2,097
  • 19
  • 34
  • 5
    I presume you mean "lock-free" and thread-safe, since `List` is quite lock-free. You should also clarify what you mean by "lock-free". – John Saunders Feb 21 '11 at 19:47
  • 1
    @damageboy Another note: They (amino) are implementing a LINKED list, not a list. A LinkedList in C# doesn't implement IList/IList. They have a LockFreeVector... But I don't think it's "fully" lock free. – xanatos Feb 21 '11 at 22:00
  • 1
    How "full" an implementation are you looking for? I highly doubt you'll be able to find, e.g., a type that supports a random `Insert` without locking (unless you allow *spinning*, I guess, since that isn't really the same as "locking"). But then, what do I know? – Dan Tao Feb 21 '11 at 23:35
  • @xantos: I was referring to the LockFreeVector of amino: http://amino-cbbs.sourceforge.net/java_apidocs/org/amino/ds/lockfree/LockFreeVector.html – damageboy Feb 23 '11 at 19:17
  • @damageboy: Note that the lock-free structure described in the paper on which `LockFreeVector` is based does *not* have a random `Insert` operation (which is part of the `IList` interface). – Dan Tao Feb 23 '11 at 19:26
  • @Dan Tao: You are absolutely correct, if you read my comment below where I replied to Valentin Kuzub you'll see that it perfectly fits into the only use-case where a LockFreeVector still makes sense at all... – damageboy Feb 24 '11 at 14:35
  • "Lock free" means at least one thread makes progress at one time, (for example, the other threads would have to loop and retry compare/swap operation). "Wait free" means all threads always make progress. A lock free version will get you most of the speed increase you're going to get. It is exceedingly difficult to come up with *wait* free algorithms that don't have severe restrictions in one sense or another. – doug65536 Jan 17 '13 at 00:25

3 Answers3

7

ConcurrentList implementing IList might be missing in Collections.Concurrent namespace because of whole Parallel.For and Parallel.ForEach class-methods. One can say that they can be used to handle any list as Concurrent, in order to quickly enumerate through the list and perform actions on its items.

Maybe by not providing ConcurrentList they meant or thought that if Parralel.For cannot help one would require to use not a IList but some other kind of collection like a stack or queue or even Bag or even Dictionary

I would agree with this design, because having to deal with indexable collection under multi thread conditions sounds like very error prone and bad design. Whats the purpose of knowing index of an item if collection can be modified at any time and index would be invalidated, in such circumstances when there are multiple readers - writers its pretty clear to me that Queue or Stack will be commonly best fitting collections, or Bag can be good too. Dictionary can be used also because its indexes are not invalidated by adding items to collection, and if you need parallel access to List you got your Parralel.For methods

What I find really weird - http://msdn.microsoft.com/en-us/library/dd381935.aspx here we can read about ConcurrentLinkedList class but I cannot find it in System.dll , only Bag and BlockingCollection are there.

I would also say that there is like 95% chance at least that either of two is true about your problem

  1. Parallel class methods are better than ConcurrentList
  2. Existing Concurrent collections are going to be better collections than ConcurrentList

I would also say that by not providing ConcurrentList they have saved developers who would mistakenly choose ConcurrentList to solve their problems from making many errors and saved them a lot of time forcing developers to use existing Concurrent collections.

Valentin Kuzub
  • 11,703
  • 7
  • 56
  • 93
  • the page you site says that the ConcurrentLinkedList is to be removed before the release of VS 2010 and not to use it. I agree with everything you wrote - nice job +1. – phillip Feb 22 '11 at 12:38
  • 1
    While there are many points where I definitely do agree with you, I can still see a need for a thread-safe IList much like the LockFreeVector that is provided in amino-cbbs... The main use-case, which is coincidentally the one I'm after, is for when you have an append-only modification of a List from one/more thread(s) while a few other threads may need to go over the list in read-only fashion up to the last-known .Count value when they started the enumeration / processing... IOW, I'm looking for read-only performance that is equivalent to an array, while still being able to append – damageboy Feb 23 '11 at 19:21
7

Well, I couldn't find such a class anywhere; so I gave it a shot.

The source code for my ConcurrentList<T> class is available on GitHub.

It is lock-free, thread-safe (I think, based on my unit tests), and implements IList<T>.

It does not support Insert, RemoveAt/Remove, or Clear.

I was pleased to discover that my implementation (which I came up with independently) is very similar to that of a data structure published by some well-respected minds within the world of software.

For a fairly brief discussion of the implementation itself, see my recent blog post about it.

At the moment, it is not documented at all, which is kind of bad considering how "tricky" some of the code is :(

And by all means, rip me a new one if you take a look and find bugs or other issues.

Anyway, it might be worth your time to check it out. If you do, let me know what you think.

Dan Tao
  • 125,917
  • 54
  • 300
  • 447
  • +1 for going ahead and implementing it on your own. I'm also in the process of porting the java version into c#... I read your blog post, and I'm still going to run your tests on my machine, but the main benefit is for multiple reader threads with a single writer/appender thread. That is where you should expect to see a performance boost as the plain dumb "use lock()" solution would still lock for every read... – damageboy Feb 26 '11 at 19:52
  • BTW: The amino-cbbs version of LockFreeVector is also based on the same paper you referenced: http://amino-cbbs.svn.sourceforge.net/viewvc/amino-cbbs/trunk/amino/java/src/main/java/org/amino/ds/lockfree/LockFreeVector.java?revision=492&view=markup – damageboy Feb 26 '11 at 20:05
  • I like you implementation a lot. I've made some improvements to it, which I will upload to github so you could also take a look at them as well. most of the changes I've made are around having more logical initial array sizes, e.g. 8,16,32... instead of 1,2,4 (a bit wasteful IMO), and around improving your LOG2 function into something completely horrific that should run in less cycles... – damageboy Feb 26 '11 at 23:57
  • Is there a volatile keyword in C#? the target of a CAS needs to be volatile in C, because it's a run-time decision whether or not the CAS succeeds. –  Mar 03 '11 at 16:27
  • Similarly, how do you ensure alignment? CAS on x86/x64 requires word aligned data; non-word aligned causes an exception. –  Mar 03 '11 at 16:29
  • @Blank: In .NET, CAS is achieved using the `Interlocked.CompareExchange` method, which only provides overloads for types on which it is legal: 32- and 64-bit primitives (`int`, `long`, `double`, etc.) and reference types (which are either 32- or 64-bit references). – Dan Tao Mar 03 '11 at 16:35
  • @Blank: There actually *is* a `volatile` keyword in C#, but it doesn't mean the same thing as in C. – Dan Tao Mar 03 '11 at 16:35
  • @Blank: regarding volatile/compare-exchange: If you take a look @: https://github.com/damageboy/ConcurrentList/blob/master/ConcurrentList/ConcurrentList.cs (which is my version of @Dan Tao's code, you will see that I've added comments with the actual x64 assembly created for the use of Interlocked.Increment / Interlocked.CompareExchange, it is clear that MS directly generates cmpxchg and lock xadd instructions that take a memory address. If you read the Intel literature you will see that intel promises these instructions to my synchronous across the memory bus and other CPU caches. – damageboy Mar 04 '11 at 15:49
  • @Blank: regarding alginment, this is actually a very good point. On modern processors you don't have too many alignment requirements. (I answered this elsewhere: http://stackoverflow.com/questions/881820/interlockedexchange-and-memory-alignment/5178935#5178935). The .NET framework seems to ensure (still looking for official info on this) pointer alignment. This means that on x86 you'll get 4 byte alignment and on x64 8 byte alignment, The code in question (the original one by Dan Tao, and my updated version) both require up to pointer size alignment. – damageboy Mar 04 '11 at 16:03
  • @damageboy: Have you read [my more recent blog post](http://philosopherdeveloper.wordpress.com/2011/02/28/sometimes-the-best-strategy-is-to-retreat/) on this subject, or the [`SynchronizedList`](https://github.com/dtao/PhilosopherDeveloper/blob/master/ConcurrentList/ConcurrentList/SynchronizedList.cs) class I added to GitHub? We should talk on Skype some time! – Dan Tao Mar 04 '11 at 16:04
  • @damageboy: re CAS - I'm not sure we understand each other. My point is this; the compiler, when faced with CAS, cannot know the result of the exchange, because it is information only available at run-time. However, the compiler *does not know this unless you mark the destination operand with volatile*, giving the potential for optimizer induced bugs. –  Mar 04 '11 at 17:41
  • @Dan, @damageboy: Concurrent programming really isn't my forté so I might be wrong on this, but it looks to me like the indexer get/set and `CopyTo` methods in your implementations are susceptible to torn reads/writes if `T` is a large `struct`. They're reading/writing the array elements without any kind of synchronisation, which means no guarantee of atomicity if `T` is large. – LukeH Mar 04 '11 at 22:39
  • @LukeH: You're correct about atomicity not being guaranteed in for anything basically larger than a 128-bits in the Intel architecture, and even 64-bits in case of .NET, as the JIT does not generate cmpxchg16b instructions AFAIK. If someone were to store a struct in the ConcurrentList/SynchronizedList that is beyond 8 bytes, then he/she are basically b0rked. More over that: if, for some-reason, the memory is not aligned to 8 bytes, i.e. (&(_array[x]) & (8-1) != 0) || (&(_array[x][y]) & (8-1) != 0), then even those writes won't be always atomic. I'll update the code to test for this. – damageboy Mar 05 '11 at 09:01
  • @LukeH: You're definitely right. I struggled with two different possible approaches to this: 1. Implement a factory method which *only* constructs instances of `ConcurrentList` for `T` where there is a corresponding overload for `Interlocked.Exchange` (so basically, the primitive types plus all reference types); 2. Instead of a `T[][]` array internally, use a `Slot[][]` where `Slot` is a reference type and can therefore be atomically assigned using `Interlocked.Exchange`. In the end I basically went with the lazy solution and did neither ;) – Dan Tao Mar 06 '11 at 18:51
  • @damageboy: if for single-word CAS memory is not aligned to 8 bytes, the CAS instruction will fault. –  Mar 08 '11 at 15:29
  • @Blank: The `Interlocked` class in .NET provides a CAS-based `CompareExchange` method with overloads for `float`, `double`, `int`, `long`, `IntPtr`, and all reference types, all of which AFAIK will not be susceptible to alignment issues in the CLR. There is no overload that accepts user-defined value types. – Dan Tao Mar 09 '11 at 01:22
  • @Blank: Can you please find a source for your alignment claims? The only thing I'm finding that is up to date in intel software architecture is http://stackoverflow.com/questions/881820/interlockedexchange-and-memory-alignment/5178935#5178935 (I've already answered this on some other thread) – damageboy Mar 09 '11 at 20:32
  • Source is me. I've written a lock-free library in C - www.liblfds.org. If the CAS operands are not aligned on the system pointer length, the binary crashes. This is mainly a problem when the pointers in question are in structures (and so end up with unaligned addresses). –  Mar 10 '11 at 18:19
  • If a method is lock-free, that means that if one thread starts invoking a method and gets waylaid during its execution that will not affect other threads. I haven't looked at your particular implementation, but many methods that use "eventual" consistency may never achieve a consistent state if the execution of an update method gets waylaid. An alternative approach, if there is some value which will never legitimately appear in the list (e.g. items are not allowed to be `null`), is to have `Count` hold the largest list slot which is known to have been assigned, and have `Add`... – supercat Nov 19 '12 at 19:51
  • ...try to `CompareExchange` the new element into that slot. If the operation fails, try the next higher slot, etc. Then use a `CompareExchange` loop to update `_Count`, but give up if `_Count` gets to be higher than the value it's trying to store. – supercat Nov 19 '12 at 19:53
0

Think about how random access (implied by IList<T>) could work in a multithreaded environment. You can't actually really do anything without it being immutable since adding and removing, even if they are atomic, don't prevent thread 1 from removing the item at the index that thread 2 was just about to retrieve. That's why the SyncRoot stuff is gone in .NET 2.0+

Mark Sowul
  • 10,244
  • 1
  • 45
  • 51
  • I have though about and I have answered this point when Valentin Kuzub raised it in his comment. The only useful scenario is for a .Add() only IList, where Remove/Insert are not applicable/allowed. This can and DOES in fact improve reader performance greatly while still allowing thread-safe append operations. – damageboy Feb 26 '11 at 20:00