2

I have a 2D array in memory. I have multiple threads reading and writing single elements of the array spontaneously, arbitrarily, and concurrently.

What is the fastest way or best practice to construct my memory access code? I don't like the idea of locking because it blocks other threads.

Data integrity is actually not that important, but it should be (mostly) consistent. My code can handle a few memory errors.

It needs to be really, really fast!

Thanks for feedback.

extracrispy
  • 669
  • 5
  • 16
  • 1
    The question is far too vague to answer. – Eric Lippert Oct 14 '11 at 15:26
  • What are the elements? Let me guess - they don't fit into a native type :(( – Martin James Oct 14 '11 at 15:27
  • @Eric, int. -Everyone, Thanks for the info. I have some reading to do. – extracrispy Oct 14 '11 at 15:43
  • I know this is for C#, but this question in C++ posed some interesting aspects of x86-class processors and atomic reads and writes to integer types on natural boundaries http://stackoverflow.com/questions/5002046/atomicity-in-c-myth-or-reality. These certainly don't hold for CLR memory, but to me it begs the question: if speed is your goal, why use C#? – Damon8or Oct 14 '11 at 16:08

4 Answers

2

If data integrity is not important, you can just access the data without caring about multithreading at all.

No one can predict the result, though.

I wouldn't call this approach "best practice", however. IMHO best practice is caring about multithreading and protecting the data with appropriately-grained mutexes. My opinion is that every application should first be correct, and only then fast. Inconsistent results are simply wrong, no matter how quickly they arrive.
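To illustrate "appropriately-grained" protection: instead of one global lock, you can hold one lock per row, so threads touching different rows never block each other. The question is about C#, but the idea maps directly; here is a minimal sketch in Java (`synchronized` playing the role of C#'s `lock`), with the class and its granularity being illustrative assumptions, not the only reasonable choice:

```java
public class RowLockedGrid {
    private final int[][] cells;
    private final Object[] rowLocks; // one lock per row: finer-grained than a single global lock

    public RowLockedGrid(int rows, int cols) {
        cells = new int[rows][cols];
        rowLocks = new Object[rows];
        for (int i = 0; i < rows; i++) {
            rowLocks[i] = new Object();
        }
    }

    public int get(int r, int c) {
        // Only contends with other threads touching the same row
        synchronized (rowLocks[r]) {
            return cells[r][c];
        }
    }

    public void set(int r, int c, int value) {
        synchronized (rowLocks[r]) {
            cells[r][c] = value;
        }
    }
}
```

Per-cell locks would be even finer-grained but cost one lock object per element; per-row is a common middle ground.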

Vlad
  • 35,022
  • 6
  • 77
  • 199
  • How bad would the errors be? If a thread is reading the array element while it is being written to, I will get the new value right? If so, this is good, at least for my algorithm. – extracrispy Oct 14 '11 at 15:34
  • 1
    @extracrispy: what is "right"? If you are changing the whole row, and the change is interrupted by a reading thread, which reads half-changed-half-outdated data, is this right or wrong? – Vlad Oct 14 '11 at 15:35
  • @extracrispy: the thing is that values changed in one thread may only become visible to other threads later, when the processor cache is flushed. You can force the flush explicitly by using memory barriers, or by applying mutexes (acquiring them includes raising memory barriers). But "consistency" of a single item doesn't mean overall consistency. – Vlad Oct 14 '11 at 15:38
  • @extracrispy "That's what I wanted. lol."... I'm not sure if you got it: memory barriers are things like lock, critical section, monitor.enter/exit, etc. All of the memory barriers block the other threads, which is what you initially wanted to avoid. So I'm not sure what has changed since your original post. – Kiril Oct 14 '11 at 15:53
0

Use the `Interlocked` class to CAS (`Interlocked.CompareExchange`) the objects/values in your array. That makes each operation atomic, which ensures the data is not corrupted. It's about the fastest thing you can do (aside from accessing/modifying the data directly without interlocking). However, if you're modifying the size of the 2D array (growing/shrinking), then you will have some serious problems unless you use some kind of locking mechanism on the array.
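For reference, the CAS retry pattern looks like this. The sketch below is Java rather than C# (so it can stand alone), with `AtomicIntegerArray.compareAndSet` playing the role of `Interlocked.CompareExchange`; flattening the 2D grid into a 1D array is an assumption of this example:

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

public class CasGrid {
    private final AtomicIntegerArray cells; // rows x cols grid, flattened row-major
    private final int cols;

    public CasGrid(int rows, int cols) {
        this.cells = new AtomicIntegerArray(rows * cols);
        this.cols = cols;
    }

    // Lock-free read-modify-write: retry until no other thread
    // changed the cell between our read and our write.
    public int addToCell(int r, int c, int delta) {
        int idx = r * cols + c;
        while (true) {
            int current = cells.get(idx);
            int updated = current + delta;
            if (cells.compareAndSet(idx, current, updated)) {
                return updated;
            }
            // CAS failed: another thread won the race; loop and retry
        }
    }

    public int get(int r, int c) {
        return cells.get(r * cols + c);
    }
}
```

No thread ever blocks here; a loser of the race simply retries, which is why CAS loops tend to beat locks under low-to-moderate contention.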

Kiril
  • 39,672
  • 31
  • 167
  • 226
  • `Interlocked` uses locking mechanisms internally to prevent more than one thread from getting into the code that will read/write the value, which the OP said he doesn't want. However, I'm sure what the OP thinks he wants will come back to bite him so your answer is correct. – KeithS Oct 14 '11 at 15:34
  • @KeithS, it does use a lock, but the CAS is essentially a single CPU instruction, so it's much faster than locking in general. – Kiril Oct 14 '11 at 15:49
0

Declare the array as volatile and ensure it's scoped such that it's visible to all your threads. I generally like to avoid statics, so either pass the array by reference, or set up all your threads to run methods of an instance class that has the array defined as an instance field.

However, I strongly urge you to rethink what "volatile access" means in terms of data integrity. Best practice is NOT to do what you are attempting without good locking mechanics. You may think it's a small problem, but you can find yourself with a very non-deterministic system, so much so that its data isn't reliable in the slightest.

Let's say you have 8 threads running, and each of them will read a value from an index of the array, do some calculation, then write the result back to that index. Thread 1 starts first and reads the value at the index, 0. Then threads 2-8 all start and read the same value. Thread 1 performs its calculation, reads the index again to ensure it has the "latest" value, then tries to write its result.

However, the other threads are also waiting on that memory, and due to scheduling you know nothing about, in between Thread 1 reading the index (still zero) and writing its result, threads 2-8 have ALL written their values. Then Thread 1 writes its value, overwriting everything the other 7 threads have done. Those 7 threads, in turn, probably had similar races with each other, so the value Thread 1 overwrote had likely already clobbered the results of half of them anyway.

I guarantee you that this behavior is NOT what you want, no matter how much you think you can get away with it; it WILL cause data corruption, which WILL affect other areas of the system, and you WILL be forced to implement proper locking.
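The lost-update race described above is easy to reproduce. Here is a small Java sketch (the thread and iteration counts are arbitrary choices for the demo) contrasting an unsynchronized read-modify-write with an atomic one:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LostUpdateDemo {
    static int plain = 0;                               // unsynchronized counter
    static final AtomicInteger atomic = new AtomicInteger();

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[8];
        for (int t = 0; t < 8; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++) {
                    plain++;                            // races: read-modify-writes can overwrite each other
                    atomic.incrementAndGet();           // atomic: never loses an update
                }
            });
            threads[t].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("plain:  " + plain);         // typically LESS than 800000
        System.out.println("atomic: " + atomic.get());  // always 800000
    }
}
```

On most runs `plain` comes up short of 800000 because two threads read the same value, both add one, and both write back the same result; that is exactly the 8-thread scenario above in miniature.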

KeithS
  • 70,210
  • 21
  • 112
  • 164
  • Just because the variable that holds a reference to an array is volatile does not mean that every variable in the array is also volatile. – Eric Lippert Oct 14 '11 at 15:44
  • Just a slight note, if you pass a `volatile` field as `ref`, the field will lose its volatility. – aevitas Dec 17 '14 at 11:16
0

If you are interested solely in performance, then the way in which you order your memory accesses can play a big role. Spend an hour or so reading through the slides from Lecture 1 of MIT's Performance Engineering class. The other lectures may also be interesting to you (such as Lecture 6).

Basically, you can optimize your use of the cache to greatly improve performance, depending on your read/write patterns, given the workload you are using.

This should not stop you from doing something that is correct, however.
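To make the access-ordering point concrete: scanning a row-major 2D array row by row touches memory sequentially and is cache-friendly, while scanning column by column jumps a full row stride on every access. A hedged sketch in Java (the array size is an arbitrary choice; exact timings depend on your hardware):

```java
public class TraversalOrder {
    static final int N = 2048;

    // Sequential access: consecutive elements share cache lines
    static long rowMajor(int[][] a) {
        long sum = 0;
        for (int r = 0; r < N; r++)
            for (int c = 0; c < N; c++)
                sum += a[r][c];
        return sum;
    }

    // Strided access: each read lands in a different row's memory
    static long colMajor(int[][] a) {
        long sum = 0;
        for (int c = 0; c < N; c++)
            for (int r = 0; r < N; r++)
                sum += a[r][c];
        return sum;
    }

    public static void main(String[] args) {
        int[][] a = new int[N][N];
        for (int r = 0; r < N; r++)
            for (int c = 0; c < N; c++)
                a[r][c] = 1;

        long t0 = System.nanoTime();
        long s1 = rowMajor(a);
        long t1 = System.nanoTime();
        long s2 = colMajor(a);
        long t2 = System.nanoTime();

        System.out.println("row-major ms: " + (t1 - t0) / 1_000_000);
        System.out.println("col-major ms: " + (t2 - t1) / 1_000_000);
        System.out.println(s1 == s2); // identical result, typically very different speed
    }
}
```

Both loops compute the same sum; only the traversal order differs, which is the kind of cache effect the lectures quantify.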

Emil Sit
  • 22,894
  • 7
  • 53
  • 75