Parallelism and Synchronization

Question

Imagine you are utilizing Parallelism in a multi-core system.

Is it not completely possible that the same instructions may be executed simultaneously?

Take the following code:

int i = 0;

if( blockingCondition )
{
  lock( objLock )
  {
    i++;
  }
}

In my head, it seems that it is very possible on a system with multiple cores and parallelism that the blockingCondition could be checked at precisely the same moment, leading to the lock being attempted at the same moment, and so on...Is this true?

If so, how can you ensure synchronization across processors?

Also, does the .net TPL handle this type of synchronization? What about other languages?

EDIT Please note that this is not about threads, but Tasks and Parallel-Processisng.

EDIT 2 OK, thanks for the information everyone. So is it true that the OS will ensure that writing to memory is serialized, ensuring multi-core synchronization via volatile reads?

Sanjay Manohar · Answer 1 · 2010-10-10T11:26:13.107

2

To understand why this works, bear in mind:

Locking a lock (i.e. incrementing a lock semaphore on the object) is an operation that blocks if the object is already locked.
The two steps of lock, a) checking the lock semaphore is free, b) and actually locking the object, are performed 'simultaneously' - i.e. they are a monolithic or atomic operation as far as relationship between CPU and memory is concerned.

Therefore, you can see that, if 2 threads enter your if-block, one of the two threads will acquire the lock, and the other will block until the first one has finished the if.

edited Oct 10 '10 at 11:26

answered Oct 10 '10 at 11:02

Sanjay Manohar

6,920
3
35
58

@miguel, can you describe the difference between threading and parallelism? – H H Oct 10 '10 at 11:24
http://stackoverflow.com/questions/806499/threading-vs-parallelism-how-do-they-differ – miguel Oct 10 '10 at 11:25
i guess my point was that my question is regarding Tasks on parallel cores, not threads on a single core – miguel Oct 10 '10 at 11:26
1

@miguel: Your OS+Framework+Library aim to make the single/multi core distinction as invisible as possible. – H H Oct 10 '10 at 11:42
This answer is correct with a single core, multiple core, or multiple CPU on most architectures (ccNUMA). The "atomic" natures of the locking is actually done across all the cores. – edA-qa mort-ora-y Oct 10 '10 at 11:47

H H · Answer 2 · 2010-10-10T11:40:50.813

1

Your concern is precisely why we need a special mechanism like lock and cannot simply use a boolean flag.

The solution to your 'simultaneous' problem is in the algorithm that lock (which calls Monitor.Enter()) uses. It involves memory barriers and knowledge of the very lowlevel memory mechanics to ensure that no 2 threads can acquire the lock at the same time.

Note: I'm talking about .NET only, not Java.

edited Oct 10 '10 at 11:40

answered Oct 10 '10 at 11:10

H H

263,252
30
330
514

Henk: Is this the case that Erica is speaking of? i.e. the OS will ensure synchronization of memory/tasks across cores? – miguel Oct 10 '10 at 11:23
@miguel: Funtional, Yes. `Monitor` uses the lighter weight `Interlocked` class, and does not need help form the OS (= expensive) but can guarantee correctness on multiple cores. – H H Oct 10 '10 at 11:34
@Henk: but in a true-parallel environment, surely Interlocked only assures atomicity on the CPU upon which the instruction is executing? – miguel Oct 10 '10 at 11:43
I guess the read/write to the memory location itself might be serialized by the OS? – miguel Oct 10 '10 at 11:46
2

Interlocked.Exchange makes it possible to see if you succeeded. On that you can build a locking mechanism. – H H Oct 10 '10 at 11:53
I'm a Java person, but looking at the .NET documentation, it looks like "Interlocked.Exchange" is part of the OS kernel (Kernel32.dll), not the .NET runtime. I'd guess that the .NET commands are being compiled down to calls to that kernel instruction. – Erica Oct 10 '10 at 23:50

score 1 · Accepted Answer · answered Oct 10 '10 at 11:16

A lock like you have described here is a "Monitor" style lock on the objLock. As you've noted, it is entirely possible, under a multi-core system, for the two "lock" calls to begin simultaneously. However, any high level application environment which uses monitors will have translated the monitor into semaphore requests (or, depending on your OS and language particulars, mutex requests) in the compiled byte code.

Semaphores are implemented at the operating system and/or hardware level, and higher level languages bind to them. At the OS level, they are "guaranteed" to be atomic. That is, any program acquiring a semaphore is guaranteed to be the only one doing so at that point in time. If two programs, or two threads within a program attempt to acquire the lock at the same time, one will go first (and succeed), and the other will go second (and fail).

At this point, the "how do you ensure synchronisation" stops being a problem for the application programmer to worry about, and starts being a problem for the operating system designer and the hardware designer.

The upshot of it is, as an application coder, you can safely assume that "lock(objLock)" will be an atomic call no matter how many CPUs you plug into your system.

So, are you saying that the OS will ensure synchronization of tasks among multiple cores, in a true-parallel environment? — miguel, Oct 10 '10 at 11:21
@ Miguel - Yes. The OS is responsible for making sure that attempts to acquire locks will never happen simultaneously, even in a true-parallel environment. — Erica, Oct 10 '10 at 11:27

Parallelism and Synchronization

3 Answers3