Previously, I had code like this:

  EnterCriticalSection(q^);
  Inc(global_stats.currentid);
  LeaveCriticalSection(q^);

and I changed it to:

  InterlockedIncrement(global_stats.currentid);

and then I found there is some code like this:

  EnterCriticalSection(q^);
  if (global_stats.currentid >= n) then
  begin
    LeaveCriticalSection(q^);
    Exit;
  end;
  LeaveCriticalSection(q^);

So, the question is: can I mix and match InterlockedIncrement and Enter/LeaveCriticalSection?

Which performs faster: critical sections or atomic operations?

Kromster
justyy

2 Answers


Can I mix and match InterlockedIncrement and Enter/LeaveCriticalSection?

In general, no, you cannot. Critical sections and atomic operations do not interact.

Atomic functions, like your call to InterlockedIncrement, operate completely independently of critical sections and other locks. That is, one thread can hold the lock while another thread modifies the protected variable at the same time. Critical sections, like any other form of mutual exclusion, only work if all parties that operate on the shared data do so while holding the lock.

However, from what we can see of your code, the critical section is needless in this case. You can write the code like this:

  // thread A
  InterlockedIncrement(global_stats.currentid);

  ....

  // thread B
  InterlockedIncrement(global_stats.currentid);

  ....

  // thread C
  if global_stats.currentid >= n then
    Exit;

That code is semantically equivalent to your previous code with a critical section.

As for which performs better, the original code with the lock or the code above without it, the latter would be expected to perform better. Broadly speaking, lock-free code can be expected to outperform code that uses locks, but that is not a rule that can be relied upon. Some algorithms can be faster when implemented with locks than equivalent lock-free implementations.

David Heffernan
  • In general, it depends on the usage. In this case, with the code that we can see, you are better to remove the lock and use atomic increments. – David Heffernan Oct 02 '15 at 10:46
  • Disagree completely. Second thread might never see the updates to `currentid`! It saddens me to see the incorrect answer accepted. – SergeyA Oct 02 '15 at 13:14
  • @SergeyA Second thread will see the updates with the Delphi compiler. Global writes in one thread are visible in another. – David Heffernan Oct 02 '15 at 13:22
  • @DavidHeffernan, how is it possible? If you are saying that global variables in Delphi are atomic by definition, then no InterlockedIncrement is needed either. If it is needed, I can't see how the visibility can be ensured. – SergeyA Oct 02 '15 at 13:24
  • @SergeyA InterlockedIncrement is to avoid the classic race on the increment. If there are only two threads then yes, you don't need it because there's only one writer. But if there are two writing threads and one reading thread then you need to avoid the race between the two writing threads. As for the reading thread, it doesn't matter whether it reads the value before a write, or after a write, so long as it can see the modifications. The semantics are unchanged from the version with the lock. – David Heffernan Oct 02 '15 at 13:27
  • @DavidHeffernan, I know nothing of Delphi (I should have stayed away from this post, but didn't, on account of it being Friday), but what you are saying sounds really strange to me. The ONLY way for the Delphi compiler to guarantee visibility from one thread to another is to produce atomic chip-specific instructions upon accessing the global variables from Delphi code. But if Delphi does this (it can, why not?), then there can be no data race and all increments to global variables would be atomic as well... Confused. – SergeyA Oct 02 '15 at 13:34
  • @SergeyA The x86 strong memory model guarantees that writes are visible. The only scenario for writes not to be visible, on x86, is for the compiler to enregister the value. But that doesn't happen for globals. – David Heffernan Oct 02 '15 at 13:37
  • @DavidHeffernan, but what if it disappears with the next chip release (there is a certain notion it should go away, since it slows things down unnecessarily)? What if Delphi is ported to ARM8 or Power? Looks like you are giving bad advice after all. – SergeyA Oct 02 '15 at 13:59
  • @SergeyA Delphi is ported to ARM8, but `EnterCriticalSection` doesn't run there, so it's fair to assume that this asker is considering Windows on x86. If the x86 memory model changes, then all programs everywhere break. – David Heffernan Oct 02 '15 at 14:03
  • @DavidHeffernan, not at all. Only the bad programs written under the influence of bad advisers would break. Good programs which follow the proper way of writing multithreaded code will blossom. That would actually be a great event, as I think about it. – SergeyA Oct 02 '15 at 14:06
  • @SergeyA OK, you are entitled to your opinion, and I respect it. However, I think that it is reasonable to rely on the x86 memory model, if targeting x86. – David Heffernan Oct 02 '15 at 14:07
  • Does it also work if I compile the code to x64 in Delphi? – justyy Oct 02 '15 at 14:56
  • Yes, x64 has the same memory model as x86 – David Heffernan Oct 02 '15 at 14:58
  • Instead of using `if global_stats.currentid >= n then`, I prefer to use `if InterlockedExchangeAdd(global_stats.currentid, 0) >= n then`. – Remy Lebeau Oct 02 '15 at 20:34
  • @DavidHeffernan: I don't trust that. It is not *guaranteed* to be an atomic read, if the variable is not properly aligned, or multiple CPUs access it at the same time. I prefer to use the interlocked API for **both** reads and writes. – Remy Lebeau Oct 02 '15 at 21:03
  • @Remy Reading is guaranteed to be atomic, that is, without tearing, when the variable is aligned. And the interlocked functions require the variable to be aligned. So we have to rely on the variable being aligned anyway. Using an interlocked API for a read is expensive and needless. – David Heffernan Oct 02 '15 at 21:11
  • @Remy Read the Intel doc quotes in this answer: http://stackoverflow.com/a/5178914/505088 Specifically, "The Intel486 processor (and newer processors since) guarantees that the following basic memory operations will always be carried out atomically: ... Reading or writing a doubleword aligned on a 32-bit boundary" – David Heffernan Oct 02 '15 at 21:25
  • @DavidHeffernan: On the other hand, that link also says: "Accesses to cacheable memory that are split across cache lines and page boundaries are **not guaranteed to be atomic**." Which goes back to the "multiple CPUs access it at the same time" issue. – Remy Lebeau Oct 02 '15 at 22:48
  • @Remy No. An aligned doubleword is not split across cache lines nor page boundaries. The whole point of an atomic operation is that it refers to behaviour when multiple CPUs access. – David Heffernan Oct 03 '15 at 06:13
  • Very difficult to imagine what you think atomic means. Hard to imagine how else you could be interpreting "The Intel486 processor (and newer processors since) guarantees that the following basic memory operations will always be carried out atomically: ... Reading or writing a doubleword aligned on a 32-bit boundary" – David Heffernan Oct 04 '15 at 09:42

No, in general you cannot.

A critical section is used to ensure that, of all the protected blocks of code, at most one is executing at any given moment. If such a protected block accesses currentid, and that variable is also modified somewhere else, the code may work incorrectly.

In a specific case it may be OK to mix and match, but then you would have to check all affected code and rethink the processing to be sure nothing can go wrong.

gabr