I have found this code for compareAndSwap in a StackOverflow answer:

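#include <stdbool.h>
#if defined(_MSC_VER)
#  include <windows.h>
#endif

/* Atomically store new_value in *ptr if *ptr equals old_value.
   Returns true if the swap took place. */
bool CompareAndSwapPointer(void * volatile * ptr,
                           void * new_value,
                           void * old_value) {
#if defined(_MSC_VER)
   /* Returns the previous content of *ptr; the swap happened if that was old_value. */
   return InterlockedCompareExchangePointer(ptr, new_value, old_value) == old_value;
#elif (__GNUC__ * 10000 + __GNUC_MINOR__ * 100 + __GNUC_PATCHLEVEL__) > 40100
   /* Note the argument order: expected value first, then the new value. */
   return __sync_bool_compare_and_swap(ptr, old_value, new_value);
#else
#  error No implementation
#endif
}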

Is this the proper way to write portable, fast code (apart from inline assembly)?

One problem is that these builtins take different parameters and return different values from one compiler to another, which may require extra glue such as the if/else in this example.

Another problem is the behavior of these builtins at the machine-code level: do they behave exactly the same (e.g. use the same assembly instructions)?

Note: a further problem arises when many platforms must be supported, not just Windows and Linux as in this example. The code might get very big.

Bionix1441

3 Answers

I would use a Hardware Abstraction Layer (HAL): the generic code stays common, and the platform-specific source is swapped in and built for each platform.

In my opinion, this allows for better structured and more readable source.

To better understand the approach, I suggest searching Google for examples and explanations.

Hopefully this brief answer helps.

[EDIT] I will attempt a simple example for Bionix, to show how to implement a HAL system...

  • Mr A wants his application to run on his 'Tianhe-2' and also his 'Amiga 500'. He has the cross compilers and will build both binaries on his PC. He wants to read keys and print them to the screen.

mrAMainApplication.c contains the following...

#include "hal.h"

// This gets called every time around the main loop ...
void mainProcessLoop( void )
{
   unsigned char key = 0;

   // scan key ...
   key = hal_ReadKey();

   if ( key != 0 )
   {
       hal_PrintChar( key );
   }
}

He then creates a header file, hal.h (remember, this is an example, not working code!) ...

#ifndef HAL_H
#define HAL_H

// Guard name avoids a leading underscore plus capital, which is reserved in C.
unsigned char hal_ReadKey( void );
unsigned char hal_PrintChar( unsigned char pKey );

#endif // HAL_H

Now Mr A needs two separate source files, one for his 'Tianhe-2' system and another for his Amiga 500...

hal_A500.c

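#include "hal.h"

// Return types match the declarations in hal.h
unsigned char hal_ReadKey( void )
{
    // Amiga related code for reading KEYBOARD
    return 0; // placeholder: 0 means no key was pressed
}

unsigned char hal_PrintChar( unsigned char pKey )
{
    // Amiga related code for printing to a shell...
    return pKey; // placeholder: hand back the character that was printed
}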

hal_Tianhe2_VERYFAST.c

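#include "hal.h"

// Return types match the declarations in hal.h
unsigned char hal_ReadKey( void )
{
    // Tianhe-2 related code for reading KEYBOARD
    return 0; // placeholder: 0 means no key was pressed
}

unsigned char hal_PrintChar( unsigned char pKey )
{
    // Tianhe-2 related code for printing to a shell...
    return pKey; // placeholder: hand back the character that was printed
}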

When building for the Amiga, Mr A builds mrAMainApplication.c and hal_A500.c. When building for the Tianhe-2, he uses hal_Tianhe2_VERYFAST.c instead of hal_A500.c.

Right - I've written this example with some humour. It is not aimed at anyone; I just feel it makes the example more interesting and hopefully aids understanding.
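
Applied to the compare-and-swap from the question, the same layout might look like the sketch below. It is collapsed into one block so it compiles on its own; in a real project the declaration would live in a header and each implementation in its own source file picked by the build. The names `hal_CompareAndSwapPointer`, `claim_slot`, `hal_atomic.h` and `hal_atomic_gcc.c` are hypothetical, and only the GCC-style implementation is shown (an MSVC file would wrap its own intrinsic behind the same declaration).

```c
#include <stdbool.h>
#include <stddef.h>

/* ---- hal_atomic.h (hypothetical): the only interface common code sees ---- */
bool hal_CompareAndSwapPointer(void *volatile *ptr, void *old_value, void *new_value);

/* ---- hal_atomic_gcc.c (hypothetical): built only with GCC-compatible compilers ---- */
bool hal_CompareAndSwapPointer(void *volatile *ptr, void *old_value, void *new_value)
{
    /* GCC builtin: stores new_value into *ptr if *ptr == old_value,
       and returns true when the store happened. */
    return __sync_bool_compare_and_swap(ptr, old_value, new_value);
}

/* ---- common code: no #if and no compiler-specific names ---- */
/* Claim an empty slot for `item`; returns false if the slot was already taken. */
bool claim_slot(void *volatile *slot, void *item)
{
    return hal_CompareAndSwapPointer(slot, NULL, item);
}
```

The common code only ever calls the `hal_` function, so supporting another compiler means adding one more small source file rather than another preprocessor branch.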

Neil

  • I have searched on Google, but all I find is general information about how it works. Could you please give me a link to a small example using the `HAL` concept? Is it implemented using `C` or `C++`? – Bionix1441 Jul 02 '15 at 12:33
  • I'm sorry Bionix1441, but I'm at work - and we have restricted access to the internet. I've noticed this (http://stackoverflow.com/questions/12700909/simple-example-to-illustrate-writing-an-abstraction-layer-in-c), which might at least give you a better understanding of the principle. I will look tonight for better examples if this doesn't help. – Neil Jul 02 '15 at 13:11
  • If you could please give me an example, whenever you have time – Bionix1441 Jul 06 '15 at 15:02
  • Here's one example - (https://community.particle.io/t/what-is-hardware-abstraction-layer-and-and-how-to-implement-it/10590). It's not great - but it saves me attempting to explain it. If this is still unclear - then I will edit my answer for you Bionix, thanks - Neil – Neil Jul 07 '15 at 14:41
  • I've edited my answer for you Bionix - it's very simple and light-hearted - but should explain how the platform dependent sources are used with common code. Thanks... – Neil Jul 07 '15 at 15:09

Take a look at ConcurrencyKit; you may be able to use its higher-level primitives, which is probably what people really want most of the time. In contrast to a HAL, which is somewhat OS specific, I believe CK works on Windows and with a number of non-gcc compilers.

But if you are just interested in how to implement "compare-and-swap" or atomic actions portably on a wide variety of C compilers, look and see how that code works. It is all open-source.

I suspect that the details can get messy, and that in general they would not make for easy or interesting exposition here.

rocky
  • ConcurrencyKit is very specific to concurrency, but my question, before it was edited by another user, was general and not only about compare-and-swap. I used it as a clarifying example for the problem I currently have. – Bionix1441 Jul 02 '15 at 11:22
  • See the revised answer. – rocky Jul 02 '15 at 12:43

In modern C, starting with C11, use _Atomic for the type qualification and atomic_compare_exchange_weak for the function.

Newer versions of gcc and clang conform to C11 and implement these operations in a portable way.
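
As a minimal sketch of this suggestion, assuming a compiler that ships `<stdatomic.h>`, the pointer compare-and-swap from the question might be written without any compiler-specific branches. The helper names `cas_pointer` and `fetch_add_with_cas` are hypothetical, not part of the standard. The weak variant may fail spuriously, so it is shown inside a retry loop, while the one-shot wrapper uses the strong variant.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical wrapper: atomically replace *target with new_value if it
 * currently holds old_value. Returns true if the swap happened. */
static bool cas_pointer(void *_Atomic *target, void *old_value, void *new_value)
{
    /* The strong variant never fails spuriously, which suits a single attempt.
     * On failure the observed value is written back into old_value, which is
     * only a local copy here and is discarded. */
    return atomic_compare_exchange_strong(target, &old_value, new_value);
}

/* Typical retry loop with the weak variant: add delta to *counter and
 * return the value the counter held before the addition. */
static int fetch_add_with_cas(_Atomic int *counter, int delta)
{
    int expected = atomic_load(counter);
    /* On failure, expected is refreshed with the current value, so the
     * next attempt recomputes expected + delta from fresh data. */
    while (!atomic_compare_exchange_weak(counter, &expected, expected + delta)) {
        /* retry */
    }
    return expected;
}
```

Unlike the two builtins in the question, both standard functions take the expected value by pointer and update it on failure, which is what makes the retry loop so short.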

Jens Gustedt
  • Is the `atomic_compare_exchange_weak` method as fast as `__sync_bool_compare_and_swap`? The former is a normal function, while the latter is a `builtin` method in gcc, so I assume the second one is faster than the first. – Bionix1441 Jul 02 '15 at 12:45
  • Microsoft doesn't use modern C - this wouldn't help. – Neil Jul 02 '15 at 19:18
  • @Neil, I think it does. It reduces the number of exotic platforms for which you have to code an exception to 1; that's the idea of standards. Also, there are implementations of `` for MS; the whole interface was originally invented for a library on Windows. – Jens Gustedt Jul 03 '15 at 13:28
  • @Bionix1441, yes it will be as fast; remember that these are the same people implementing this, your compiler implementors. And you are mistaken: this is usually not a function but a macro that resolves to some magic. (All C library functions may in fact be implemented as macros.) – Jens Gustedt Jul 03 '15 at 13:31
  • @Jens aren't these https://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html faster than normal ones? I thought these builtin methods were similar to macro expansion in terms of performance – Bionix1441 Jul 03 '15 at 13:36
  • @Bionix1441, what do you mean by "normal ones"? As I said, the `atomic_compare_exchange*` interfaces are macros that resolve to similar things as these builtin functions. As a user of these things, this should not be your concern. Just think of it: a compiler implementor will do all that he can to make those operations efficient. Since the gcc people know how to do the `__builtin` stuff, you can be sure that they also know how to implement the standard interface. – Jens Gustedt Jul 03 '15 at 14:20
  • @Bionix1441, also you seem to have a false idea of efficiency, here. The real constraint here in terms of time is the bus transfer which may be up to some hundred clock cycles. So an atomic operation is never "efficient" in an absolute sense, but only relatively more efficient than locking a critical section with a mutex, for example. – Jens Gustedt Jul 03 '15 at 14:22