Implementation of a semaphore in C

Question

I am working on a simple implementation of a semaphore in C, and while my implementation is working (as a binary semaphore), I have a question regarding its validity.

My concern stems from my definition of my wait function:

void Wait(int semid) {
    char *shmPtr;

    shmPtr = shmat(semid, NULL, 0);

    if(shmPtr == (void *) -1){
        printf("Could not attach to semaphore...\n");
        exit(1);
    } 

    //Wait for the value in shared memory to 
    //equal 0, then set it equal to 1,
    //detach and return
    while( (*shmPtr) != 0);

    (*shmPtr) = 1;

    if(shmdt(shmPtr) < 0) {
        printf("Cannot detach from semaphore...");
    }

    return;

}

My question lies with the while( (*shmPtr) != 0) loop. Lets say we have 2 processes that are waiting. A third process changes the value of a semaphore to equal 0 (ignore the fact that this is a binary implementation of a semaphore).

My concern is that if process 1 evaluates the condition of the while loop to be false, and then the CPU context switches to process 2 before setting the semaphore value equal to 1, both processes will enter the critical section.

Is there a better way to implement the waiting functionality? I've seen a lot of people using pthread_cond_wait, but that uses a mutex which essentially defeats the purpose of the semaphore implementation.

Thank you

EDIT: Adding Wikipedia's implementation of TestAndSet in C to reference in the comments

#define LOCKED 1

int TestAndSet(int* lockPtr) {
    int oldValue;

    oldValue = *lockPtr;
    *lockPtr = LOCKED;
    // -- End of atomic segment --

    return oldValue;
}

It's called a race condition. You need some king of atomic test-and-set operation, which you don't have. Otherwise both processes can see it as zero, then they will both try to change it to 1, each thinking they were the first one to do so. Basically you have no way to know that it hasn't changed between seeing it as zero and setting it to one. — Tom Karzes, Feb 27 '19 at 05:29
Exactly what my worry is. This implementation seems to work fine as a binary semaphore, because two processes won't be waiting at the same time. I'm unaware of a way to check a value and increment a value in the same statement, do you have any ideas? — Luke4211, Feb 27 '19 at 05:36
You don't *implement* a semaphore. You use one (with the help of some external library) — Basile Starynkevitch, Feb 27 '19 at 05:53
At some point someone implemented it. Understanding what your code does to the lowest level makes you a better programmer than just accepting that something works and moving on. — Luke4211, Feb 27 '19 at 05:58
`while( (*shmPtr) != 0);` Your compiler would be within its rights to optimise this loop away entirely. — n. m. could be an AI, Feb 27 '19 at 07:01
[To put my money where my mouth is](https://godbolt.org/z/c2tQq0). Here at optimisation levels 2 and 3, the loop is infinite. It won't check any memory location repeatedly. At -O2, it will check a *register*, which is quite useless. At -O3, it's just an infinite loop without any checks. So "working" it isn't. — n. m. could be an AI, Feb 27 '19 at 07:26
"At some point someone implemented it". is not possible to implement it in standard C. Somebody has implemented it for your platform, using platform-specific code. — n. m. could be an AI, Feb 27 '19 at 07:34
the semaphores must be implemented in kernel mode, not in user mode — alinsoar, Feb 27 '19 at 08:21
@alinsoar "must" is a bit harsh word, and "kernel mode" is not a fundamental constant (what's a kernel on a chip with no MMU and no privileged instructions?), but yeah, *usually* the semaphores are *best* implemented in kernel mode. — n. m. could be an AI, Feb 27 '19 at 09:56

score 1 · Answer 1 · answered Feb 27 '19 at 05:40

1

As Tom commented, for semaphores to be correct you need an atomic test-and-set or a compare-and-swap (compare-exchange).

But that is not all. Since you are using shared memory (ie. by multiple processes), the atomic operations provided by C11 (Link) are not sufficient.

Since you are calling Posix functions anyway, I asume that you have access to Posix semaphores.

"POSIX semaphores allow processes and threads to synchronize their actions." (Link)

answered Feb 27 '19 at 05:40

Bernd Elkemann

23,242
4
37
66

1

This is merely an exercise, I normally use sys/sem.h when I need to synchronize. However, I added wikipedia's implementation of testAndSet in my post, and I have a question about that. If I replaced the conditional in the while loop with a call to testAndSet, couldn't the same issue occur? If two threads are waiting and the semaphore changes to 0, suppose the first process to call testAndSet switches context after the the oldValue is read. Wouldn't this cause the exact same issue that I was having to begin with? – Luke4211 Feb 27 '19 at 05:55
Yes the same issue would occur. The "implementation" found on Wikipedia is only for illustration and has the same race-condition. – Bernd Elkemann Feb 27 '19 at 06:04

Gabriel Staples · Answer 2 · 2019-02-28T06:20:12.013

I don't know how to do it on a PC (if you find out, please do come back and post your own answer), but what you need is what I call "atomic access guards." In other words, you need a mechanism to force atomic access to a given variable for a set amount of time. What that means is that you essentially force all threads/processes to pause for a moment, while 1 and only 1 thread gets access to a variable. It then does its thing with the variable (ex: reads, modifies, writes to it), then re-enables the other threads when done. In this way, you guarantee atomic access to that variable by that thread during those operations. Now, all race conditions are solved.

In C, this is highly architecture-dependent I believe, and relies on C functions written in inline assembly code via something like the __asm keyword, and/or it relies on setting bits in specific hardware registers to certain values in order to enforce certain behavior. Example of using the __asm keyword: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.100748_0606_00_en/ddx1471430827125.html.

Sample of inline assembly code wrapped up in a C function:

int add(int i, int j)
{
  int res = 0;
  __asm ("ADD %[result], %[input_i], %[input_j]"
    : [result] "=r" (res)
    : [input_i] "r" (i), [input_j] "r" (j)
  );
  return res;
}

Once you have your "atomic access guard" functions to give you atomic access, you can do something like the following:

// atomic access guard ON

// Do whatever you want here: it's all atomic now!
// Read, modify, write, etc.
// - CAUTION: NO OTHER THREADS CAN RUN DURING THIS TIME, SO GET OUT OF THIS QUICKLY

// atomic access guard OFF

On single-core systems, such as the microcontrollers I'm familiar with (STM32 and AVR/Arduino), atomic access is ensured simply by turning off all interrupts. Ex: on ARM-core STM32 microcontrollers, do it as follows by using the necessary CMSIS (ARM-provided) functions:

// Read PRIMASK register, check interrupt status before you disable them 
// Returns 0 if they are enabled, or non-zero if disabled 
uint32_t prim = __get_PRIMASK();

// Disable interrupts 
__disable_irq();

// Do some stuff here which can not be interrupted 

// Enable interrupts back, but only if they were previously enabled (prevents nesting problems)
if (!prim) 
{
    __enable_irq();
}

Source: https://stm32f4-discovery.net/2015/06/how-to-properly-enabledisable-interrupts-in-arm-cortex-m/

If using FreeRTOS (Free Real-Time Operating System), do the following:

taskENTER_CRITICAL() // This supports nested calls, and ends up calling `portDISABLE_INTERRUPTS()` anyway.

// do your atomic access here

taskEXIT_CRITICAL()

See: https://www.freertos.org/taskENTER_CRITICAL_taskEXIT_CRITICAL.html

If using AVR-core microcontrollers, such as the ATmega328 (basic Arduino Uno processor), do the following:

uint8_t SREG_bak = SREG; //save global interrupt state
cli(); //clear (disable) interrupts

//atomic variable access guaranteed here

SREG = SREG_bak; //restore interrupt state

See my answer here: https://stackoverflow.com/a/39693278/4561887

So now you need to go do some research (and please do post back), on how to enforce such a principle in C on your given operating system and/or architecture, and/or via some special calls to your kernel or something. This may even require that you write your own inline assembly to do this stuff, then wrap it up in a C function to call.

I look forward to seeing how you accomplish it.

Update: I just did some digging in the FreeRTOS source code to see how they disable interrupts, and here's what I found for the ARM Cortex M3 processor, such as STM32 microcontrollers, when using the GCC compiler:

From "FreeRTOSv9.0.0/FreeRTOS/Source/portable/GCC/ARM_CM3/portmacro.h":

#define portDISABLE_INTERRUPTS() vPortRaiseBASEPRI()

portFORCE_INLINE static void vPortRaiseBASEPRI( void )
{
uint32_t ulNewBASEPRI;

    __asm volatile
    (
        "   mov %0, %1                                              \n" \
        "   msr basepri, %0                                         \n" \
        "   isb                                                     \n" \
        "   dsb                                                     \n" \
        :"=r" (ulNewBASEPRI) : "i" ( configMAX_SYSCALL_INTERRUPT_PRIORITY )
    );
}

Implementation of a semaphore in C

2 Answers2