Interoperability between C and C++ atomics

Question

Suppose I have a task that might be cancelled from another thread. The task is performed in a C function, while the other thread runs C++ code. How do I do that?

Rough example.

C:

void do_task(atomic_bool const *cancelled);

C++:

std::atomic_bool cancelled;
…
do_task(&cancelled);

For now, I created a file atomics.h with the following content:

#ifdef __cplusplus
#include <atomic>
using std::atomic_bool;
#else
#include <stdatomic.h>
#endif

It appears to work, but I don't see any guarantees for that. I wonder if there is a better (correct) way.

Why do you need atomic here? Which operation needs to be atomic? None. What you need here, semantically, is *volatile*, not atomic: `void do_task(volatile bool *cancelled);` — RbMm, Dec 22 '18 at 21:47
In fact, in most cases you do not even need *volatile* if, inside the loop `do { * } while (!*cancelled);`, you call some external function for which the compiler cannot know whether it modifies `*cancelled`. — RbMm, Dec 22 '18 at 21:56
@RbMm: Advice to do things that might happen to work on current tooling but that is **explicitly wrong** and has no advantages over doing it right is not helpful. — R.. GitHub STOP HELPING ICE, Dec 23 '18 at 08:14
@R.. - why is this *explicitly wrong*? Of course, without seeing the complete code one cannot say exactly whether atomic is needed here, but the way *cancelled* is usually used does not require any atomic or any memory order other than relaxed. Only *volatile* is needed. — RbMm, Dec 23 '18 at 08:37
@RbMm One thread reads a variable. One thread writes it. All without further synchronization -> you need atomic. End of story. — phön, Dec 23 '18 at 09:32
@phön - which synchronization do you mean? Do you have a concrete example? One thread reading a variable and another writing it does not by itself mean that synchronization is needed. In the typical scenario of `cancelled` usage it is not needed. I describe this in more detail in my own answer. It looks like you use the word *synchronization* without understanding what it is and when it is needed. — RbMm, Dec 23 '18 at 10:29
@RbMm you can synchronize the access to a variable with a mutex to make sure that no thread writes while the other thread reads. "if one threads reads a variable. one thread writes this yet not mean that synchronization here need" indeed it does mean that, unless the variable is atomic. — phön, Dec 23 '18 at 11:39
@phön - in this concrete case, what is a mutex needed for? It is absolutely not needed here. In the typical usage of `cancelled` there are only 2 values: 0 and not 0. The loop breaks when the value is not 0. In this case, even if we assume that the read or write is not atomic, we still read not 0 after another thread writes not 0 to `cancelled` (a hardware read or write of a bool is of course atomic, unlike an RMW operation, but even this is not needed here). — RbMm, Dec 23 '18 at 11:45
@phön if thread #1 writes to `x` and thread #2 reads from `x`, then in the general case, if the read or write cannot be done by a single hardware atomic operation (say `x` is big and complex), thread #2 can read a partial state of the modification made by thread #1. If we use a `bool` only in the sense of 0 or !0, there is no problem with partial state: we read either 0 or not 0 anyway. If the OP needs a more complex usage of `cancelled`, say an RMW operation (set it to true and check whether it was already true in one atomic operation), or needs to synchronize reads/writes of `cancelled` with other loads/stores to memory, then yes, atomic is needed. — RbMm, Dec 23 '18 at 11:54
@RbMm it's not only about the atomicity of the variable itself (which will likely not cause any problems on the hardware level in practice, since the bool will fit into one register); on the language level this is straight undefined behaviour and the compiler may assume it will not happen. So it may, for example, reorder instructions around this bool in ways the programmer did not intend, or even dismiss the store or load entirely, since (by language rules, because no atomic is used) there is nobody who will see the side effects. — phön, Dec 23 '18 at 12:20
@phön - *even dismiss the store or load entirely*: this is what `volatile` is needed for, and it prevents this. And what is the *undefined* behavior here? Can you show a concrete example and explain what exactly can be undefined if we use the loop `do {.. } while (!cancelled);` in one thread and `cancelled = true` in another? I do not see any problem here even at the abstract language level. There is no problem, at the formal language level, with reading and writing the same non-atomic variable. About atomicity: as I said, even this is not needed if we use only 2 states of the value, 0 and not 0. Can you show UB or the like on a concrete example? — RbMm, Dec 23 '18 at 12:27
@phön Cancellation is inherently a polite request; you can't expect it to stop before something is done, you only want it to stop a loop in a reasonably timely way. Reordering some instructions around a read of a cancel flag is *not* a problem. So here `volatile` semantics is sufficient. — curiousguy, Feb 12 '19 at 03:11
@curiousguy and still the point stands that this is undefined behavior as the standard says. why teach the wrong stuff? (nevertheless i cant come up with a similar example which does not work with the volatile). — phön, Feb 12 '19 at 07:34
@phön The behavior of `volatile` is defined by the ABI and the CPU. If the CPU allows such concurrent modifications and provides atomicity (they all do for such natural word operations), it will work. It's guaranteed by the CPU and the C/C++ semantics. So it's fully defined in practice on all known CPUs. — curiousguy, Feb 12 '19 at 12:09
@curiousguy i can just repeat what i already said: it is undefined behavior if you look into the c++ standard. maybe it works NOW on all platforms we know of. maybe not. maybe we invent some crazy cpu infrastructure in the next decades where your "in practice it works" does not longer hold true. (well maybe this wont happen, but you get the point). use what suits you best. i dont care. i just spread the word. you know the risks. use it. or be pragmatic in YOUR code. but dont teach wrong things and say: this is the perfect solution. my 2 cents — phön, Feb 12 '19 at 12:27

score 10 · Accepted Answer · answered Dec 22 '18 at 20:02

The atomic_bool type in C and the std::atomic<bool> type in C++ (typedefed as std::atomic_bool) are two different types that are unrelated. Passing a std::atomic_bool to a C function expecting C's atomic_bool is Undefined Behavior. That it works at all is a combination of luck and the simple definitions of these types being compatible.

If the C++ code needs to call a C function that expects C's atomic_bool, then that is what it must use. However, the <stdatomic.h> header does not exist in C++ before C++23. You'll have to provide a way for the C++ code to call C code to get a pointer to the atomic variable you need in a way that hides the type. (Possibly declare a struct that holds the atomic bool, so that C++ only knows that the type exists and only deals with pointers to it.)
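The opaque-struct idea could look roughly like the sketch below. It is not taken from the answer: the names `cancel_token`, `cancel_token_create`, `cancel_token_destroy`, `cancel_token_cancel` and `cancel_token_is_cancelled` are hypothetical, and relaxed ordering is assumed to be enough for a plain cancel flag. The C++ side would only see the forward declaration and the prototypes, so it never needs to know how C's `atomic_bool` is laid out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* ---- would live in a shared header (hypothetical cancel_token.h) ---- */
#ifdef __cplusplus
extern "C" {
#endif

struct cancel_token;   /* opaque: C++ only ever holds a pointer to it */

struct cancel_token *cancel_token_create(void);
void cancel_token_destroy(struct cancel_token *t);
void cancel_token_cancel(struct cancel_token *t);
int  cancel_token_is_cancelled(struct cancel_token *t);

#ifdef __cplusplus
}
#endif

/* ---- would live in a .c file; only C code sees the definition ---- */
struct cancel_token {
    atomic_bool flag;
};

struct cancel_token *cancel_token_create(void) {
    struct cancel_token *t = malloc(sizeof *t);
    if (t != NULL)
        atomic_init(&t->flag, false);   /* start in the "not cancelled" state */
    return t;
}

void cancel_token_destroy(struct cancel_token *t) {
    free(t);
}

void cancel_token_cancel(struct cancel_token *t) {
    assert(t != NULL);
    atomic_store_explicit(&t->flag, true, memory_order_relaxed);
}

int cancel_token_is_cancelled(struct cancel_token *t) {
    assert(t != NULL);
    return atomic_load_explicit(&t->flag, memory_order_relaxed);
}
```

Because each task gets its own token, this variant also works when `do_task` runs on several threads at once; `do_task` would then take a `struct cancel_token *` instead of an `atomic_bool const *`.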

1201ProgramAlarm


It doesn't matter that they are different types. They basically synchronize the same byte of memory via same memory operations. `struct A{int x; int y; int z};` and `struct B{int a[3];};` are different but there is no problem when converting one from the other via brute-force conversions. There is no undefined behavior. Same with `atomic_bool` and `std::atomic<bool>`. – ALX23z Dec 22 '18 at 20:36
from https://developers.redhat.com/blog/2016/01/14/toward-a-better-use-of-c11-atomics-part-1/ "The atomic types are fully interoperable between the two languages so that programs can be developed that share objects of atomic types across the language boundary." – Manny_Mar Dec 22 '18 at 23:29
@ALX23z _"are different but there is no problem when converting one from the other via brute-force conversions. There is no undefined behavior"_ Really? – Lightness Races in Orbit Dec 23 '18 at 01:25
@LightnessRacesinOrbit "*Really?*" Of course, from POV of CPU it is just 3 consecutive integers - 12 bytes of data. Who cares what you called these variables in your source code? Definitely not the program after you compiled it. Perhaps, 100 years into the future they will decide that compiler can rearrange internal variables in any order but it is not the case now or in any foreseeable future. Try and find system where `A a; B *b = (B*)(void*)&a;` won't work correctly. – ALX23z Dec 23 '18 at 02:19
@ALX23z: "from POV of CPU" doesn't matter. That's not how undefined behavior works. – user2357112 Dec 23 '18 at 02:22
@user2357112 lol, no CPU's POV is the one that matters. There is also the compiler with its abuse of undefined behavior for optimizations but these are about code flow, not data storage. So have you found a system where brute force casting from A to B will not work properly? – ALX23z Dec 23 '18 at 03:33
Relevant: [P0943](http://wg21.link/p0943). The proposal hasn't been adopted yet, but it looks like the standard is moving towards providing a header for source-level compatibility and suggesting that implementations should also offer full compatibility. – bogdan Feb 01 '19 at 02:16
@LightnessRacesinOrbit Really, as soon as they are in separately compiled functions linked with an ABI that says they have the same layout. The functions may be in the same TU as long as they have an ABI interface and not an internal interface. – curiousguy Feb 12 '19 at 03:16
@user2357112 There is no C/C++ standard so any program with C++ and C doesn't have a "defined" behavior WRT to a standard. Compiler documentation mentions an ABI which makes reinterpretation of memory OK as soon as you go through an ABI call (not an internal call). – curiousguy Feb 12 '19 at 03:18
@ALX23z Pointers point to precise objects, not just to an address in memory. (Yes that is unclear in the std and contradicts some other stuff.) A pointer to an object of some type doesn't point to an object of another type at the same address. The C++ was extremely poorly specified in that basic area but the committee worked on basic stuff a lot lately. – curiousguy Feb 12 '19 at 03:20
@curiousguy: If you have ABI spec citations, that'd be useful information that would be worth putting in an answer. (I was really expecting someone to post ABI docs ages ago, but it didn't happen.) – user2357112 Feb 12 '19 at 03:22
@curiousguy But you have to reach the "compiled" and "linked" phase first. Violating strict aliasing can make your compiler do unexpected things, because you're lying to it! – Lightness Races in Orbit Feb 12 '19 at 11:06
@user2357112 1) There is no std ABI, only specific ABIs. 2) There is no system specific tag (x86, x86-64...) on this Q, so no ABI answer. 3) The principle of an ABI is that only deals with layout not types. That makes it possible to link between languages that don't have the same type system. Anything that has a compatible layout can be used **across** an ABI boundary. Calling an internal function isn't such a boundary: the compiler doesn't have to make a true call and can inline a function called without separate codegen. – curiousguy Feb 12 '19 at 11:38
@LightnessRacesinOrbit Most (all?) compilers allow you to link units that have separate codegen if they respect the same ABI. Nothing in the std guarantees that and a compiler could only allow linking code compiled at the same time with the exact same compiler version (such compiler wouldn't be usable in general). **The only contract between codegen units compiled "separately" is the ABI.** (Separate here means not seeing other code.) A codegen unit doesn't have to be a TU; compiling several TU together allows more inlining or propagation of information because the contract isn't the ABI. – curiousguy Feb 12 '19 at 11:44
There is a domain of application of every law and the standards (plural, there is no C/C++ unified std) each have their own jurisdiction. If you mix different languages, or different dialect variations, or the exact same dialect but separately compiled, **you have distinct jurisdictions with a frontier.** I don't know whether that was seriously discussed by either committee. The standard doesn't seem to discuss in any way, shape, or form, its relation to the ABI, the separate linking etc. That doesn't mean it doesn't exist in practice. – curiousguy Feb 12 '19 at 11:57
@curiousguy This has nothing to do with ABIs, codegen or anything like that. It is about the contract you made with the C++ compiler. You wrote a program that has undefined behaviour, and that's that. – Lightness Races in Orbit Feb 12 '19 at 12:22
@LightnessRacesinOrbit What part of the contract allows to call code written in C from C++? What makes any C type compatible with C++? Where in the C++ std is a normative reference to the core C language made? – curiousguy Feb 12 '19 at 14:02
@curiousguy Irrelevant. Your C++ program has UB. Your C program has UB. This is true before you link the two or do anything else, period. – Lightness Races in Orbit Feb 12 '19 at 14:03
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/188284/discussion-between-curiousguy-and-lightness-races-in-orbit). – curiousguy Feb 12 '19 at 14:04
@curiousguy In case of complicated objects a lot may work differently (like virtual inheritance), but we are not discussing such objects. `std::atomic<bool>` is not different from `class boolWrap{bool m_a; /*and a variety of simple functions */};` which is still just a boolean with a different set of functions. You may use `union{std::atomic<bool> m_a; bool m_b;}` and you won't notice difference between `m_a` and `m_b` as long as you work in a single thread. – ALX23z Feb 13 '19 at 17:24

Maxim Egorushkin · Answer 2 · 2018-12-22T23:52:06.440

To side-step all ABI issues you may like to implement a C function that is called from C++ and operates on that atomic_bool. This way your C++ code doesn't need to know anything about that global variable and its type:

In a .h file:

#ifdef __cplusplus
extern "C" {
#endif

void cancel_my_thread(void);
int is_my_thread_cancelled(void);

#ifdef __cplusplus
}
#endif

And then in a .c file:

#include <stdatomic.h>

static atomic_bool cancelled = 0;

void cancel_my_thread(void) {
    atomic_store_explicit(&cancelled, 1, memory_order_relaxed);
}
int is_my_thread_cancelled(void) {
    return atomic_load_explicit(&cancelled, memory_order_relaxed);
}

The C++ code would include that header and call cancel_my_thread.

Thank you, it is a sane solution; however, in my particular case, `do_task` might be called from several threads, so using a global variable wouldn't work. I guess I'll just wrap it in a struct as @1201ProgramAlarm suggests. — grepcake, Dec 23 '18 at 09:13

Manny_Mar · Answer 3 · 2019-02-06T23:38:39.607

I found this on a net search https://developers.redhat.com/blog/2016/01/14/toward-a-better-use-of-c11-atomics-part-1/

Following the lead of C++, along with a memory model describing the requirements and semantics of multithreaded programs, the C11 standard adopted a proposal for a set of atomic types and operations into the language. This change has made it possible to write portable multi-threaded software that efficiently manipulates objects indivisibly and without data races. The atomic types are fully interoperable between the two languages so that programs can be developed that share objects of atomic types across the language boundary. This paper examines some of the trade-offs of the design, points out some of its shortcomings, and outlines solutions that simplify the use of atomic objects in both languages.

I am just learning about atomics now, but it looks like it's compatible between C and C++.

EDIT

Another source: Multi-Threading support in c11

Do you have a more authoritative source than "some guy's blog said it was okay"? — user2357112, Dec 23 '18 at 02:16
@user2357112 I was reading stuff and came across another source on stackoverflow. I will edit my post. — Manny_Mar, Feb 06 '19 at 23:37

RbMm · Answer 4 · 2018-12-23T09:07:02.600

As I understand it, your code in general is (must be) as follows:

// C code
#include <stdbool.h>

void _do_task(void);

void do_task(volatile bool *cancelled)
{
  do {
    _do_task();
  } while (!*cancelled);
}

// C++ code

volatile bool g_cancelled; // can be modified by another thread
do_task(&g_cancelled);     // called from the worker thread

void some_proc()
{
  //...
  g_cancelled = true;
}

The question I would ask is: do we need to declare cancelled as atomic here? Do we need atomic at all?

Atomic is needed in 3 cases:

1. We do a read-modify-write operation. Say we need to set cancelled to true and check whether it was already true. This can be needed, for example, if several threads set cancelled to true and whichever does it first must free some resources.

if (!cancelled.exchange(true)) { free_resources(); }

2. The read or write operation on the type needs to be atomic. Of course, on all current and all possible future implementations this is true for the bool type (although it is not formally defined). But even this is not important. We check cancelled for only 2 values here: 0 (false) and everything else. So even if we assume that both the write and the read of cancelled are not atomic, after one thread writes non-zero to cancelled, another thread will sooner or later read the modified non-zero value from cancelled, even if it is not the same value the first thread wrote. For example, if cancelled = true were translated to mov cancelled, -1; mov cancelled, 1 (two non-atomic hardware operations), the second thread could read -1 instead of the final 1 (true) from cancelled. This plays no role if we check only for non-zero, because every other value breaks the loop while (!*cancelled);. Using atomic operations to write/read cancelled changes nothing here: memory is shared, so if one thread writes to memory (atomically or not), other threads will sooner or later see the modification.

3. We need to synchronize other reads/writes with cancelled, so we need a synchronization point between the 2 threads around cancelled with a memory order other than memory_order_relaxed. Consider, for example, the following code:

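// C++ code
#include <atomic>

void _do_task();

int result;
std::atomic_bool g_cancelled{false};

void do_task(std::atomic_bool *cancelled)
{
    do {
        _do_task();
    } while (!cancelled->load(std::memory_order_acquire));

    // the acquire load above pairs with the release store in some_proc,
    // so the write to result is visible here
    switch(result)
    {
    case 1:
        //...
        break;
    }
}

void some_proc()
{
    result = 1;
    g_cancelled.store(true, std::memory_order_release);
}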

So here we do not simply set g_cancelled to true: before that, we write some shared data (result) and want the other thread, once it sees the modification of g_cancelled, to also see the modification of the shared data (result). But I doubt that you actually use/need this scenario.
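For comparison, the first case in the list (several threads may request cancellation, but only the first one should clean up) can be written in C with `atomic_exchange`. This is a minimal sketch under assumptions: `request_cancel`, `free_resources` and the `cleanup_calls` counter are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool cancelled = false;
static int cleanup_calls;          /* hypothetical counter, touched only by the winning caller */

static void free_resources(void) {
    assert(cleanup_calls == 0);    /* must run at most once */
    ++cleanup_calls;
}

/* Returns true only for the caller that actually flipped the flag. */
bool request_cancel(void) {
    /* atomic_exchange stores true and returns the previous value as one
       read-modify-write, so exactly one caller observes false. */
    if (!atomic_exchange(&cancelled, true)) {
        free_resources();
        return true;
    }
    return false;
}
```

A plain or `volatile` bool cannot express this, because the test and the store would be two separate operations that another thread can interleave with.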

If none of these 3 things is needed, you do not need atomic here. What you really need is that one thread just writes true to cancelled and another thread reads the value of cancelled every time (instead of doing it once and caching the result). Usually, in most code, this happens automatically, but to be exact you need to declare cancelled as volatile.

If, however, you need exactly atomic (atomic_bool) for some reason, then, because you cross the language border here, you need to understand the concrete implementation of atomic_bool in both languages and whether it is the same (type declaration, operations (load, store, etc.)). In practice, atomic_bool is the same for C and C++.

Or (better), instead of making the atomic_bool type visible and shared, use interface functions like

bool is_canceled(void* cancelled);

so the code can be as follows
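One way to start checking whether the two implementations agree is to have the C side report how its `atomic_bool` is laid out, so that the C++ side can compare the values against `sizeof(std::atomic_bool)` and `alignof(std::atomic_bool)`. The helper names below are hypothetical, and matching size and alignment is only a necessary condition, not a guarantee given by either standard.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Sanity check: the atomic type must be able to hold a bool. */
static_assert(sizeof(atomic_bool) >= sizeof(bool),
              "atomic_bool is smaller than bool");

/* Hypothetical helpers exported by the C side for the C++ side to compare against. */
size_t c_atomic_bool_size(void)  { return sizeof(atomic_bool); }
size_t c_atomic_bool_align(void) { return alignof(atomic_bool); }
```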

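// C code
#include <stdbool.h>

void _do_task(void);

bool is_canceled(void *cancelled); // implemented on the C++ side

void do_task(void *cancelled)
{
    do {
        _do_task();
    } while (!is_canceled(cancelled));
}

// C++ code
#include <atomic>

extern "C" void do_task(void *cancelled);

std::atomic_bool g_cancelled{false}; // can be modified by another thread

// extern "C" gives the function C linkage so the C code can call it
extern "C" bool is_canceled(void *cancelled)
{
    return static_cast<std::atomic_bool *>(cancelled)->load();
}

void some_proc()
{
    //...
    g_cancelled = true;
}

do_task(&g_cancelled); // called from the worker thread

But again, I doubt that your task semantically needs atomic_bool. You need volatile bool.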

ALX23z · Answer 5 · 2018-12-22T21:44:32.403

Atomicity of operations is caused by hardware, not software (well, in C++ there are also "atomic" variables that are atomic in name only, those are implemented via mutexes and locks). So, basically, C++ atomics and C atomics do the very same thing. Hence as long as the types are compatible there won't be issues. And C++11 and C11 atomic classes were made to be compatible.

Apparently, people do not understand how atomics and locks work and require further explanation. Check out current memory models for more information.

1) We will start with basics. What and why are atomics? How does memory work?

Memory Model: think of processor as several independent cores and each has its own memory (caches L1, L2, and L3; and in fact, L3 cache is common but it isn't really important).

Why do we need atomic operation?

If you don't use atomics, then each processor might have its own version of the variable 'x' and they are in general not synchronized. There is no telling when they will perform synchronizations with RAM/L3 cache.

When atomic operations are used, such memory operations are used that ensure synchronization with RAM/L3 cache (or whatever is needed) - ensuring that different cores have access to the same variable and do not have a variety of different versions of it.

Nobody cares if it is C, C++, or whatever language you use - as long as one ensures memory synchronization (read, write, and modify) there will never be any issues.

2) OK, what about locks and mutexes?

Mutexes tend to work with the OS and have a queue that determines which thread should be allowed to run next. And they enforce stricter memory synchronization than atomics do. With atomics one can synchronize just the variable itself or more, depending on the request / which function you call.

3) Say I have atomic_bool, can it work interchangeably in different languages (C/C++11)?

Normally a boolean can be synchronized via memory operations (you're just synchronizing a single byte of memory from their perspective). If the compilers are aware that the hardware can perform such operations then they surely will use them as long as you use the standard.

Logical atomics (any std::atomic< T > with T having wrong size/alignment) are synchronized via locks. In this case it is unlikely that different languages can use them interchangeably - if they have different methods of usage of these locks, or for some reason one decided to use a lock and the other one came to conclusion that it can work with atomic hardware memory synchronizations... then there will be issues.

If you use atomic_bool on any modern machine with C/C++, it will surely be able to synchronize without locks.
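Whether `atomic_bool` is lock-free on a given target does not have to be guessed: C11 exposes it through the `ATOMIC_BOOL_LOCK_FREE` macro (0 = never, 1 = sometimes, 2 = always). A small sketch, with `bool_atomic_lock_free_level` as a hypothetical helper name; the `static_assert` deliberately refuses to compile on a target where a lock might be involved.

```c
#include <assert.h>
#include <stdatomic.h>

/* Fail the build if atomic_bool could fall back to a lock on this target. */
static_assert(ATOMIC_BOOL_LOCK_FREE == 2,
              "atomic_bool is not always lock-free on this target");

/* 0: never lock-free, 1: sometimes lock-free, 2: always lock-free */
int bool_atomic_lock_free_level(void) {
    return ATOMIC_BOOL_LOCK_FREE;
}
```

C++ defines a macro of the same name in `<atomic>`, so both sides of the language boundary can make the same check independently.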

"And C++11 and C11 atomic classes were made to be compatible" Can you cite any source for this claim? Moreover, std::atomic_bool is not guaranteed to be lock free as opposed to std::atomic_flag. — idmean, Dec 22 '18 at 19:12
@idmean atomic_bool is not guaranteed to be lock free by the standard, as it doesn't require it to be. Whether it is lock free or not depends on whether the hardware supports atomic operations. Locks are also in a sense atomics, as they force stricter memory synchronization, while atomics allow a much more relaxed memory sync. — ALX23z, Dec 22 '18 at 19:22
It reads like a rant by a newcomer who's mad that experts are telling them they're wrong. — R.. GitHub STOP HELPING ICE, Dec 23 '18 at 08:18
