Why isn't atomic double fully implemented

Question

My question is quite simple. Why isn't std::atomic<double> implemented completely? I know it has to do with atomic RMW (read-modify-write) access, but I really don't see why this shouldn't be possible on a double.

It's specified that any trivially copyable type can be used. And of course double is among them. So C++11 requires the basic operations (load, store, CAS, exchange, etc.) that you can use with any class type.

However, on integers an extra set of operations is possible (fetch_add, ++, +=, etc).

A double differs very little from these types. It's native, trivially copyable, etc. Why didn't the standard include the double with these types?

Update: C++20 does specialize std::atomic<T> for floating-point types, with fetch_add and fetch_sub (see "C++20 std::atomic<float> - std::atomic<double> specializations"), but not atomic absolute-value (AND) or negate (XOR).

Editor's note: Without C++20 you can roll your own out of CAS; see "Atomic double floating point or SSE/AVX vector load/store on x86_64" for portable examples. atomic<double> and atomic<float> are lock-free on most C++ implementations.
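As a minimal sketch of the "roll your own out of CAS" idea in the editor's note: the helper name `fetch_add_cas` is hypothetical (it is not a standard library function), and the loop relies only on `load` and `compare_exchange_weak`, which C++11 provides for any `std::atomic<T>`.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of the standard library): emulates
// fetch_add for std::atomic<double> with a compare-exchange loop.
// Returns the value held immediately before the addition, mirroring
// what the integer fetch_add returns.
inline double fetch_add_cas(std::atomic<double>& target, double delta,
                            std::memory_order order = std::memory_order_seq_cst)
{
    double expected = target.load(std::memory_order_relaxed);
    // On failure, compare_exchange_weak overwrites `expected` with the
    // current value, so the sum is recomputed on every retry.
    while (!target.compare_exchange_weak(expected, expected + delta,
                                         order, std::memory_order_relaxed)) {
    }
    return expected;
}

// Usage sketch: 1.5 + 2.25 is exactly representable, so == is safe here.
inline void fetch_add_cas_example()
{
    std::atomic<double> total{1.5};
    double before = fetch_add_cas(total, 2.25);
    assert(before == 1.5);
    assert(total.load() == 3.75);
}
```

Under contention the loop may retry several times, which is the inefficiency the answers below allude to when hardware has no native floating-point atomic add.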

I'd guess the reason is that most CPUs don't support atomic `double` operations. So how would you implement it? — Angew is no longer proud of SO, May 05 '15 at 09:08
Is `double` even trivially copyable in the strictest sense seeing how its memory storage is not identical with register storage on most current CPUs? But even if it is, does it make sense to perform atomic operations on floating point data at all? You could probably say _"sure, why not?"_, but it's hard for me to imagine why one would want to do any such thing. Integers and pointers and double pointers type-punned as large integers, sure. But floating point data? — Damon, May 05 '15 at 09:58
@Damon Well this might sound naive, but if i have a double value inside a class. And i want to write it on 1 thread and read it on another. Atomic, to me seems the way to go. — laurisvr, May 05 '15 at 10:00
I do that kind of thing by submitting a "task" (which is really just a struct of a function pointer and a void pointer) to a queue. The other thread pulls tasks from the queue and invokes the function pointer. The void pointer points to "whatever data", presumably an input and output buffer, or in your case one or several `double` values, whichever is needed for "task". Once done, the worker thread posts the pointer to data onto the "results" queue from which the main thread can pull them. — Damon, May 05 '15 at 10:05
@Damon That sounds like a custom implementation of the std::future, and std::promise framework right? In my particular case I have a double value on the worker thread that changes every once so often. And i want to cout this on the main thread. It's not really necessary to always have the very latest value. And std::atomic has the least overhead for this. If I'm not mistaken? — laurisvr, May 05 '15 at 10:09
In that particular case, `std::atomic` is the most desirable thing, yes. But expect it to be implemented with a mutex, I'm pretty sure it's not lockfree on any mainstream architecture (this would very much surprise me). I wonder whether it might be worth punning the `double` into an `int64_t` and back. Not precisely the nicest thing to do, but this _will be_ lockfree on (almost) every platform, and the cost to convert between floating point and integer is probably more or less equivalent to a mutex on fast path (spinning) and a few dozen times faster when congested (syscall). — Damon, May 05 '15 at 10:14
@Damon Well, `std::atomic_is_lock_free` actually says atomic double is lock free for the msvc2013 compiler:). So at least in this particular case it seems to be lock free:). — laurisvr, May 05 '15 at 11:18
@Damon: The standard doesn't care about registers. "Trivially copyable" is just about copying from one memory location to another using memcpy (simply put). The architecture would have to be very strange if it didn't support that. — Arne Vogel, May 05 '15 at 14:32
@Damon: You seem to be talking about obsolete x87 with 80-bit registers. Even if you are compiling without SSE2, `fld qword` / `fstp qword` is a copy for `double`. Converting to 80-bit internal format and back will never change the result, because every `double` can be exactly represented in that format. ([denormals are normalized when loading](https://software.intel.com/en-us/articles/x87-and-sse-floating-point-assists-in-ia-32-flush-to-zero-ftz-and-denormals-are-zero-daz), but the extra exponent range always allows it to exactly represent the value). — Peter Cordes, Sep 04 '17 at 06:32
x87 doesn't have flush-to-zero or denormals-are-zero settings, and rounding mode doesn't come into play (because every `double` can be exactly represented). The only thing that could munge a `double` is [if the x87 precision mode was set to 24-bit (`float`) mantissa](https://randomascii.wordpress.com/2012/03/21/intermediate-floating-point-precision/). Fun fact, gcc uses `fild`/`fistp` to implement `std::atomic` load/store on 32-bit x86. (from/to integer avoids raising FP exceptions). `fld` can raise exceptions if they're unmasked. — Peter Cordes, Sep 04 '17 at 06:38
But anyway, SSE2 is baseline for 64-bit, and a lot of 32-bit software is built with SSE2 enabled. In that case, there's absolutely no weirdness. Either way, `std::atomic<double>` is lock-free on gcc/clang/msvc. https://stackoverflow.com/questions/45055402/atomic-double-floating-point-or-sse-avx-vector-load-store-on-x86-64 — Peter Cordes, Sep 04 '17 at 06:39
Related: [Atomic double floating point or SSE/AVX vector load/store on x86\_64](//stackoverflow.com/q/45055402) — Peter Cordes, Nov 19 '19 at 21:50
@curiousguy: "trivially copyable", not "copiable" was the correct spelling. I only mention this in case you were going to edit anything else to make the same change. The rest of your edit looks reasonable. "interlocked" was asking about the underlying implementation being possible using x86 Windows terminology which is sort of ok, but sure, HW support for atomic RMW is the same thing. — Peter Cordes, Nov 20 '19 at 23:40
@PeterCordes Sorry for the spelling copyable which didn't spell checked (I added it to perso dict now!). [tag:interlocked] has only 1 reference: "Interlocked Class on MSDN". — curiousguy, Nov 20 '19 at 23:46
@curiousguy: see https://learn.microsoft.com/en-us/windows/win32/api/winnt/nf-winnt-interlockedexchange and friends, like `InterlockedXor64` — Peter Cordes, Nov 20 '19 at 23:59
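The use case raised in the comments above (a worker thread updates a double now and then, the main thread prints whatever value is current) could be sketched roughly as follows. The class name `LatestValue` and its members are hypothetical. Relaxed ordering is an assumption that holds only if the double itself is the sole data shared through this object.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical wrapper for the "one thread writes, another reads" case.
class LatestValue {
public:
    explicit LatestValue(double initial = 0.0) : value_(initial) {}

    // Called by the worker thread whenever the value changes.
    void publish(double v) { value_.store(v, std::memory_order_relaxed); }

    // Called by the reading thread; may observe a slightly stale value,
    // but never a torn (half-written) one.
    double read() const { return value_.load(std::memory_order_relaxed); }

    // Reports whether this implementation avoids a lock for double.
    bool lock_free() const { return value_.is_lock_free(); }

private:
    std::atomic<double> value_;
};

// Single-threaded usage sketch; in real code publish() and read()
// would be called from different threads.
inline void latest_value_example()
{
    LatestValue latest(1.0);
    assert(latest.read() == 1.0);
    latest.publish(2.5);
    assert(latest.read() == 2.5);
}
```

Calling `lock_free()` at startup is one way to check the claim, made later in the thread, that mainstream compilers do not fall back to a mutex here.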

score 23 · Accepted Answer · answered May 05 '15 at 10:25

std::atomic<double> is supported in the sense that you can create one in your program and it will work under the rules of C++11. You can perform loads and stores with it and do compare-exchange and the like.
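A short sketch of the operations mentioned above, using only what the generic `std::atomic<T>` template provides in C++11. The function name `basic_ops_demo` is made up for illustration, and the values are chosen to be exactly representable so the equality checks are safe.

```cpp
#include <atomic>
#include <cassert>

// Exercises load, store, exchange and compare-exchange on a double.
// None of these operations is specific to integer types.
inline double basic_ops_demo()
{
    std::atomic<double> a{1.0};

    a.store(2.0);                       // plain atomic store
    double previous = a.exchange(3.0);  // a now holds 3.0
    assert(previous == 2.0);

    double expected = 3.0;
    // Succeeds because a currently holds 3.0, so a becomes 4.0.
    bool swapped = a.compare_exchange_strong(expected, 4.0);
    assert(swapped);

    double wrong_guess = 10.0;
    // Fails because a holds 4.0; wrong_guess is updated to 4.0 instead.
    bool swapped_again = a.compare_exchange_strong(wrong_guess, 5.0);
    assert(!swapped_again);
    assert(wrong_guess == 4.0);

    return a.load();
}
```

Note that compare-exchange compares object representations rather than using `operator==`, so `+0.0` and `-0.0` are treated as different values.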

The standard (before C++20) specifies that arithmetic and bitwise operations (fetch_add, ++, +=, &=, etc.) are only provided for atomics of "integral types", so a std::atomic<double> won't have any of those operations defined.

My understanding is that hardware in use today has little support for fetch-add or any other atomic arithmetic on floating-point types, so the C++ standard doesn't provide these operations for them: they would have to be implemented inefficiently.

(Edit) As an aside, std::atomic<double> in VS2015RC is lock-free.

Btw, it looks like std::atomic<double> is now (C++20) required to have some support for `fetch_add`/`fetch_sub`. — Dan M., Apr 19 '18 at 18:20
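A hedged sketch of what the C++20 support looks like in use. The function name `add_sample` is hypothetical. The feature-test macro `__cpp_lib_atomic_float` is the standard one for this feature, and the fallback branch is an assumption about how one might stay compatible with older toolchains.

```cpp
#include <atomic>
#include <cassert>

// Adds `sample` to `sum` atomically and returns the previous value.
inline double add_sample(std::atomic<double>& sum, double sample)
{
#if defined(__cpp_lib_atomic_float)
    // C++20 library support: fetch_add is a member of the
    // floating-point specializations.
    return sum.fetch_add(sample, std::memory_order_relaxed);
#else
    // Older standard or library: equivalent compare-exchange loop.
    double old = sum.load(std::memory_order_relaxed);
    while (!sum.compare_exchange_weak(old, old + sample,
                                      std::memory_order_relaxed)) {
    }
    return old;
#endif
}

// Usage sketch with exactly representable values.
inline void add_sample_example()
{
    std::atomic<double> sum{0.5};
    double before = add_sample(sum, 0.25);
    assert(before == 0.5);
    assert(sum.load() == 0.75);
}
```

Even where `fetch_add` is available, an implementation may still compile it to a compare-exchange loop on hardware without a native floating-point atomic add.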

score 9 · Answer 2 · Richard Hodges · 2019-03-09T10:27:57.833

The standard library mandates std::atomic<T> where T is any TriviallyCopyable type. Since double is TriviallyCopyable, std::atomic<double> should compile and work perfectly well.

If it does not, you have a faulty library.

Edit, following the comment clarifying the question:

The C++ standard specifies additional specialisations for the fundamental integral types (i.e. the integer types that are required to be present in the language). These specialisations have requirements beyond the general case of atomic, in that they must support:

fetch_add
fetch_sub
fetch_and
fetch_or
fetch_xor
operator++
operator--
comparison and assignment operators

OR, XOR, AND are of course not relevant for floating types and indeed even comparisons start to become tricky (because of the need to handle the epsilon). So it seems unreasonable to mandate that library maintainers make available specific specialisations when there is no case to support the demand.

There is of course nothing to prevent a library maintainer from providing this specialisation in the unlikely event that a given architecture supports the atomic exclusive-or of two doubles (it never will!).

edited Mar 09 '19 at 10:27 · answered May 05 '15 at 09:15 · Richard Hodges

I should have been clearer, and I'll expand my question a bit. On the page I linked to in the question, a number of types are listed as having a complete specialization of std::atomic<>. Double isn't among them. – laurisvr May 05 '15 at 09:34
Atomic AND is relevant for `fabs()` (clearing the sign bit). XOR/OR can also usefully mess with the sign bit. Fun fact: a C++ standards proposal (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0020r5.html) is planning to add `fetch_add` to atomic floating-point types, since apparently some hardware supports it efficiently. See also [Atomic double floating point or SSE/AVX vector load/store on x86_64](https://stackoverflow.com/questions/45055402/atomic-double-floating-point-or-sse-avx-vector-load-store-on-x86-64) for an asm perspective, and some inefficient compiler output :( – Peter Cordes Sep 04 '17 at 06:43
Interesting. Thanks @PeterCordes – Richard Hodges Sep 04 '17 at 07:20
The link above is dead. – phuclv Mar 09 '19 at 08:30
@phuclv yes I see. I'll remove it – Richard Hodges Mar 09 '19 at 10:27
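The sign-bit point made in the comments above (atomic AND for `fabs()`, XOR for negation) can be sketched by storing the bit pattern of the double in a `std::atomic<std::uint64_t>`, along the lines of the type-punning suggested earlier in the thread. All names here are hypothetical, and the code assumes a 64-bit IEEE 754 `double` whose sign is the most significant bit and whose byte order matches that of `std::uint64_t`.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// memcpy-based conversions avoid undefined behaviour from pointer casts.
inline std::uint64_t to_bits(double d)
{
    std::uint64_t u;
    std::memcpy(&u, &d, sizeof u);
    return u;
}

inline double from_bits(std::uint64_t u)
{
    double d;
    std::memcpy(&d, &u, sizeof d);
    return d;
}

// Assumption: IEEE 754 binary64, sign in bit 63.
constexpr std::uint64_t kSignBit = std::uint64_t{1} << 63;

// Atomically replaces the stored value with its absolute value and
// returns the value held before the operation.
inline double atomic_fabs(std::atomic<std::uint64_t>& bits)
{
    return from_bits(bits.fetch_and(~kSignBit));
}

// Atomically flips the sign and returns the value held before.
inline double atomic_negate(std::atomic<std::uint64_t>& bits)
{
    return from_bits(bits.fetch_xor(kSignBit));
}

inline void sign_bit_example()
{
    std::atomic<std::uint64_t> bits{to_bits(-2.5)};
    assert(atomic_fabs(bits) == -2.5);          // previous value
    assert(from_bits(bits.load()) == 2.5);      // sign cleared
    assert(atomic_negate(bits) == 2.5);         // previous value
    assert(from_bits(bits.load()) == -2.5);     // sign flipped
}
```

The cost of this design is that every read and write has to go through `to_bits` and `from_bits`, since the atomic object no longer has type `double`.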
