atomic_ref when external underlying type is not aligned as requested

Question

I read on p0019r8 the following:

atomic_ref(T& obj);
Requires: The referenced object shall be aligned to required_alignment.

cppreference interprets this as UB when not aligned:

The behavior is undefined if obj is not aligned to required_alignment.

So how would you expect from an implementation to handle it?

And implementation can check compile-time alignof, but in reality a type might be aligned more that alignof. An implementation can interpret pointer bits and check runtime alignment, but it is extra runtime check.

Ultimately I see the following options:

Do nothing - implement runtime undefined behavior in unpleasant way, favor correct uses only
Check compile-time alignment (alignof) and emit warning if wrong
~~Check compile-time alignment (alignof) and fail at compile-time if wrong~~ not correct since actual alignment may be greater than visible by static type
~~Check compile-time alignment (alignof) and fail at run-time if wrong~~ not correct since actual alignment may be greater than visible by static type
Check compile-time alignment (alignof) and fallback to lock-based if wrong
Check run-time alignment (pointer bits) and fail at run-time if wrong
Check run-time alignment (pointer bits) and fallback to lock-based if wrong

You can't really expect *anything* from undefined behavior because its... undefined. So relying on it, particularly in this circumstance, is just not helpful. — Nicol Bolas, May 25 '20 at 05:08
I'm not asking what I, as a _user_, should do, I'm asking what should I do if I was an _implementer_ :-) — Alex Guteniev, May 25 '20 at 05:11
@AlexGuteniev As an implementer you can do _anyting you want_, when the behavior is undefined. I guess you do not understand this term fully. All of the mentioned options "implement" undefined behavior, not just the first one. — Daniel Langr, May 25 '20 at 05:46
Fair. By tthe first _undefined behavior_ I meant implementing _unpleasant behavior_ by not caring about this case at all. — Alex Guteniev, May 25 '20 at 05:51
As for your options, you basically cannot check alignment at compile time. A reference is likely implemented via pointers internally and, generally, you can't work with pointers at compile time. What if I have `int ptr;` and write `atomic_ref(*ptr)`? How could a compiler check the alignment? Checking at runtime would make sense only in some debug mode, otherwise, it would create a runtime overhead and nobody wants this when using atomics. — Daniel Langr, May 25 '20 at 06:01

Peter Cordes · Accepted Answer · 2020-05-25T06:39:48.323

5

TL:DR: never silently fall back to locking, nobody ever wants that because it defeats a major part of the purpose of std::atomic. Think of non-lock-free as a portability fallback, not a viable mode of operation.

Being UB makes it legal for the compiler to simply assume without any checking that it's aligned. Being able to assume without any runtime checks is one of the major benefits of the concept of UB. This is what most people want / expect at runtime in an optimized build, not bloating the code with conditional branches that might fall back to using a mutex.

The choice of whether (and how) to define any behaviour here is completely up to the implementation, as a matter of Quality-of-Implementation and trading off performance vs. debugging. I think you know that and are literally asking what users want compilers to pick for those QoI choices, which is fine.

As the P0019 proposal you linked says, it all comes down to a QOI issue:

Reference-ability Constraints

An object referenced by an atomic reference must satisfy possibly architecture-specific constraints. For example, the object might need to be properly aligned in memory or might not be allowed to reside in GPU register memory. We do not enumerate all potential constraints or specify behavior when these constraints are violated. It is a quality-of-implementation issue to generate appropriate information when constraints are violated.

The "generate appropriate information" phrasing implies they expect implementations to warn / error if they detect a violation, not fall back to locking.

Although an implementation that could fall back to locking could foolishly set required_alignment to the minimum for correctness (1), rather than the minimum for lock-freedom. Of course nobody wants that, but it's a QoI issue not standards-compliance.

I'd expect (or at least hope) for an implementation to work as follows:

Warn at compile time if atomic_ref is used on any object whose alignof is less than required_alignment. You might know that a certain T *p happens to be 8 byte aligned even though alignof(T) is only 1 or 4, so this shouldn't be an error.

Some local way of silencing the warning would be a good thing. (Alternative: promise alignment to the compiler with something like GNU C x = __builtin_assume_aligned(x, 16))

At least warn if an object is definitely known to be under-aligned at compile time, e.g. a sub-member of struct whose alignment is known, or a global var where the declaration is visible but didn't include alignas. Warning for access through pointers that might be under-aligned is noisier and should be separately disable-able.
Extra-slow Debug mode: runtime check of alignment, warn or abort on a specific object being under-aligned for atomicity. (e.g. gcc -fsanitize=undefined, or MSVC's debug mode which already adds stuff like std::vector::operator[] bounds checks. I think GCC's UBSan does even more checking than MSVC debug mode, e.g. for signed overflow; I think MSVC debug mode is somewhere in between gcc -O0 and gcc -O0 -fsanitize=undefined.)
"Release" mode: zero checking, just emit asm whose correctness depends on the object being aligned. (Also gcc -O0 without UBSan, which allows consistent debugging but doesn't add extra checks.)
Nobody ever wants silent fallback to mutexes at compile time or run-time. That mode of operation basically just exists so ISO C++ can require the feature to be supported everywhere without making it impossible to implement on some targets.

The fallback to locking is usually very sub-optimal compared to manual fine-grained locking of a critical section that does a few related atomic ops at once on a data-structure designed for it. People use atomic<T> (and the upcoming atomic_ref<T>) for performance, and much of that performance is destroyed by locking. Especially read-side scalability.

Footnote 1: IIRC, alignof() is only specified for types, not objects, but in GNU C++ it also works on objects. I'm using this as shorthand for the compiler's internal knowledge that a certain object used alignas() to over-align it.

edited May 25 '20 at 06:39

answered May 25 '20 at 06:07

Peter Cordes

328,167
45
605
847

Do you think it worth to distinguish _Extra-slow Debug mode_ and normal debug mode? Since `atomic_ref` behaves like a reference (internal pointer cannot be reassigned and in my current implementation _is_ actually a reference), all checks (compile-time warning and runtime check) are restricted to constructor. – Alex Guteniev May 25 '20 at 06:16
3

@AlexGuteniev: I think you'd normally construct a new `atomic_ref` every time you use it, or at least once per function call and reuse it in a loop. You don't want to store it anywhere other than a local temporary; that would add a useless level of indirection to the generated asm. (Constructing a local tmp `atomic_ref` should cost 0 asm instructions). So I'd expect the `atomic_ref` constructor to run very frequently, some significant fraction of the number of atomic operations done via `atomic_ref`. – Peter Cordes May 25 '20 at 06:18
@AlexGuteniev: IDK if MSVC has a good mechanism for enabling extra checks. In the GCC world, I'd expect runtime alignment checking to be part of UBSan `-fsanitize=undefined`, not on by default at `-O0`. But MSVC does do some extra stuff in debug builds like poisoning the stack frame in functions to help detect read-uninitialized, and std::vector bounds checking, so maybe it would be appropriate there. i.e. MSVC's regular debug mode is already an "extra slow debug mode". You do want some way of detecting misalignment bugs at runtime. – Peter Cordes May 25 '20 at 06:21
1

@AlexGuteniev Fully agree with Peter. I, personally, would expect that the constructor of `std::atomic_ref` is basically just NOOP. The main purpose of `atomic_ref` is to enable atomic memory operations for non-atomic types. Live demo: https://godbolt.org/z/YabuVt. Note that there are no instructions emitted for the constructor, just for atomic increment. – Daniel Langr May 25 '20 at 06:34
1

From my side, I agree with Peter on how MSVC debug mode is implemented. In particular, you don't get no-op in debug simply because constructor is not inlined. I don't expect this particular runtime check is too expensive, it is just a test against low bits and a branch, which would become predicted in a loop of correct use. So for MSVC implementation, I'd like here a macro such as `_ATOMIC_REF_ALIGNMENT_CHECK`, so that it is defaults in release as 0, and in usual debug mode as 1, but can be bypassed in debug by defining as 0. – Alex Guteniev May 25 '20 at 06:56
@AlexGuteniev: Yup, that sounds totally reasonable. Debug-mode code-gen is total garbage for performance in code using lots of small functions, including standard library templates; an extra check is not a dealbreaker. The main issue is choosing what to do when you detect a problem (probably why `gcc -O0` doesn't add any new checks); if there's some well established way to abort then do that. – Peter Cordes May 25 '20 at 07:08
1

@PeterCordes, an established practice is to use `_invalid_parameter`, so the user can set behavior with `_set_invalid_parameter_handler` / `_set_thread_local_invalid_parameter_handler`, and the default is to throw uncatchable exception. And to control the check with a dedicated macro, which has different values for debug and release, but can be redefined for both - see `_ITERATOR_DEBUG_LEVEL` as the most known example. – Alex Guteniev May 25 '20 at 09:09
I didn't find a reason to use `__builtin_assume_aligned`, though latest preview MSVC supports it. All accesses are via intrinsics anyway, so codegen is not affected. As an information to the compiler to issue warning on definitely misaligned pointers, it is pretty much ignored (not only by MSVC, by gcc as well) – Alex Guteniev May 26 '20 at 05:49
1

@AlexGuteniev: I wasn't suggesting the *implementation* of `atomic_ref` would want to use it. I meant that code *using* `atomic_ref` on a pointer deref might silence a warning by promising that the pointer was extra-aligned, e.g. a `uint64_t*` when compiling for 32-bit mode. I haven't really thought through what would be too noisy or not by default, or what is even practically possible for a header to check, depending on what compiler support is available for querying known alignment. – Peter Cordes May 26 '20 at 05:51
So you expect `__builtin_assume_alignment` to _produce_ assumption on output. But how would you see this assumption to be _consumed_? As `__builtin_assume_aligned`, as well as `std::assume_aligned` based on it does not put any `alignas` on type, it should be consumed by some builtin as well. My understanding that `__builtin_assume_alignment` can also _consume (check if possible)_ assumption on input. So, `__builtin_assume_alignment` is both outside promise and inside check. This does not work, though: https://godbolt.org/z/ST3LCz – Alex Guteniev May 26 '20 at 06:59
@AlexGuteniev no idea what the point of your Godbolt link is; you promise alignment then break that promise, creating UB. Were you hoping the compiler would warn for that? I had been hoping that `alignof(*p)` or `__alignof__(*p)` would reflect the `__builtin_assume_align` promised alignment, letting you detect something the compiler knew at compile time. Since that's not the case ( https://godbolt.org/z/tcFh9f), you'd need to implement such a check inside GCC (or MSVC), not with C++ in a header, unless there's other syntax available. – Peter Cordes May 26 '20 at 07:12
GCC documents `__builtin_has_attribute` (https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html has an example showing `__builtin_has_attribute (x, aligned (4)` as a static_assert operand), but that example syntax doesn't compile when I tried it with GCC trunk on Godbolt. – Peter Cordes May 26 '20 at 07:13
My point was exactly about `__builtin_assume_align` not producing usual `aligned` attribute. It probably produces some sort of _shadow_ aligned attribute (visible for compiler optimization, but not visible by C++ type system; similar to the way like compile-time constants propagate far beyond mandated by type system). So, there's no way of checking it with `__builtin_has_attribute`, but `__builtin_assume_align` itself could have checked if the promise makes sense, and it could have included other `__builtin_assume_align` promises into consideration. – Alex Guteniev May 26 '20 at 07:25
@AlexGuteniev: You can `typedef int aligned_int __attribute__((aligned(8)))` if you want, and cast to a pointer to that, instead of back to plain `int*`. (Although I think that gives it a sizeof of 8, so an `aligned_int[]` array would have the guaranteed alignment for every member. This *is* what you want if you're promising that a 3-byte struct is aligned by 4 or something. Or there's no problem for promising that `long long` is naturally aligned.) – Peter Cordes May 26 '20 at 07:32

atomic_ref when external underlying type is not aligned as requested

1 Answers1

Linked