7

I have a struct which contains two data members:

struct A{
    short b;
    short c;
};

and a vector containing many of these struct objects:

std::vector<A> vec;

at runtime the data members A.b and A.c from struct objects are set to zero/a value x. However, b and c must be modified at the same time- one cannot be updated separately without the other. I was planning on using an atomic compare_exchange_weak() to do the update.

I am not sure whether I should represent each struct as an std::atomic<A> in the vector, or whether I should place a union inside the struct, to combine the two shorts in to a single uint32_t and then modify this:

union A {
    struct {
         short b;
         short c;
    };

    uint32_t d;
};

What would be the best solution? Should I store a vector of:

std::vector<uint32_t>

and upon accessing each element, reinterpret_cast to A, to obtain d?

I would like the locking to be as least-intrusive as possible.

Portability is not required, this will be on Linux, 64-bit, x86, GCC 4.8+ compiler

user997112
  • 29,025
  • 43
  • 182
  • 361
  • I would declare `b` and `c` as private, and write a single function that change both of'em. – Quest Dec 02 '14 at 11:15
  • You can't really use `std::atomic<>` inside containers (atomics are not copyable or movable). If the number of elements in the vector is fixed and known in advance you can construct the vector with the correct number of elements, as long as you never try to insert/erase elements – Jonathan Wakely Dec 02 '14 at 11:15
  • Why do a `reinterpret_cast` from `uint32_t` (which would be undefined behaviour) instead of storing the union type in the vector and reading/writing the members directly? That relies on type punning, but that's supported by many compilers. That wouldn't make the updates atomic though, you need to use atomic operations for that. – Jonathan Wakely Dec 02 '14 at 11:17
  • Are `struct`s defined without padding if they are part of a `union`? Otherwise your proposal looks pretty risky ... – filmor Dec 02 '14 at 11:17
  • 3
    Personally I'd merge `b` and `c` into a single atomic integral type of the correct size. Use bitwise operations to set various parts of that type. Your union will not be portable due to structure packing. – Bathsheba Dec 02 '14 at 11:18
  • @Bathsheba could I not align the two 16 bit members to 2-byte boundaries? – user997112 Dec 02 '14 at 11:33
  • Not in a portable way, no. If you're sacrificing portability then you may as well resort to assembly and use a bus lock. – Bathsheba Dec 02 '14 at 11:42
  • @filmor, there is not allowed to be padding at the start of a standard layout struct, and there is very unlikely to be padding between two shorts on any common platform. A `static_assert` could be used to check that. – Jonathan Wakely Dec 02 '14 at 11:47
  • What happens when the `vector` resizes? It will do a copy, and no atomicity is guaranteed. It is almost certain that you do not **always** need the guarantee you ask for: when do you need that guarantee? Who else is accessing the data, and how? – Yakk - Adam Nevraumont Dec 02 '14 at 18:45

4 Answers4

4

Unless the hardware you are targetting supports double compare-and-swap (which is probably not the case), I think you only have two portable solutions:

  1. Introduce a higher-level lock (mutex or spinlock depending on your preference) and carry all operations on b and c within the scope of the acquired lock. A mutex is heavy, but std::atomic_flag is lock-free and very light-weight even in high-contention situations.

  2. Merge both members into a single std::atomic<int> and split that int into shorts through bit masking. Note that this requires sizeof(int) >= 2 * sizeof(short). Use fixed-size integer types if you need to enforce that.

To determine which solution is the fastest, benchmarks, of course.

If you know the number of struct A you will need at compile time, I'd suggest putting them into an std::array. If you don't, std::vector is fine as long as this number stays constant throughout the lifetime of the vector. Otherwise, since std::atomic<T> is neither copyable nor movable, you will have to write your own copy/move constructor for struct A.

user703016
  • 37,307
  • 8
  • 87
  • 112
  • I should have said- portability is not required. – user997112 Dec 02 '14 at 11:59
  • CPP Reference even has an example of a [spinlock](http://en.cppreference.com/w/cpp/atomic/atomic_flag) implemented using `atomic_flag`. – icabod Dec 02 '14 at 12:01
  • @Park what if I put struct A in a raw C-style array? – user997112 Dec 02 '14 at 12:39
  • If you know the number of elements at compile time, I'd suggest using `std::array` instead. If you don't, `std::vector` is fine as long as this number stays constant throughout the lifetime of the vector. Otherwise, you have to write a copy or move constructor. – user703016 Dec 02 '14 at 12:53
2

I recommend wrapping the variables in a class with a getter and setter guarded by a mutex, and make the variables private.

Using an union could cause unforeseen functionality based on machine architecture and compiler flags.

EDIT Results of running a simple program that stores values of the given struct type (Linux 32bit, x86):

  • Simple store (no protection at all) -> ~4000 us
  • Mutex guarded store -> ~12000 us
  • Using an union with an atomic aggregation field -> ~21000 us
Bogdan V.
  • 719
  • 4
  • 7
  • Mutex is WAY too heavy. This needs to be very fast :) – user997112 Dec 02 '14 at 11:37
  • Did you test it in your setup for impact? A simple test (1000.000 stores on a variable of the given type) i got: -simple store (aka. no protection) -> ~4000 us -mutex guarded store -> ~12000 us -atomic store using an union -> ~21000 us tested on a linux 32 bit, x86 – Bogdan V. Dec 02 '14 at 13:27
1

Simply make a union of a large enough atomic type. This is what I use (the code snippet is not perfectly portable, using <cstdint> types instead of short and int would surely be preferrable -- but it's good enough for me as it is), and it works perfectly fine and reliably since... practically forever:

union A {
    struct {
         short b;
         short c;
    };

    std::atomic<int> d;
};

(In fact, my implementation is slightly more complicated: I'm wrapping the whole thing into another struct out of habit, so A is a struct containing a union rather than being a union. Traditionally union had weird constraints about constructors, my initial implementation predates C++0x, and my A needs a constructor. But of course using C++11's <atomic> these considerations become alltogether obsolete, since those artificial constraints no longer exist)

Note that std::atomic may be lock-free but is not guaranteed to be (except for bool). In practice, for anything the size of int or short, it is lock-free on every "serious, no-joke" architecture, and on most modern architectures it's lock-free for something of pointer size, too (though there exist exceptions, notably the very first generation of x86_64 chips from AMD).

Damon
  • 67,688
  • 20
  • 135
  • 185
  • I should have said- portability is not required. – user997112 Dec 02 '14 at 11:58
  • @user: If you know for sure what size `int` and `short` are on your platform, and you don't plan to port the code to another one, then that's perfectly fine (that's why I'm using `int`, not `int32_t`, too). But even if it's a consideration or requirement, the changes are minuscule :-) – Damon Dec 02 '14 at 12:01
  • Yes, you would read the value of e.g. `vec[i].d` (or `iter->d`) into a temporary ("`expected`") and then do something like `vec[i].d.compare_exchange_weak(expected, desired);`. You can also use the "strong" version, but it really has not much of a benefit. Since it can fail either way, you must check the result and loop in any case, so although the "weak" version can spuriously fail, there isn't really much of a difference (but "weak" may be faster). Also note that you **must** restore `expected` if the op fails (it will be overwritten!). Forgetting to do so is a common source of pain. – Damon Dec 02 '14 at 14:40
  • I'm pretty sure `std::atomic` is lock-free on baseline x86-64. Are you thinking of the fact that CMPXCHG16B isn't baseline (because it was missing from K8)? You don't need it to compare_exchange_weak a *single* pointer; 64-bit CMPXCHG works fine for that, and compilers emit that even without `-mcx16`. (See [this answer](http://stackoverflow.com/questions/38984153/implement-aba-counter-with-c11-cas/38991835#38991835) for a similar union trick for separately atomic or as-a-whole atomic access to a pair of pointers (or pointer+flag). e.g. `atomic b,c`. – Peter Cordes Oct 26 '16 at 04:48
  • Also note that unlike C, in C++ it's not guaranteed to be portable to write one union member and then read another, so this is only portable to compilers where that's ok (e.g. any flavour of GNU C, and probably many others). – Peter Cordes Oct 26 '16 at 04:51
  • @PeterCordes: Saying "not guaranteed" is indeed very forgiving. A program doing that invokes undefined behavior (with the common prefix exception, which however most probably doesn't apply here). Nevertheless, this is a pragmatic solution that "works fine" under the obvious assumptions which are made here, such as two `short`s have no extra padding between them, and two `short`s are the size of one `int`. Generally, for unions of only POD or "mostly POD" types such as an atomic integer, this kind of UB "just works". It would obviously crash and burn given a type that allocates resources. – Damon Oct 26 '16 at 09:33
  • @Damon: The potential problems I was talking about are the same kind you can get from violating strict aliasing (e.g. by reading the second `short` with `(reinterpret_cast *>(&d)+1)->load()`. The program may appear to violate causality, and a store to one of the `short`s might happen eventually, but not be reflected in a read of the `atomic` right after. Obviously the layouts have to overlap the way you expect, but that's not all you need to guarantee that the type-punning through the union works. – Peter Cordes Oct 26 '16 at 10:10
0

Since portability is not an issue, the locking is only as intrusive as you'd like it to be (taking "intrusive" to mean CAS loop or excessive locking).

In your case, you could use the atomic builtins directly.

AFAIK, you should be able to mix word sizes to operate directly on b and c, as well as a. I've never done this, so I can't say if it would work reliably on all things x86. As always: Prototype and test, test, test!

defube
  • 2,395
  • 1
  • 22
  • 34
  • `std::atomic` is not copyable so it will be difficult to put `A` in a `vector` – Jonathan Wakely Dec 02 '14 at 11:45
  • You can copy an `atomic` using a load into its constructor. If you need to do a "batch copy" of one vector to another, you can use `reinterpret_cast`, just as long as you can guarantee the writes get flushed before any atomic operations are used again. – defube Dec 02 '14 at 11:51
  • Just an update guys- portability is not required. See bottom of Q. – user997112 Dec 02 '14 at 12:00
  • The builtins you link to are the "legacy" ones, they've been superseded by https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html – Jonathan Wakely Dec 02 '14 at 12:14