
I am implementing my own std::function. For the small object optimization I need a storage class that may hold the object locally. As far as I know, the strict aliasing rule allows an object of any type to be stored in a byte array, but prohibits reading a value of that type back out of the array unless std::launder is used. std::launder was introduced in C++17, but I want my code to work in C++14 and even C++11. So I read the GCC and Clang implementations. GCC defines a union _Any_data as the storage, which has the may_alias attribute, so I assume it is allowed to violate strict aliasing. Clang uses an additional pointer to store the result of placement new, like this:

if (sizeof(_Fun) <= sizeof(__buf_) &&
    is_nothrow_copy_constructible<_Fp>::value &&
    is_nothrow_copy_constructible<_FunAlloc>::value)
{
    __f_ = ::new ((void*)&__buf_) _Fun(_VSTD::move(__f), _Alloc(__af));
}
else
{
    typedef __allocator_destructor<_FunAlloc> _Dp;
    unique_ptr<__func, _Dp> __hold(__af.allocate(1), _Dp(__af, 1));
    ::new ((void*)__hold.get()) _Fun(_VSTD::move(__f), _Alloc(__a));
    __f_ = __hold.release();
}

I have searched many questions and answers on SO. Answers such as https://stackoverflow.com/a/41625067/11139119 say it is legal to use the pointer returned by placement new. But that costs another 8 bytes to store a pointer, which is somewhat wasteful, especially since GCC's std::function is only 32 bytes in total. So my question is: in C++14 and earlier, how can I avoid violating strict aliasing without spending 8 bytes on an extra pointer?
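For reference, the extra-pointer approach described above can be sketched roughly as follows. This is only a minimal illustration, not the libc++ code: the names `InlineSlot` and `kSize` are hypothetical, and it handles a single known type rather than a type-erased callable.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical storage that remembers the pointer returned by placement new.
// The names InlineSlot and kSize are illustrative only.
template <class Tp, std::size_t kSize = 3 * sizeof(void*)>
class InlineSlot {
  static_assert(sizeof(Tp) <= kSize, "Tp must fit in the inline buffer");

 public:
  InlineSlot() : obj_(nullptr) {}
  InlineSlot(const InlineSlot&) = delete;
  InlineSlot& operator=(const InlineSlot&) = delete;
  ~InlineSlot() { reset(); }

  template <class... Args>
  Tp* emplace(Args&&... args) {
    reset();
    // Placement new returns a pointer to the new object, so no cast from
    // the buffer type is ever needed afterwards.
    obj_ = ::new (static_cast<void*>(buf_)) Tp(std::forward<Args>(args)...);
    // Same address as the buffer, but obtained without reinterpret_cast.
    assert(static_cast<void*>(obj_) == static_cast<void*>(buf_));
    return obj_;
  }

  Tp* get() noexcept { return obj_; }

  void reset() noexcept {
    if (obj_) {
      obj_->~Tp();
      obj_ = nullptr;
    }
  }

 private:
  alignas(Tp) unsigned char buf_[kSize];
  Tp* obj_;  // the extra pointer-sized member the question wants to avoid
};
```

Here `get()` never casts the buffer, which is why this form is usually considered safe in every standard version; the price is the `obj_` member.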

The following code is part of my small object optimization storage. I have no idea how to implement get() without incurring undefined behavior before C++17.

class StoragePool {
 public:
  StoragePool() = default;

  template <class Tp, class... Args, std::enable_if_t<soo::IsSmallObject<Tp>::value, int> = 0>
  void emplace(Args&&... args) noexcept(noexcept(Tp(std::forward<Args>(args)...))) {
    new (mem_) Tp(std::forward<Args>(args)...);
  }

  template <class Tp, class... Args, std::enable_if_t<!soo::IsSmallObject<Tp>::value, int> = 0>
  void emplace(Args&&... args) {
    auto tmp = new (mem_) Tp*;
    *tmp = new Tp(std::forward<Args>(args)...);
  }

  template <class Tp, std::enable_if_t<soo::IsSmallObject<Tp>::value, int> = 0>
  Tp* get() noexcept {
#if !(__cplusplus < 201703L)
    return std::launder(reinterpret_cast<Tp*>(mem_));
#else
    // ...
#endif
  }

  template <class Tp, std::enable_if_t<!soo::IsSmallObject<Tp>::value, int> = 0>
  Tp* get() noexcept {
#if !(__cplusplus < 201703L)
    return *std::launder(reinterpret_cast<Tp**>(mem_));
#else
    // ...
#endif
  }

  // other functions are omitted because they are unimportant in this question.

 private:
  alignas(soo::small_object_alignment) unsigned char mem_[soo::small_object_size];
};
youtao guo
  • `std::launder` doesn't do anything here. `std::launder` is not a magic wand that makes any `reinterpret_cast` valid. – bolov Feb 01 '23 at 20:16
  • Note that `std::launder` does not let you bypass the strict aliasing rule. The strict aliasing rule prohibits accessing an object through a glvalue that doesn't match the object's type. `std::launder` is used when you already have a pointer of the correct type, but you need to convince the compiler that it actually points to the object that you're trying to access. – Brian Bi Feb 02 '23 at 00:43
  • @BrianBi That makes me more confused. https://en.cppreference.com/w/cpp/utility/launder says I should use `std::launder` to obtain an object created by placement new from a pointer to the object that provides the storage. If `std::launder` cannot bypass strict aliasing, is it still UB if I access the object through the pointer returned by `std::launder`? – youtao guo Feb 02 '23 at 01:31
  • @bolov Can you explain in more detail? I found that the example for `std::aligned_storage` in https://en.cppreference.com/w/cpp/types/aligned_storage is very similar to what I need to implement, and it uses `std::launder`. If `std::launder` does nothing, why do we still need it there? – youtao guo Feb 02 '23 at 01:36
  • If you've `reinterpret_cast`ed the pointer to the correct type `T*` (and there is an object of type `T` that you are trying to access) then you are not violating the strict aliasing rule by dereferencing that pointer, because the types match. However `std::launder` is still necessary under some conditions, as the article explains. – Brian Bi Feb 02 '23 at 13:38
  • I might be wrong. `std::launder` deals with fairly complicated standard rules. – bolov Feb 02 '23 at 14:11
  • @bolov It isn't "complicated": by a strict reading of the std, it's clear that `std::launder` does nothing and is useless on regular arch/CPU. (But compiler writers probably don't have a strict reading of the std.) So you have to make up rules of the std that aren't written anywhere but were intended, like with threads and atomics (which aren't explained in the C/C++ std). – curiousguy Mar 02 '23 at 22:29
  • @curiousguy `std::launder` is not about specific architectures. It's for the compiler and what it can assume. Its role is to tell the compiler that certain read/writes aren't UB. This is important because otherwise the compiler would assume UB and wouldn't be required to honor those read/writes (or it could make demons fly). And has nothing or almost nothing to do with the underlying architecture. – bolov Mar 03 '23 at 01:10
  • @bolov Are you claiming that `std::launder` is not an identity function on common (read: all useful) CPU arch? If it's the identity, it can't perform a function, it's a useless function. If you disagree you must assert that there is something non physical with the value returned, and being a ptr, **that ptrs aren't PODs/trivial types** in C and C++. Which they aren't really in practice in many compilers, they are something that was never clearly defined, but then you are at odds with the C++ std. – curiousguy Mar 03 '23 at 02:39
  • @curiousguy Yes you can call `std::launder` an identity function, it does translate indeed into a nop in the final executable (I think on all architectures). That doesn't mean it's not useful. Its purpose is to add instruction to the compiler on how to interpret further code accessing the value returned by it. It's like an annotation if you will for the compiler. – bolov Mar 03 '23 at 13:19
  • The same way [`std::assume_aligned`](https://en.cppreference.com/w/cpp/memory/assume_aligned) does not modify the pointer in any way. It just adds information for the compiler so that it can generate optimized instructions for further code accessing that pointer. – bolov Mar 03 '23 at 13:24
  • I would argue it is definitely not an identity function. A pointer (in the C++ machine) is more than an address. This is inadequately specified of course. – Jeff Garrett Mar 03 '23 at 13:58
  • @JeffGarrett I'm angry at the claims of C and C++ of being BOTH low level and high level, of being "portable asm" while being specified in term of "abstract machine" and "observable behavior". It's a fundamental conflict, very hard to solve. In a PL with a `sizeof` operator, we have user handling of memory allocation, sub allocation to the level of individual byte. There was a pathetic "debate" on GCC bug tracker on whether you could even implement your own "user_malloc" in std conforming code. It's insane that ppl can't ALL agree on that stuff. Many basic Q have experts disagreeing. – curiousguy Mar 04 '23 at 03:19
  • @bolov "_That doesn't mean it's not useful._" I never claimed it couldn't be useful with actual compilers. I wrote that it was useless on common arch/CPU, by definition, following the std. I say that **from a formal, std, language-lawyer POV, it can't accomplish anything** on common CPU. "_otherwise the compiler would assume UB_" You are assuming a ptr isn't just a number, but I assert it is just that. To disagree with me, **you must somehow reject the idea that all ptr are trivial types**. It's a very hard language problem, and I have nothing to offer. – curiousguy Mar 04 '23 at 05:51
  • @curiousguy an access through a pointer can be invalid or can be valid. `std::launder` changes (or adds if you will) validity to the accesses through the returned pointer value. – bolov Mar 04 '23 at 06:48
  • @bolov So you have an object with a trivial type yet its behavior isn't determined by its bit pattern. Then I don't understand the semantics of `memcpy` and co. on such objects. Horrible can of worms. – curiousguy Mar 04 '23 at 07:56

1 Answer

The short answer is: In C++17, you need to std::launder, but in C++14 and below you don't (just return reinterpret_cast<Tp*>(mem_);).

In C++14 and before, pointers simply represented addresses, and the reinterpret_casted pointer with the correct value would point to the object automatically, so reinterpret_cast<Tp*>(mem_) is fine. Two pointers where p == q would always be interchangeable.
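Taking this answer's claim at face value, a version-dependent accessor might look like the sketch below. The helper name `object_in` and the `Small` struct are hypothetical, and the caller is assumed to have already constructed a `Tp` at `mem` with placement new in suitably aligned storage.

```cpp
#include <cassert>
#include <new>

// Hypothetical helper: turn the buffer address back into a Tp*.
// Assumes a Tp object currently lives at mem (created by placement new).
template <class Tp>
Tp* object_in(unsigned char* mem) noexcept {
  assert(mem != nullptr);
#if __cplusplus >= 201703L
  // C++17 and later: launder the pointer so it refers to the Tp object.
  return std::launder(reinterpret_cast<Tp*>(mem));
#else
  // C++11/14: per the answer above, the cast alone is relied upon.
  return reinterpret_cast<Tp*>(mem);
#endif
}

// Illustrative payload type, not part of the question's code.
struct Small {
  int a;
  int b;
};
```

In the question's StoragePool, the small-object get() would then reduce to a call like `object_in<Tp>(mem_)`.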

In C++17, the semantics of pointers were changed so that pointers could have equal value but still point to different objects, and std::launder would "fix" a pointer which has the correct address but does not point to the correct object (because reinterpret_cast<Tp*>(mem_) would only have the same address but would not point to the object you created with placement new).
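A small sketch of the "same address, different object" situation, using a hypothetical `Tagged` type. The pointer returned by placement new refers to the new object directly; a pointer re-derived from the buffer address is passed through std::launder when compiling as C++17 or later. This is an illustration under my reading of the rules, not a definitive statement of them.

```cpp
#include <cassert>
#include <new>

// Hypothetical type, used only for illustration.
struct Tagged {
  const int n;
};

int read_replaced() {
  alignas(Tagged) unsigned char mem[sizeof(Tagged)];

  // First object in the buffer, then a second one reusing the same storage.
  ::new (static_cast<void*>(mem)) Tagged{1};
  Tagged* second = ::new (static_cast<void*>(mem)) Tagged{2};

  // The new object sits at the buffer's address.
  assert(static_cast<void*>(second) == static_cast<void*>(mem));

#if __cplusplus >= 201703L
  // Re-deriving the pointer from the buffer: launder it so that it
  // designates the object currently living there.
  Tagged* viaBuffer = std::launder(reinterpret_cast<Tagged*>(mem));
  assert(viaBuffer == second);
#endif

  return second->n;
}
```

Both pointers compare equal at run time; the difference the answer describes is in what the compiler is allowed to assume about them.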

Artyer
  • "*In C++14 and before, pointers simply represented addresses*" This is not true. Those standards were *muddled* on this point. Sometimes they treated pointers like addresses and sometimes they didn't. C++17 resolved the inconsistent wording to move towards what was always intended. – Nicol Bolas Feb 01 '23 at 21:28
  • I noticed you didn't mention strict aliasing, and neither did the other comments. Can I ask whether my problem really has to do with the strict aliasing rule? In other words, does my code violate strict aliasing by accessing the value through a pointer of a type other than unsigned char? – youtao guo Feb 02 '23 at 01:48
  • "_you need to std::launder_" Unless you claim that ptrs aren't PODs or the modern equivalent "trivial", that can't be true. You are really claiming the std is lying here. – curiousguy Mar 02 '23 at 22:31