Reusing a float buffer for doubles without undefined behaviour

Question

In one particular C++ function, I happen to have a pointer to a big buffer of floats that I want to temporarily use to store half the number of doubles. Is there a method to use this buffer as scratch space for storing the doubles, which is also allowed (i.e., not undefined behaviour) by the standard?

In summary, I would like this:

void f(float* buffer)
{
  double* d = reinterpret_cast<double*>(buffer);
  // make use of d
  d[i] = 1.;
  // done using d as scratch, start filling the buffer
  buffer[j] = 1.;
}

As far as I see there's no easy way to do this: if I understand correctly, a reinterpret_cast<double*> like this causes undefined behaviour because of type aliasing, and using memcpy or a float/double union is not possible without copying the data and allocating extra space, which defeats the purpose and happens to be costly in my case (and using a union for type punning is not allowed in C++).

It can be assumed the float buffer is correctly aligned for using it for doubles.

Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/174833/discussion-on-question-by-andre-offringa-reusing-a-float-buffer-for-doubles-with). — Bhargav Rao, Jul 11 '18 at 18:38
When you start filling the buffer with floats, do you need each double element to remain valid until it is overwritten? Or can the entire scratch buffer of doubles be thrown away at once? — Maxpm, Jul 17 '18 at 23:45

phön · Accepted Answer · 2018-07-18T06:58:08.180

10

I think the following code is a valid way to do it (it is really just a small example of the idea):

#include <cstddef>  // std::size_t
#include <new>      // placement new

void f(float* buffer, std::size_t buffer_size_in_bytes)
{
    double* d = new (buffer)double[buffer_size_in_bytes / sizeof(double)];

    // we have started the lifetime of the doubles.
    // "d" is a new pointer pointing to the first double object in the array.        
    // now you can use "d" as a double buffer for your calculations
    // you are not allowed to access any object through the "buffer" pointer anymore since the floats are "destroyed"       
    d[0] = 1.;
    // do some work here on/with the doubles...


    // conceptually we need to destroy the doubles here... but they are trivially destructible

    // now we need to start the lifetime of the floats again
    new (buffer) float[buffer_size_in_bytes / sizeof(float)];


    // here we are unsure about whether we need to update the "buffer" pointer to
    // the one returned by the placement new of the floats
    // if it is necessary, we could return the new float pointer or take the input pointer
    // by reference and update it directly in the function
}

int main()
{
    float* floats = new float[10];
    f(floats, sizeof(float) * 10);
    return 0;
}

It is important that you only use the pointer you receive from placement new. And it is important to placement new back the floats. Even if it is a no-operation construction, you need to start the lifetimes of the floats again.

Forget about std::launder and reinterpret_cast in the comments. Placement new will do the job for you.

edit: Make sure you have proper alignment when creating the buffer in main.
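To illustrate the alignment remark, here is a minimal sketch of a check that could run before the placement new. The helper names `is_aligned_for`, `double_capacity` and `require_double_alignment` are hypothetical and not part of the answer above. The sketch assumes that converting a pointer to `std::uintptr_t` yields its numeric address, which holds on mainstream platforms but is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if `p` is suitably aligned to hold an object of type T.
// Assumes the pointer-to-integer conversion yields the address (implementation-defined).
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: how many whole doubles fit into `float_count` floats.
constexpr std::size_t double_capacity(std::size_t float_count) {
    return float_count * sizeof(float) / sizeof(double);
}

// Hypothetical guard to call at the top of f(), before the placement new.
inline void require_double_alignment(const float* buffer) {
    assert(is_aligned_for<double>(buffer) && "buffer is not aligned for double");
}
```

A buffer obtained from `new float[n]` will typically pass this check; it matters mostly for pointers into the middle of an array or for buffers that come from elsewhere.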

Update:

I just wanted to give an update on things that were discussed in the comments.

The first thing mentioned was that we may need to update the initially created float pointer to the pointer returned by the re-placement-new'ed floats (the question is whether the initial float pointer can still be used to access the floats, because the floats are now "new" floats obtained by an additional new expression).

To do this, we can either a) pass the float pointer by reference and update it, or b) return the newly obtained float pointer from the function:

a)

void f(float*& buffer, std::size_t buffer_size_in_bytes)
{
    double* d = new (buffer)double[buffer_size_in_bytes / sizeof(double)];    
    // do some work here on/with the doubles...
    buffer = new (buffer) float[buffer_size_in_bytes / sizeof(float)];
}

b)

float* f(float* buffer, std::size_t buffer_size_in_bytes)
{
    /* same as initial example... */
    return new (buffer) float[buffer_size_in_bytes / sizeof(float)];
}

int main()
{
    float* floats = new float[10];
    floats = f(floats, sizeof(float) * 10);
    return 0;
}
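Variant a) can also be tied to a scope, so that the floats are always re-created when the scratch phase ends. The following is only a sketch under stated assumptions, not part of the original answer: `ScratchDoubles` and `fill_with_scratch_sum` are hypothetical names, the buffer is assumed to be aligned for `double`, and array placement new is assumed to add no overhead for `double` on the target implementation (which the standard does not guarantee).

```cpp
#include <cassert>
#include <cstddef>
#include <new>

static_assert(sizeof(double) == 2 * sizeof(float), "sketch assumes two floats per double");

// Hypothetical scope guard: doubles live in the buffer while the guard exists;
// the floats are re-created (and the caller's pointer updated) on destruction.
class ScratchDoubles {
public:
    ScratchDoubles(float*& buffer, std::size_t float_count)
        : buffer_(buffer),
          storage_(buffer),
          float_count_(float_count),
          double_count_(float_count / 2),
          doubles_(new (storage_) double[double_count_]) {}

    // Start the lifetime of the floats again and hand the new pointer back.
    ~ScratchDoubles() { buffer_ = new (storage_) float[float_count_]; }

    ScratchDoubles(const ScratchDoubles&) = delete;
    ScratchDoubles& operator=(const ScratchDoubles&) = delete;

    double* data() const { return doubles_; }
    std::size_t size() const { return double_count_; }

private:
    float*& buffer_;            // caller's pointer, updated in the destructor
    void* storage_;             // start of the reused storage
    std::size_t float_count_;
    std::size_t double_count_;
    double* doubles_;           // pointer returned by placement new
};

// Example use: work on doubles inside the scope, on floats after it.
void fill_with_scratch_sum(float*& buffer, std::size_t float_count) {
    assert(float_count >= 2);
    double sum = 0.0;
    {
        ScratchDoubles scratch(buffer, float_count);
        for (std::size_t i = 0; i < scratch.size(); ++i)
            scratch.data()[i] = 0.5 * static_cast<double>(i);
        for (std::size_t i = 0; i < scratch.size(); ++i)
            sum += scratch.data()[i];
    }  // floats are alive again from here on
    for (std::size_t i = 0; i < float_count; ++i)
        buffer[i] = static_cast<float>(sum);
}
```

Only `scratch.data()` is used while the guard is alive, and only the updated `buffer` afterwards, which mirrors the rule of using just the pointer returned by the most recent placement new.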

The next and more crucial thing to mention is that array placement new is allowed to have a memory overhead: the implementation may place some metadata in front of the returned array. If that happens, the naive calculation of how many doubles fit into our memory is wrong. The problem is that we don't know beforehand how many bytes the implementation will claim for a specific call, yet we would need that number to work out how many doubles fit into the remaining storage. In another SO post ( https://stackoverflow.com/a/8721932/3783662 ) Howard Hinnant provides a test snippet. I tested it using an online compiler and saw that for trivially destructible types (for example double), the overhead was 0. For more complex types (for example std::string), there was an overhead of 8 bytes. This may vary with your platform/compiler, so test it beforehand with Howard's snippet.
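A minimal sketch of such an overhead test follows. It is written in the spirit of the linked snippet but is not a reproduction of it: `SizeProbe` and `array_new_overhead` are hypothetical names, and the measurement goes through a user-defined placement allocation function, so the result may differ from what the standard `operator new[](std::size_t, void*)` form does on the same compiler. The 64 bytes of slack are an assumption about the largest overhead to expect.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical probe: records how many bytes an array new-expression requests.
struct SizeProbe {
    std::size_t capacity = 0;   // bytes available in the storage
    std::size_t requested = 0;  // bytes the new-expression asked for
};

// User-defined placement allocation function: returns the given storage
// unchanged and notes the requested size (element bytes plus any overhead).
void* operator new[](std::size_t bytes, void* where, SizeProbe& probe) {
    assert(bytes <= probe.capacity && "overhead exceeds the slack in this sketch");
    probe.requested = bytes;
    return where;
}

// Matching placement deallocation function (used only if a constructor throws).
void operator delete[](void*, void*, SizeProbe&) noexcept {}

// Returns the extra bytes requested for an array of N objects of type T.
template <typename T, std::size_t N>
std::size_t array_new_overhead() {
    constexpr std::size_t slack = 64;  // assumed upper bound on the overhead
    alignas(std::max_align_t) static unsigned char storage[N * sizeof(T) + slack];

    SizeProbe probe;
    probe.capacity = sizeof(storage);
    T* elements = new (storage, probe) T[N];

    // Destroy the elements by hand; no memory is released.
    for (std::size_t i = N; i > 0; --i)
        elements[i - 1].~T();

    return probe.requested - N * sizeof(T);
}
```

Printing `array_new_overhead<double, 8>()` and `array_new_overhead<std::string, 8>()` on the target compiler shows whether the naive element count is safe there; the values are implementation-specific.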
As for the question of why we need some kind of placement new (either new[] or single-element new): we are allowed to cast pointers in every way we want. But in the end, when we access the value, we need to use the right type to avoid violating the strict aliasing rules. Simply put: an object may only be accessed when an object of the pointed-to type really lives at the location given by the pointer. So how do you bring objects to life? The standard says:

https://timsong-cpp.github.io/cppwp/intro.object#1 :

"An object is created by a definition, by a new-expression, when implicitly changing the active member of a union, or when a temporary object is created."

There is an additional section which may seem interesting:

https://timsong-cpp.github.io/cppwp/basic.life#1:

"An object is said to have non-vacuous initialization if it is of a class or aggregate type and it or one of its subobjects is initialized by a constructor other than a trivial default constructor. The lifetime of an object of type T begins when:

storage with the proper alignment and size for type T is obtained, and
if the object has non-vacuous initialization, its initialization is complete"

So now we may ask: because doubles are trivial, do we need to take any action to bring the trivial objects to life and change which objects are actually alive? I say yes, because we initially obtained storage for the floats, and accessing the storage through a double pointer would violate strict aliasing. So we need to tell the compiler that the actual type has changed. This whole third point was discussed quite controversially. You may form your own opinion. You have all the information at hand now.

edited Jul 18 '18 at 06:58

answered Jul 11 '18 at 16:14

phön


You need to at least pass back the resulting `float *` pointer of `new` from `f`. Or use launder. – geza Jul 11 '18 at 16:17
Valid argument. But i am not sure. I would have to pass the `float* buffer` by reference to assign the last placement new to it. Or we can omit it, because we created back the state in which we called the function. I am not sure about it – phön Jul 11 '18 at 16:20
I don't think that "created back the state" is possible in this case. But maybe I'm wrong. Perhaps it is worth a new SO question :) – geza Jul 11 '18 at 16:23
The other problem with this approach is that `new[]` can have an overhead. That's why I recommended the reinterpret_cast+launder approach: call non-array placement new N times, then use launder to get a "clean" pointer to `double *`. – geza Jul 11 '18 at 16:37
@geza: https://stackoverflow.com/questions/8720425/array-placement-new-requires-unspecified-overhead-in-the-buffer – Mooing Duck Jul 11 '18 at 17:01
So there is an ABI problem; array construction can take more space than sizeof T times the number of elements. This *tends* not to be the case with plain old data; but the C++ standard permits it. Now, I am unaware of a single compiler that has overhead on a placement new array of doubles; on the other hand, I have no idea how to detect if there will be, and if there is the above is UB. – Yakk - Adam Nevraumont Jul 11 '18 at 18:02
@geza placement new (for arrays, specifically) has no overhead as cppreference will tell you. Why would it? So none of those tricks are necessary. – Paul Sanders Jul 11 '18 at 22:18
@phön Sadly, for both our posts, I've come to the conclusion that it isn't. Please see the edit I have made to my original answer to see why. – Paul Sanders Jul 12 '18 at 04:28
@PaulSanders: The authoritative reference is the standard: http://eel.is/c++draft/expr#new-15. It says: "This overhead may be applied in all array new-expressions, including those referencing the library function operator new[](std::size_t, void*) and other placement allocation functions. The amount of overhead may vary from one invocation of new to another.". But you are right in practice, implementations usually don't use this overhead in the current case. – geza Jul 12 '18 at 05:53
@geza Yeah, I saw that. Sad. That's why I now agree with you that calling placement new on a per-object basis is the right thing to do for non-POD types to get all the constructors called. Is there anything like `std::is_pod` in type_traits? Because for POD types you can skip all that and just call placement new once (to get a hold of that pointer). – Paul Sanders Jul 12 '18 at 05:59
Now that I understand all this a bit better, I see a lot of muddled thinking in this post, sorry. In particular, `new (buffer)double[buffer_size_in_bytes / sizeof(double)];` might break, see comments elsewhere in this thread. It's not tit-for-tat. – Paul Sanders Jul 12 '18 at 06:06
@PaulSanders: "for POD types you can skip all that and just call placement new once (to get a hold of that pointer)." Why do you think that's true? – geza Jul 12 '18 at 06:07
@geza Why would it not be true? 1. There's no actual allocation happening. 2. There are no constructors to be called. You just want to get a typed pointer to the start of the buffer, and `new (buf) T` does that nicely. – Paul Sanders Jul 12 '18 at 06:12
@PaulSanders yeah gz. now you have a pointer to the buffer where one object is alive. on the other indexes (if its valid to point to them) you have a strict aliasing violation – phön Jul 12 '18 at 06:24
@PaulSanders: then you could have easily `reinterpret_cast` too. Of course, it is UB too. You need to call placement `new` for **every** element. Or, if you want to have a float **array**, then you need to call placement `new[]`. See: http://eel.is/c++draft/intro.object#1. It is exactly defined by the standard, how can one create an object properly. – geza Jul 12 '18 at 06:28
@geza They're not objects, they're POD. So there's nothing to do and that section of the standard doesn't apply (and the _storage_ for the elements already exists, obviously). – Paul Sanders Jul 12 '18 at 06:31
@geza the only thing i am concerned about is the placement new[] for arrays. Mooning Duck showed another SO question which addresses this: https://stackoverflow.com/q/8720425/3783662 don't know whether it's okay now for placement new to be called on a previously already new[]ed array – phön Jul 12 '18 at 06:31
@PaulSanders they are objects. – phön Jul 12 '18 at 06:36
@phön I think you'll find that's _MooingDuck_ (although I quite like Mooning Duck myself and I imagine he might too). Ah me, noise. – Paul Sanders Jul 12 '18 at 06:36
@phön No they are not, see https://stackoverflow.com/a/146454/5743288. More noise, please check your facts before you post. – Paul Sanders Jul 12 '18 at 06:39
@PaulSanders what should this SO answer tell me? :-D It explains what POD are. nice? It doesn't provide anything for this discussion. Your point is that you can access any storage through a pointer of pod type. And in your answer you are doing exactly that. You access it as float and after that as double for example. But that is not allowed. That is a strict aliasing violation. You may only access the objects through a typed pointer if an object of that type is currently alive at this address. And to make sure that things are alive, you need to construct them (even if the constructor is no-op) – phön Jul 12 '18 at 06:48
@phön I'm wondering if someone else will answer this for me. If they don't, I will. – Paul Sanders Jul 12 '18 at 07:25
I like this solution the most, as it allows unrestricted access within the lifetime of the `double*` (see also my comments in geza's post). However, do I understand correctly that it can theoretically be UB in some implementations because placement `new` can have an unknown overhead? (even though no known implementation has overhead). Could you add this as a comment to your answer? And could you maybe also explain briefly, e.g. by quoting the standard, why placement new is a proper solution? – André Offringa Jul 12 '18 at 07:55
@AndréOffringa "However, do I understand correctly that it can theoretically be UB in some implementations because placement new can have an unknown overhead?" i have to admit i wasn't aware of that. i followed the question from the link and used Howard Hinnant's code to test the array overhead. turns out that for double there is none, for the strings there is an 8 byte overhead. maybe there is no overhead for trivial types only on the implementation where i tested it. i have to investigate it further. i will later on edit some links for the placement new and aliasing rules into my answer. – phön Jul 12 '18 at 09:10
@AndréOffringa maybe the only portable solution would be to placement new each element separately (like a `std::vector` does it). you could make a class and use the subscript operator to use it like one would use std::vector or an array. but in the end it would just not be a plain double[...] array. the solution would look similar to the one of geza and brings some boilerplate code with it. i will update my answer when i have a little bit more time. – phön Jul 12 '18 at 09:13

geza · Answer 2 · 2018-07-12T05:40:04.263

7

You can achieve this in two ways.

First:

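#include <cstddef>  // std::size_t
#include <cstring>  // std::memcpy

// Store `value` as the index-th double in the float buffer by copying its bytes.
void set(float *buffer, std::size_t index, double value) {
    std::memcpy(reinterpret_cast<char*>(buffer)+sizeof(double)*index, &value, sizeof(double));
}
// Read the index-th double back out of the float buffer by copying its bytes.
double get(const float *buffer, std::size_t index) {
    double v;
    std::memcpy(&v, reinterpret_cast<const char*>(buffer)+sizeof(double)*index, sizeof(double));
    return v;
}
void f(float *buffer) {
    // here, use set and get functions
}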
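As a usage sketch of this first approach: the example below keeps its intermediate values as doubles inside the float buffer and only touches them through byte copies. The names `store_double`, `load_double` and `fill_with_mean_of_squares` are hypothetical; the first two are meant to behave like `set` and `get` above. It assumes that a double is exactly twice the size of a float.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

static_assert(sizeof(double) == 2 * sizeof(float), "sketch assumes two floats per double");

// Copy the bytes of `value` into slot `index` of the float buffer.
void store_double(float* buffer, std::size_t index, double value) {
    std::memcpy(reinterpret_cast<unsigned char*>(buffer) + index * sizeof(double),
                &value, sizeof(double));
}

// Copy the bytes of slot `index` back out into a real double object.
double load_double(const float* buffer, std::size_t index) {
    double value;
    std::memcpy(&value,
                reinterpret_cast<const unsigned char*>(buffer) + index * sizeof(double),
                sizeof(double));
    return value;
}

// Example: use the buffer as double scratch space, then refill it with floats.
float fill_with_mean_of_squares(float* buffer, std::size_t float_count) {
    const std::size_t slots = float_count / 2;  // number of double-sized slots
    assert(slots > 0);

    // Scratch phase: every access to a "double" goes through memcpy.
    for (std::size_t i = 0; i < slots; ++i)
        store_double(buffer, i, static_cast<double>(i) * static_cast<double>(i));
    double sum = 0.0;
    for (std::size_t i = 0; i < slots; ++i)
        sum += load_double(buffer, i);

    // Done with the scratch space: write floats before reading any float again.
    const float mean = static_cast<float>(sum / static_cast<double>(slots));
    for (std::size_t i = 0; i < float_count; ++i)
        buffer[i] = mean;
    return mean;
}
```

The price of this approach is that there is never a usable `double*`, so code that expects a pointer to doubles cannot be called on the scratch data directly.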

Second: Instead of float *, you need to allocate a "typeless" char[] buffer, and use placement new to put floats or doubles inside:

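#include <cstddef>  // std::size_t
#include <new>      // placement new, std::launder (C++17)

// Start the lifetime of as many T objects as fit into `size` bytes.
template <typename T>
void setType(char *buffer, std::size_t size) {
    for (std::size_t i=0; i<size/sizeof(T); i++) {
        new(buffer+i*sizeof(T)) T;
    }
}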
// use it like this: setType<float>(buffer, sizeOfBuffer);

Then use this accessor:

// Access the index-th T; requires that setType<T> was called on the buffer before.
template <typename T>
T &get(char *buffer, std::size_t index) {
    return *std::launder(reinterpret_cast<T *>(buffer+index*sizeof(T)));
}
// use it like this: get<float>(buffer, index) = 33.3f;

A third way could be something like phön's answer (see my comments under that answer), unfortunately I cannot make a proper solution, because of this problem.

edited Jul 12 '18 at 05:40

answered Jul 11 '18 at 18:15

geza


And now ... please see the edit I have made to my original answer. It's a slippery one, this. – Paul Sanders Jul 12 '18 at 04:24
We can fix [this problem](https://stackoverflow.com/questions/51291661/is-it-possible-to-recreate-an-array-which-has-been-destroyed-by-placement-new) but we're no further forward as we are still aliasing pointers to different types. We need `std::pun`!! :) – Paul Sanders Jul 12 '18 at 06:15
@PaulSanders if you actually create the objects and start their lifetimes, then there is no aliasing problem. that's why you start the lifetime – phön Jul 12 '18 at 06:22
@phön What makes you say that? Of course there is. Aliasing pointers is nothing to do with object lifetimes. More noise. – Paul Sanders Jul 12 '18 at 07:27
@PaulSanders Aliasing is all about lifetimes. If i place an int into the storage, i may access it through an int pointer. If i place a float afterwards, i may access the storage through a float pointer. As soon as i place the float, i may no longer use the int pointer to access the int (because there is no int alive). That's it. And to place a float into the storage, you call the placement new for a float. – phön Jul 12 '18 at 07:32
@phön Too kind. I don't plan to answer your other question, sorry. Maybe someone else will set you straight, or you could [read a good book](https://stackoverflow.com/questions/388242/the-definitive-c-book-guide-and-list). – Paul Sanders Jul 12 '18 at 07:39
Thanks, I gave an upvote: I was not aware of these solutions, nice to learn about them. However, they do not solve my problem completely, as both do not allow unrestricted access to the array as doubles, which is what I would like. In the part where I use the array as doubles, I would also like to be able to call for example other functions taking a ptr to doubles, which is not possible without also rewriting those functions. – André Offringa Jul 12 '18 at 07:47
@PaulSanders which other question? my only question is how someone can switch his opinions and answers as fast as you do, and in the end he thinks he got the right answer because he thought 10 mins about it. And multiple people are still telling him he is wrong but he won't accept it – phön Jul 12 '18 at 07:55
@AndréOffringa: Unfortunately, I don't think that's possible in a strictly standard conformant way. But as far as I know, currently, it is fine to use phön's solution in my opinion. Check whether your compiler adds overhead to placement `new[]`; it is almost certain that it won't. So I'd go that way, if I had to solve your problem. I've added my answer just to have a strictly standard conformant way. But, I'm starting to think that my first way is flawed, and it has UB, so I'll check it out. – geza Jul 12 '18 at 08:03
@AndréOffringa My alternative answer will give you that, if you trust it. As Bathsheba says, it's all probably fine, I just could not swear to it. – Paul Sanders Jul 12 '18 at 08:22
@phön I don't know if you've noticed, but the only person who is telling me I am wrong is you, and I don't intend to listen to you anymore, sorry. The rest of us are having quite an interesting discussion (and learning new things). Join it - by asking questions, not by making statements. – Paul Sanders Jul 12 '18 at 08:42
@AndréOffringa: it seems that my first way is not flawed, see: https://stackoverflow.com/questions/51300626/is-stdmemcpy-between-different-trivially-copyable-types-undefined-behavior – geza Jul 12 '18 at 16:03

score 2 · Answer 3 · answered Jul 18 '18 at 00:06

Here's an alternative approach that's less scary.

You say,

...a float/double union is not possible without...allocating extra space, which defeats the purpose and happens to be costly in my case...

So just have each union object contain two floats instead of one.

static_assert(sizeof(double) == sizeof(float)*2, "Assuming exactly two floats fit in a double.");
union double_or_floats
{
    double d;
    float f[2];
};

void f(double_or_floats* buffer)
{
    // Use buffer of doubles as scratch space.
    buffer[0].d = 1.0;
    // Done with the scratch space.  Start filling the buffer with floats.
    buffer[0].f[0] = 1.0f;
    buffer[0].f[1] = 2.0f;
}

Of course, this makes indexing more complicated, and calling code will have to be modified. But it has no overhead and it's more obviously correct.
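To give an idea of what the more complicated indexing could look like, here is a minimal sketch. The accessor names (`set_double`, `get_double`, `set_float`, `get_float`, `fill_floats`) are hypothetical. The setters assign directly to the union member instead of handing out references, on the assumption that a plain member assignment is what switches the active member.

```cpp
#include <cassert>
#include <cstddef>

static_assert(sizeof(double) == sizeof(float) * 2, "Assuming exactly two floats fit in a double.");

union double_or_floats
{
    double d;
    float f[2];
};

// Scratch phase: element i of the buffer viewed as doubles.
inline void set_double(double_or_floats* buffer, std::size_t i, double value) {
    buffer[i].d = value;  // direct member assignment makes `d` the active member
}
inline double get_double(const double_or_floats* buffer, std::size_t i) {
    return buffer[i].d;
}

// Float phase: float index i maps to union i / 2, slot i % 2.
inline void set_float(double_or_floats* buffer, std::size_t i, float value) {
    buffer[i / 2].f[i % 2] = value;  // direct member assignment makes `f` the active member
}
inline float get_float(const double_or_floats* buffer, std::size_t i) {
    return buffer[i / 2].f[i % 2];
}

// Fill the first `float_count` floats; writes both slots of every union touched.
inline void fill_floats(double_or_floats* buffer, std::size_t float_count, float value) {
    assert(float_count % 2 == 0 && "each union element holds exactly two floats");
    for (std::size_t i = 0; i < float_count; ++i)
        set_float(buffer, i, value);
}
```

After switching a union element to floats, both of its slots should be written before either is read, since making `f` active does not give the untouched slot a value.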

Paul Sanders · Answer 4 · 2018-07-12T05:19:43.043

0

tl;dr Don't alias pointers - at all - unless you tell the compiler that you're going to on the command line.

The easiest way to do this might be to figure out what compiler switch disables strict aliasing and use it for the source file(s) in question.

Needs must, eh?

Thought about this some more. Despite all that stuff about placement new, this is the only safe way.

Why?

Well, if you have two pointers of different types pointing to the same address then you have aliased that address and you stand a good chance of fooling the compiler. And it doesn't matter how you assigned values to those pointers. The compiler is not going to remember that.

So this is the only safe way, and that's why we need std::pun.

edited Jul 12 '18 at 05:19

answered Jul 11 '18 at 15:40

Paul Sanders


What if that compiler doesn't have that switch? What if I want to have this functionality as inline functions? Should I turn off strict aliasing for the whole project? This answer can be a solution for some cases, but not in others. And this is false: "only safe way". My answer tells other safe ways as well. Which are standard conformant, and works in all cases. – geza Jul 12 '18 at 04:32
@geza What if, what if, what if. Chances are, the OP can use this trick. – Paul Sanders Jul 12 '18 at 05:03
@geza Oh yes, your idea of calling placement new in a loop is a good one (and I see now that it is the right thing to do). I will reverse my vote, I was wrong about some stuff. ... Ah blast it, I can't. Edit your post (anything will do) and then ping me. Then I can. – Paul Sanders Jul 12 '18 at 05:36
@geza TU, you end up plus :) – Paul Sanders Jul 12 '18 at 05:41
@geza No, it's fine, and you're not wrong, to some extent at least. But it wasn't me that upvoted you, it was SO. It works that way if you downvote and then upvote, not sure why. – Paul Sanders Jul 12 '18 at 05:48
The suggestion to use a compiler switch to disable strict aliasing is a good one -- thanks! I'm not convinced of your argument that the other answers are not correct: you're saying "if you have two pointers of different types pointing to the same address then you have aliased that address and you stand a good chance of fooling the compiler", however, if one does `new`, `delete`, `new`, the 2nd `new` also might receive the same address to a different type, and clearly the compiler should handle that properly. So it makes sense that placement new indicates the same "change" in lifetime. – André Offringa Jul 12 '18 at 07:42
@AndréOffringa TBH, I'm not certain what is _not_ safe. I only know what for sure _is_ safe. – Paul Sanders Jul 12 '18 at 08:25
@geza: Any compiler that doesn't have such an option should be viewed as being suitable only for processing a specialized subset of C programs that never need to re-purpose storage within its allocated lifetime. Such compilers may be excellent for the purposes of processing the programs for which they were designed, but they cannot reliably process programs that need to reuse storage within its allocated lifetime, even if the programmers abide by the Effective Type rules. I have yet to see a compiler that abides by the Effective Type rules in every case that doesn't also work in many other... – supercat Jul 12 '18 at 22:33
...easily-identifiable cases where the Standard imposes no requirements. It's too bad the Rationale's comments about Quality of Implementation issues weren't incorporated more visibly into the Standard, since the only way many parts of the Standard make any sense is if one recognizes that some forms of UB were expected to be processed predictably on quality implementations intended for some targets and purposes, even if implementations that are of low-quality or specialized for other purposes might behave unpredictably. – supercat Jul 12 '18 at 22:41
@supercat: There are **a lot** of program which compiles without `-fno-strict-aliasing`. And I bet, that a lot of programs, which uses this switch, can be modified to not use it. So I think exactly the opposite. This switch is very rarely needed. What do you mean by this? "I have yet to see a compiler that abides by the Effective Type rules in every case that doesn't also work in many other easily-identifiable cases where the Standard imposes no requirements." – geza Jul 13 '18 at 05:33
@geza _There are a lot of program which compiles without `-fno-strict-aliasing`_ Yes, I've never needed this switch and I tend to play a bit fast-and-loose with the types of pointers sometimes. Truth is, I think you have to do something a bit crazy for anything to actually go wrong when the compiler optimises your code. The compiler guys are not idiots. – Paul Sanders Jul 13 '18 at 06:03
@geza: Both gcc and clang are prone to assume that code which would have no effect in `-fno-strict-aliasing` mode will also have no effect in `-fstrict-aliasing` mode. For example, assume both `long` and `long long` are 64 bits. Given "void convert_long_to_longlong(void *p, int n) { long *lp = p; long long *lp'; for (i=0; i – supercat Jul 13 '18 at 15:00
...use a similar function to convert the array back before calling the first function again. Unfortunately, I don't think gcc or clang have any way for their optimizer to handle the notion of a function which converts something from one type to another without physically reading or writing any bytes thereof. I don't think there's any guaranteed-safe way on those compilers, without using compiler-specific directives, to take storage that has been used as one type and reuse it as another, without either using `volatile` to force wasteful reads and writes. – supercat Jul 13 '18 at 15:06
@PaulSanders: There are many constructs that will "usually" work on gcc and clang, but which such compilers don't handle 100% reliably. If the authors of gcc and clang were interested in producing a high-quality compiler suitable for low-level or systems programming, they should seek to maximize the range of programs that can be safely processed without having to disable essentially all optimizations. From what I've seen of their mailing list, they'd rather find excuses to regard any program with which they're incompatible as "broken", even though... – supercat Jul 13 '18 at 15:13
...every quality implementation must be able to support constructs beyond those mandated in 6.5p7 is a Quality of Implementation issue. A conforming-but-garbage-quality compiler, given `struct s {int x;} s1={0},s2; s1.x=3; s2=s1;`, could assume that the value of `s1` would not be affected by an lvalue of type `int`, and thus store `(struct s){0}` into s2. The only way the Standard would make any sense would be if use of an lvalue which is recognizably derived from another should be recognized as an use of the original, with the ability to recognize derived lvalues as a QoI issue. – supercat Jul 13 '18 at 15:25
@PaulSanders: IMHO, two pointers or lvalues can only meaningfully be said to alias during the execution of a particular function or loop if both are actually used in some function during that execution, and quality implementations should only care about 6.5p7 in cases that would actually involve aliasing [consider the footnote]. Put those together, and I'd say that even without the non-conforming bugs, the `-fstrict-aliasing` dialect should not be regarded as a high-quality dialect except perhaps for specialized application fields. – supercat Jul 13 '18 at 15:34

score 0 · Answer 5 · answered Jul 12 '18 at 07:12

This problem cannot be solved in portable C++.

C++ is strict when it comes to pointer aliasing. Somewhat paradoxically, this allows it to compile on very many platforms (for example, ones where double numbers are perhaps stored in different places from float numbers).

Needless to say, if you are striving for portable code then you'll need to recode what you have. The second best thing is to be pragmatic and accept that it will work on any desktop system I've come across; perhaps even static_assert on compiler name / architecture.
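A minimal sketch of what such compile-time guards might look like. The list of compilers and the particular properties checked are assumptions made for illustration, and `scratch_double_count` is a hypothetical helper. Passing these checks does not make the buffer reuse itself conforming; it only pins down the platform assumptions the pragmatic approach relies on.

```cpp
#include <cstddef>
#include <limits>

// Assumed allow-list: refuse to build on compilers the code was never tried on.
#if !defined(__GNUC__) && !defined(__clang__) && !defined(_MSC_VER)
#error "Float/double buffer reuse has only been checked with GCC, Clang and MSVC."
#endif

// Layout assumptions behind reusing a float buffer for half as many doubles.
static_assert(sizeof(double) == 2 * sizeof(float),
              "expected exactly two floats per double");
static_assert(alignof(double) >= alignof(float),
              "expected double to be at least as strictly aligned as float");
static_assert(std::numeric_limits<float>::is_iec559 &&
                  std::numeric_limits<double>::is_iec559,
              "expected IEEE 754 float and double");

// Hypothetical helper: number of doubles available in a buffer of floats.
constexpr std::size_t scratch_double_count(std::size_t float_count) {
    return float_count / 2;
}
```

Keeping the guards in the same translation unit as the aliasing code makes a port to an unusual platform fail at compile time instead of misbehaving at run time.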

Interesting post [here](https://blog.regehr.org/archives/1307). And I think the big 3 have the requisite command line switch(es), so, for all practical purposes, the problem is soluble. — Paul Sanders, Jul 12 '18 at 07:33
