Efficiently generating byte buffer without breaking strict aliasing

Question

This is such a simple pattern, there has to be a "nice" way of sorting it out.

I have a function that needs to generate a dynamically sized byte array containing arithmetic data.

// Given that I have a function that kinda looks like this:
void dispatch(std::vector<char> data); // Will take possession of data.

// The behavior I want, but this breaks strict aliasing
void bad_foo(int c) {
  std::vector<char> real_data(c * sizeof(float));
  float* raw_data = reinterpret_cast<float*>(real_data.data());

  // Fill raw_data with useful stuff...

  dispatch(std::move(real_data));
}

void correct_but_slow_foo(int c) {
  std::vector<float> raw_data(c);

  // Fill raw_data with useful stuff...

  std::vector<char> real_data(c * sizeof(float));
  std::memcpy(real_data.data(), raw_data.data(), c * sizeof(float));

  dispatch(std::move(real_data));
}

Unfortunately, it seems even clang's heap elision is not managing to sort out what needs to be done here: see on godbolt

At the very worst, I can make dispatch() a template, but that would become very messy, and I'm curious to see if there is a way out of this mess I'm overlooking anyways.

Thanks!

Edit: A thought just crossed my mind (immediately after posting the question of course...) : I could treat real_data as an allocation pool and in-place new the arithmetic data on top of it:

void fixed_foo(int c) {
  std::vector<char> real_data(c * sizeof(float));
  float* raw_data = new (real_data.data()) float[c];

  // Fill raw_data with useful stuff...

  dispatch(std::move(real_data));
}

This looks funky as hell, but I "think" it might be legal. Maybe?

What will `dispatch` do with `data`? If it accesses it as `float` or `char`, there is no problem, I think. — geza, Oct 26 '17 at 17:10
@geza It doesn't matter what dispatch does with the data; what `bad_foo()` does is a violation in and of itself. — , Oct 26 '17 at 17:11
I'm not really sure that's the case. You only access that memory as `float`. Does it really violate strict aliasing rule? (I'm not saying it doesn't, I'm just skeptical) — geza, Oct 26 '17 at 17:17
@geza The rule is quite clear, the `char` exception allows you to cast a standard layout object to `char`, but not the other way around unless it was an object of that type in the first place. — , Oct 26 '17 at 17:19
@geza I just realized I never actually answered your question, sorry. `dispatch()` will eventually lead to the data being DMAd to a GPU, using something like `glBufferSubData()`. — , Oct 26 '17 at 17:42
Your question made me ask this: https://stackoverflow.com/questions/46960774/is-using-the-result-of-new-char-or-malloc-to-casted-float-is-ub-strict-alia — geza, Oct 26 '17 at 17:56
What makes you think `data()` returns correctly aligned ptr? — curiousguy, Nov 04 '17 at 13:30
@curiousguy It is in all modern implementations (as long as I use std::allocator). It would theoretically be possible for std::vector to overallocate and use the front of the data for something, but that would make little sense. A simple assert is enough safety as far as I'm concerned. — , Nov 04 '17 at 14:33
@Frank Yeah, probably. However it's 1) not formally guaranteed (although I just noticed there is language lawyer tag on your question) and 2) I think it's worth mentioning (for the record) — curiousguy, Nov 05 '17 at 08:59
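
For the record, the "simple assert" mentioned in the comments could look roughly like the sketch below. The helper names `is_aligned_for` and `make_float_sized_buffer` are hypothetical and not from the question, and the check assumes `std::uintptr_t` exists and that converting a pointer to it preserves the address's low bits, which holds on common platforms but is not formally guaranteed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: true when p is suitably aligned for an object of type T.
// Assumes the pointer-to-integer conversion preserves the address value.
template <typename T>
bool is_aligned_for(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical helper: allocates room for c floats as raw bytes and asserts
// that the storage handed back by std::vector is aligned for float.
std::vector<char> make_float_sized_buffer(std::size_t c) {
  std::vector<char> real_data(c * sizeof(float));
  assert(is_aligned_for<float>(real_data.data()));
  return real_data;
}
```

The assert only documents and checks the alignment assumption at run time in debug builds; it does not settle the aliasing question itself.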

mattnewport · Answer 1 · 2017-10-26T17:19:14.443

The safest way to get around aliasing rules is to use memcpy(), but you don't need to do that on a whole second copy of the data. I'd suggest doing all your float work on a local float variable and then memcpy()ing that to the appropriate location in your real_data buffer, one element at a time. Most compilers will optimize that effectively in my experience.

void better_foo(int c) {
  std::vector<char> real_data(c * sizeof(float));

  // Fill real_data with useful stuff, one float at a time...
  for (int i = 0; i < c; ++i) {
    float x = my_complicated_calculation(i);
    // Copy the float's bytes into the buffer; the char storage is never read or written through a float*.
    std::memcpy(&real_data[i * sizeof(float)], &x, sizeof(x));
  }

  dispatch(std::move(real_data));
}
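
To make the idea easier to try out, here is a minimal self-contained sketch of the same per-element memcpy approach, plus a read-back helper that also goes through memcpy. The bodies of `dispatch` and `my_complicated_calculation`, the global `g_last_dispatched`, and the helper `read_float` are placeholders invented for illustration; they are not the asker's real code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real dispatch(): it just keeps the bytes
// around so they can be inspected afterwards.
static std::vector<char> g_last_dispatched;
void dispatch(std::vector<char> data) { g_last_dispatched = std::move(data); }

// Hypothetical placeholder for the per-element computation.
float my_complicated_calculation(int i) { return 0.5f * static_cast<float>(i); }

void better_foo(int c) {
  assert(c >= 0);
  const std::size_t n = static_cast<std::size_t>(c);
  std::vector<char> real_data(n * sizeof(float));

  for (std::size_t i = 0; i < n; ++i) {
    const float x = my_complicated_calculation(static_cast<int>(i));
    // Only bytes are written into the char buffer; no float* points into it.
    std::memcpy(real_data.data() + i * sizeof(float), &x, sizeof(x));
  }

  dispatch(std::move(real_data));
}

// Reads element i back out of a byte buffer, again via memcpy.
float read_float(const std::vector<char>& bytes, std::size_t i) {
  float x;
  std::memcpy(&x, bytes.data() + i * sizeof(float), sizeof(x));
  return x;
}
```

Calling `better_foo(4)` and then `read_float(g_last_dispatched, 3)` should give back the value computed for index 3, which is a quick way to check that the bytes round-trip.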

edited Oct 26 '17 at 17:19

answered Oct 26 '17 at 17:14

Glad you presented this option, even though it might not help if you are using some numeric library that requires a `float*` and works on the whole data set. – Ben Voigt Oct 26 '17 at 17:16
Thank you for this. It is indeed a very useful workaround in many cases, but as @BenVoigt mentioned, it's not "quite" what I'd need ideally. – Oct 26 '17 at 17:20
