
It seems I might be using std::valarray<_Tp>s for some computational work (suppose _Tp is uint64_t). Unfortunately, the following hold:

  • My code receives raw arrays (uint64_t*s and a length value), and
  • I can't change signatures/APIs. The pointers are __restrict__ed, though.
  • std::valarray's constructor which takes a _Tp* and a length copies the entire array.
  • There do not seem to be methods for setting std::valarray's internal data; it's even private so you can't access it in a subclass.

So how do I break this Gordian knot and construct a valarray without copying my data?
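To make the copying behaviour described above concrete, here is a minimal sketch. The helper name `valarray_aliases_input` is hypothetical and exists only for illustration; it builds a valarray from a raw pointer and length, then checks whether the valarray's first element lives at the caller's address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <valarray>

// Hypothetical helper (not part of any library): returns true if a valarray
// constructed from `data` shares storage with it, false if it owns a copy.
bool valarray_aliases_input(const std::uint64_t* data, std::size_t n) {
    assert(n == 0 || data != nullptr);        // precondition for this sketch
    std::valarray<std::uint64_t> v(data, n);  // pointer+length constructor copies n elements
    return n > 0 && &v[0] == data;            // compare addresses of the first element
}
```

Since the constructor copies, this should report false for any non-empty input, which is exactly the overhead the question is trying to avoid.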

einpoklum
  • I don't think there's any way to do that. What `valarray` functionality do you need? You might be better off implementing that functionality for your raw array. – Praetorian Apr 22 '15 at 22:07
  • @Praetorian: I've [been told](http://stackoverflow.com/q/29807944/1593077) that valarray is a decent choice for performing bitwise operations on packed sequences of bits. – einpoklum Apr 22 '15 at 22:18
  • Are you using MSVC (I think they're the ones that allow the `__restrict__` extension)? If so, take a look at [this](http://stackoverflow.com/q/6850807/241631), in particular, the second answer and Howard Hinnant's comment under the accepted answer. Have you inspected the assembly from an optimized build with a plain `for` loop bit ANDing one element at a time? It's possible the auto vectorizer will be able to transform that into SIMD instructions. – Praetorian Apr 22 '15 at 22:34
  • Seems I was remembering wrong, that's a gcc/clang extension. [clang3.5](http://coliru.stacked-crooked.com/a/704455603de3ae52) seems willing to use SIMD if you write a plain `for` loop. I couldn't manage to get gcc to make use of the xmm registers, but you might have better luck with additional optimization flags/pragmas. – Praetorian Apr 22 '15 at 22:49
  • @Praetorian: Your comments belong more to the question I linked to than here... at any rate, I'm using gcc/Linux for now. – einpoklum Apr 22 '15 at 22:54
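The plain `for` loop suggested in the comments can be sketched as follows. The function name `bitwise_and` and its signature are assumptions made for illustration, not anything from the question's actual API; `__restrict__` is the gcc/clang extension mentioned above. Whether the compiler turns this into SIMD instructions depends on the compiler version and optimization flags, so inspecting the generated assembly is the only way to be sure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: element-wise AND of two packed bit sequences,
// operating directly on the caller's raw arrays (no copy, no valarray).
// __restrict__ promises the compiler that the three arrays do not overlap.
void bitwise_and(std::uint64_t* __restrict__ out,
                 const std::uint64_t* __restrict__ a,
                 const std::uint64_t* __restrict__ b,
                 std::size_t n) {
    assert(n == 0 || (out && a && b));  // precondition for this sketch
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] & b[i];           // one 64-bit word at a time
    }
}
```

This works on the existing buffers in place, which sidesteps the valarray copy at the cost of writing each operation by hand.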

0 Answers