38

Is the underlying bit representation for an std::array<T,N> v and a T u[N] the same?

In other words, is it safe to copy N*sizeof(T) bytes from one to the other? (Either through reinterpret_cast or memcpy.)

Edit:

For clarification, the emphasis is on same bit representation and reinterpret_cast.

For example, let's suppose I have these two classes over some trivially copyable type T, for some N:

struct VecNew {
    std::array<T,N> v;
};

struct VecOld {
    T v[N];
};

And there is the legacy function

T foo(const VecOld& x);

If the representations are the same, then this call is safe and avoids copying:

VecNew x;
foo(reinterpret_cast<const VecOld&>(x));
Uyghur Lives Matter
shinjin
  • Are you doing the copy using `data`/`&array_name[0]` or using the name of the "array" itself? – NathanOliver Sep 07 '16 at 18:28
  • 5
    Not through `reinterpret_cast`, because of strict aliasing. – Dan Sep 07 '16 at 18:28
  • 1
    Hmm... the original question was about copying, the new question is about `reinterpret_cast`-ing. That's somewhat different... – Barry Sep 07 '16 at 21:06
  • @Barry, the original question was "Is the std::array bit compatible with the old C array?", and still is. Everything else is a consequence of that. – shinjin Sep 07 '16 at 21:10
  • 1
    It looks like you're trying to modernize legacy C++ code by replacing old constructs by new ones, right? – Laurent LA RIZZA Sep 08 '16 at 10:39
  • 1
    Then somebody changes `VecNew` by adding new field for example and enjoy debugging. No, thanks. – Slava Sep 08 '16 at 13:13

5 Answers

20

This doesn't directly answer your question, but you should simply use std::copy:

T c[N];
std::array<T, N> cpp;

// from C to C++
std::copy(std::begin(c), std::end(c), std::begin(cpp));

// from C++ to C
std::copy(std::begin(cpp), std::end(cpp), std::begin(c));

If T is a trivially copyable type, this will typically compile down to a memcpy or memmove call. If it's not, then this'll do element-wise copy assignment and be correct. Either way, this does the Right Thing and is quite readable. No manual byte arithmetic necessary.
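A minimal sketch of the same idea wrapped in helper functions. The names `to_std_array`, `to_c_array` and `round_trip_demo` are made up for illustration and are not part of any library. Because std::copy assigns element by element, the sketch also works for a type such as std::string, for which memcpy is not allowed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <string>

// Hypothetical helper: copy a C array into a std::array of the same size.
// Assumes T is default-constructible and copy-assignable.
template <typename T, std::size_t N>
std::array<T, N> to_std_array(const T (&src)[N]) {
    std::array<T, N> dst{};
    std::copy(std::begin(src), std::end(src), dst.begin());
    return dst;
}

// Hypothetical helper: copy a std::array back into a C array.
template <typename T, std::size_t N>
void to_c_array(const std::array<T, N>& src, T (&dst)[N]) {
    std::copy(src.begin(), src.end(), std::begin(dst));
}

// Round-trips a type that is not trivially copyable.
void round_trip_demo() {
    std::string c[2] = {"old", "array"};
    std::array<std::string, 2> cpp = to_std_array(c);

    std::string back[2];
    to_c_array(cpp, back);
    assert(back[0] == "old" && back[1] == "array");
}
```

Calling `round_trip_demo()` from `main` exercises both directions.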

Barry
  • 7
    nitpick: `std::copy` doesn't always compile down to `memcpy` it's an implementation detail. For example VC++ uses `memmove` for byte copies. – Mgetz Sep 07 '16 at 20:28
  • 10
    I'm torn. This is a great answer... for a different question! – underscore_d Sep 08 '16 at 00:08
  • https://godbolt.org/g/SGdWwp Looks like it is doing the fast copy only if the two arguments are of the same array type(only `test` and `test3` compile to `memmove`). – Bad_ptr Jul 21 '18 at 10:29
15

std::array provides the method data(), which can be used to copy to/from a C-style array of the proper size:

const size_t size = 123;
int carray[size];
std::array<int,size> array;

memcpy( carray, array.data(), sizeof(int) * size );
memcpy( array.data(), carray, sizeof(int) * size );

As stated in the documentation:

This container is an aggregate type with the same semantics as a struct holding a C-style array T[N] as its only non-static data member.

So it seems that the memory layout would be compatible with a C-style array, though it is not clear why you would want to use reinterpret_cast "hacks" when there is a proper way that has no overhead.
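A hedged sketch of the data()-based copy with a compile-time guard added, since memcpy is only valid for trivially copyable element types. The names `copy_bytes` and `memcpy_demo` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper: byte-wise copy from a std::array into a C array.
template <typename T, std::size_t N>
void copy_bytes(const std::array<T, N>& src, T (&dst)[N]) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable element type");
    std::memcpy(dst, src.data(), sizeof(T) * N);
}

// Hypothetical helper: the reverse direction, C array into std::array.
template <typename T, std::size_t N>
void copy_bytes(const T (&src)[N], std::array<T, N>& dst) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable element type");
    std::memcpy(dst.data(), src, sizeof(T) * N);
}

void memcpy_demo() {
    std::array<int, 4> a = {1, 2, 3, 4};
    int c[4] = {};

    copy_bytes(a, c);            // std::array -> C array
    assert(c[0] == 1 && c[3] == 4);

    c[1] = 20;
    copy_bytes(c, a);            // C array -> std::array
    assert(a[1] == 20);
}
```

Instantiating either overload with, say, std::string fails to compile instead of silently invoking undefined behaviour.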

Slava
11

I say yes (but the standard does not guarantee it).

According to [array]/2:

An array is an aggregate ([dcl.init.aggr]) that can be list-initialized with up to N elements whose types are convertible to T.

And [dcl.init.aggr]:

An aggregate is an array or a class (Clause [class]) with

  • no user-provided, explicit, or inherited constructors ([class.ctor]),

  • no private or protected non-static data members (Clause [class.access]),

  • no virtual functions ([class.virtual]), and

  • no virtual, private, or protected base classes ([class.mi]).

In light of this, "can be list-initialized" is only possible if there are no other members at the beginning of the class and no vtable.

Then, data() is specified as:

constexpr T* data() noexcept;

Returns: A pointer such that [data(), data() + size()) is a valid range, and data() == addressof(front()).

The standard basically wants to say "it returns an array" but leaves the door open for other implementations.

The only possible other implementation is a structure with individual elements, in which case you can run into aliasing problems. But in my view this approach does not add anything but complexity. There is nothing to gain by unrolling an array into a struct.

So it makes no sense not to implement std::array as an array.

But a loophole does exist.
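One way to detect that loophole on a given implementation is a compile-time check, sketched below. The trait name `array_layout_matches` is made up for illustration. Passing the check only shows that the size, alignment and standard-layout property agree on this particular standard library. It does not make a reinterpret_cast between the two types well-defined under the aliasing rules.

```cpp
#include <array>
#include <cstddef>
#include <type_traits>

// Hypothetical trait: true when std::array<T, N> has the same size and
// alignment as T[N] and is a standard-layout type on this implementation.
template <typename T, std::size_t N>
struct array_layout_matches
    : std::integral_constant<
          bool,
          sizeof(std::array<T, N>) == sizeof(T[N]) &&
          alignof(std::array<T, N>) == alignof(T[N]) &&
          std::is_standard_layout<std::array<T, N>>::value> {};

// Fails the build if this standard library lays out std::array differently.
static_assert(array_layout_matches<int, 8>::value,
              "std::array<int, 8> does not match int[8] on this implementation");
```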

rustyx
  • I disagree that aliasing problems could occur. What is your reasoning for that claim? – Brian Bi Sep 07 '16 at 21:50
  • A structure and an array are incompatible types in terms of strict aliasing. – rustyx Sep 07 '16 at 22:03
  • I don't think your interpretation of the strict aliasing rule is correct. If that were the case, then an array type would also be incompatible with its element type, which is clearly absurd. – Brian Bi Sep 07 '16 at 22:06
  • His assertion on strict aliasing does not imply what you claim it does. – alexchandel Sep 07 '16 at 22:15
  • 1
    @Brian That's not what RustyX was saying. An array has never been compatible with a `struct` having the same number of same-typed members. However, even your tangential inference about the compatibility of pointers to arrays versus pointers to their elements is soon to be all too true! See ecatmur's answer regarding the fun in store from the in-drafting P0137R1. And please, if you are so inclined and in a position to, file a National Body comment expressing scepticism of it. – underscore_d Sep 08 '16 at 00:01
9

The requirement on the data() method is that it return a pointer T* such that:

[data(), data() + size()) is a valid range, and data() == addressof(front()).

This implies that you can access each element sequentially via the data() pointer, and so if T is trivially copyable you can indeed use memcpy to copy sizeof(T) * size() bytes to/from an array T[size()], since this is equivalent to memcpying each element individually.

However, you cannot use reinterpret_cast, since that would violate strict aliasing, as data() is not required to actually be backed by an array - and also, even if you were to guarantee that std::array contains an array, since C++17 you cannot (even using reinterpret_cast) cast a pointer to an array to/from a pointer to its first member (you have to use std::launder).
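Applied to the question's example, a sketch of the memcpy route might look like this. `VecNew`, `VecOld` and the signature of `foo` come from the question. The choices `T = double` and `N = 3`, the body of the legacy `foo`, the `VecNew` overload and `adapter_demo` are assumptions made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

using T = double;             // assumption: any trivially copyable type
constexpr std::size_t N = 3;  // assumption: any N > 0

struct VecNew { std::array<T, N> v; };
struct VecOld { T v[N]; };

// Stand-in for the legacy function; the body is invented for this sketch.
T foo(const VecOld& x) {
    T sum = 0;
    for (std::size_t i = 0; i < N; ++i) sum += x.v[i];
    return sum;
}

// Hypothetical adapter: builds a real VecOld object with memcpy and calls
// the legacy function, so no reinterpret_cast is needed.
T foo(const VecNew& x) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy requires a trivially copyable element type");
    VecOld tmp;
    std::memcpy(tmp.v, x.v.data(), sizeof(T) * N);
    return foo(tmp);
}

void adapter_demo() {
    VecNew x{};
    x.v[0] = 1.0;
    x.v[1] = 2.0;
    x.v[2] = 3.5;
    assert(foo(x) == 6.5);
}
```

This does copy N elements. Compilers can often optimise small fixed-size copies, but that is not guaranteed.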

ecatmur
  • "as data() is not required to actually be backed by an array", uhm, that's nonsense. – Cheers and hth. - Alf Sep 07 '16 at 19:06
  • 5
    Re " since C++17 you cannot (even using reinterpret_cast) cast a pointer to an array to/from a pointer to its first member (you have to use std::launder)", that sounds interesting: the committee going lunatically berserk! More info please. In the meantime I'll make some popcorn. – Cheers and hth. - Alf Sep 07 '16 at 19:07
  • @Cheersandhth.-Alf "not required to actually be backed by an array" - trivially true for N=0 and N=1; implementable via vendor-library cooperation for N>1. – ecatmur Sep 07 '16 at 19:15
  • 2
    @Cheersandhth.-Alf "a pointer to an array cannot be cast to/from a pointer to its first element": see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0137r1.html – ecatmur Sep 07 '16 at 19:17
  • Not the N=1; a variable is an array of size 1. – Cheers and hth. - Alf Sep 07 '16 at 19:17
  • @Cheersandhth.-Alf "for purposes of pointer arithmetic and comparison"; not for all purposes and certainly not as far as aliasing goes. – ecatmur Sep 07 '16 at 19:19
  • 3
    Uhm, re the link, that's a wall of text. Several hundred kilometers of it. Can you sort of vaguely indicate a region of, say, less than 203 meters? – Cheers and hth. - Alf Sep 07 '16 at 19:20
  • re the "certainly not", that's nonsense. – Cheers and hth. - Alf Sep 07 '16 at 19:22
  • 2
    It seems this is about giving a compiler vendor a controlling interest in the management of the standard, whence, that compiler's shortcomings and silly behaviors become standardized. Oh well. – Cheers and hth. - Alf Sep 07 '16 at 19:24
  • 1
    @Cheersandhth.-Alf the edits and additions to [basic.compound]/3 are the really fun bit. Note that I linked the wrong version initially; you should be reading P0137r1. – ecatmur Sep 07 '16 at 19:24
  • 2
    @Cheersandhth.-Alf nonsense maybe, but that's the direction the Standard is going: `int a; int (&r)[1] = reinterpret_cast<int (&)[1]>(a); r[0] = 42;` is now UB. Truly we live in interesting times. – ecatmur Sep 07 '16 at 19:27
  • 2
    Do you know if the persons doing this work have given any rationale for the, uh, changes? Like what they hope to achieve? – Cheers and hth. - Alf Sep 07 '16 at 19:29
  • 1
    @Cheersandhth.-Alf to my understanding they're formalising what compilers are already doing while making it possible to implement containers that create and destroy objects (esp. vector and optional). The intent is presumably to be able to squeeze every last drop of UB out of the aliasing rules to perform better on SPEC (ahem: on user code). – ecatmur Sep 07 '16 at 19:41
  • 1
    Why on Earth did the authors, right after specifically allowing for conversion between pointers to a standard-layout struct vs its 1st member, then tack on - _in a friggin' Note_, as if a passing afterthought - that the same allowance does not hold for an array and its 1st element? Is there a nuance I'm missing where one should be allowed but the second not? Surely `( (NonexistentStruct*)somePointer )->nonexistentMember` is not any less dangerous than `( (NonexistentArray[42]*)somePointer )[23]`? ...or whatever the syntax; I don't do this stuff, ofc. What's the crucial distinction I'm missing? – underscore_d Sep 08 '16 at 00:22
  • I bet a lot of libboost's hacks are gonna break now. – paulotorrens Sep 08 '16 at 08:18
  • 4
    @underscore_d it's not about danger, it's about optimization; a lot of scientific code (*cough* SPEC *cough*) can be effectively accelerated if the compiler assumes that different-sized arrays and pointers don't alias even when the element type is the same. The speed-ups this yields are considered (by compiler authors, and, being fair, to their customers writing scientific, Fortran-style code) to be worth the potential confusion and breakage to their users writing more systems- or object-oriented code. – ecatmur Sep 08 '16 at 11:03
  • @ecatmur Just `optional`. `vector` is still formally broken. In particular, `vector::data` is unimplementable in standard C++. – T.C. Sep 08 '16 at 21:04
  • @T.C. Can you link to a discussion of the problem with `vector`? – underscore_d Sep 10 '16 at 15:33
7

array doesn't mandate much about the underlying type over which you instantiate it.

To have any possibility of useful results from using memcpy or reinterpret_cast to do a copy, the type you instantiated it over would have to be trivially copyable. Storing those items in an array doesn't affect the requirement that memcpy (and such) only work with trivially copyable types.

array is required to be a contiguous container and an aggregate, which pretty much means that the storage for the elements must be an array. The standard shows it as:

T elems[N]; // exposition only

It later, however, has a note that at least implies that its being an array is required (§[array.overview]/4):

[Note: The member variable elems is shown for exposition only, to emphasize that array is a class aggregate. The name elems is not part of array’s interface. —end note]

[emphasis added]

Note how it's really only the specific name elems that isn't required.
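The two requirements mentioned above, aggregate and contiguous, can be observed directly. This is only a sketch, and the names `is_contiguous` and `aggregate_demo` are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical check: every element sits at data() + i, as contiguity requires.
template <typename T, std::size_t N>
bool is_contiguous(const std::array<T, N>& a) {
    for (std::size_t i = 0; i < N; ++i) {
        if (std::addressof(a[i]) != a.data() + i) return false;
    }
    return true;
}

void aggregate_demo() {
    // Aggregate initialization with brace elision: no constructor is involved.
    std::array<int, 3> a = {1, 2, 3};
    assert(is_contiguous(a));
    assert(a.data()[2] == 3);  // indexing through data() reaches the elements
}
```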

Jerry Coffin
  • 2
    The [new draft](http://eel.is/c++draft/array) got rid of that part. Now we just have that it's an aggregate that can be list initialized with `N` `T`s (but +1). – Barry Sep 07 '16 at 18:45
  • @Barry: I'm not at all sure that really changes much. Offhand, I don't see a way of meeting its requirements (contiguous container, aggregate) except by having only one data member, which is an array. I suppose if you could assure against padding between elements, you *could* create a variadic template of discrete elements, but only because the elements would still be addressable like an array. – Jerry Coffin Sep 07 '16 at 19:01
  • The initialization couldn't work if `array` wasn't a simple `struct` wrapper of a raw array. – Cheers and hth. - Alf Sep 07 '16 at 19:04
  • 1
    @JerryCoffin Oh I'm not saying `std::array` isn't definitely a wrapper around a raw array. I'm just saying that now the wording around that description is totally different (not sure what the significance of that change is tbh. just pointing it out). – Barry Sep 07 '16 at 19:09
  • The initialization (but not other parts) could work if the storage were discrete members in the correct order. – Jerry Coffin Sep 07 '16 at 19:20
  • I mean, the `data` functionality only guarantees that there is effectively an array in there. The initialization means that there's nothing before that array. So there is an array, and there's nothing before it, and so as the first member of a struct a pointer to it can be reinterpreted as pointer to the struct and vice versa (this is at the end of the section about class members). – Cheers and hth. - Alf Sep 07 '16 at 21:24
  • @JerryCoffin "(but not other parts)" yeah, like basic indexing and pointer arithmetic! – underscore_d Sep 08 '16 at 00:06