std::array alignment

Question

Trying out std::tr1::array on a Mac, I'm getting 16-byte alignment.

sizeof(int) = 4;  
sizeof( std::tr1::array< int,3 > ) = 16;  
sizeof( std::tr1::array< int,4 > ) = 16;    
sizeof( std::tr1::array< int,5 > ) = 32;

Is there anything in the STL that behaves like array< T,N > but is guaranteed to NOT pad itself out, i.e.

sizeof( ARRAY< T, N> ) = sizeof(  T )*N

Yes, it's built into the language: `T a[N]; static_assert(sizeof(a) == sizeof(T)*N);`. – Nikolai Fetissov, Nov 25 '11 at 04:22
haha ok of course. What I meant is a wrapped collection with additional methods, e.g. suitable to slot into range-based for or to take per-element assignment/conversion operators. This is coming from Mac OS X 10.7, target = 64-bit Intel, LLVM 3.0. Their implementation of std::tr1::array declares `__attribute__((__aligned__))`, which is probably why :( Of course one wants `__aligned__` to do its job, just not to be the default – centaurian_slug, Nov 26 '11 at 00:11
@centaurian_slug: no, everything is and should be aligned by default. The question is why/whether this particular class has stricter alignment than it needs – jalf, Nov 26 '11 at 13:22
Sorry if my last line sounded imprecise. Yes, I understand that most CPUs prefer types to have their _own_ alignment, hence that is the default, and I do expect that. But `__attribute__((__aligned__))` to me signifies selecting some 'extra' alignment _beyond_ the default (the motivation usually being wide SIMD loads/stores or cache lines), so this is definitely unexpected behavior. – centaurian_slug, Nov 26 '11 at 20:06
@centaurian_slug Yeah, here: http://gcc.gnu.org/onlinedocs/gcc-7.1.0/gcc/Common-Type-Attributes.html#Common-Type-Attributes _When you leave out the alignment factor in an aligned attribute specification, the compiler automatically sets the alignment for the type to the largest alignment which is ever used for any data type on the target machine you are compiling for. Doing this can often make copy operations more efficient, because the compiler can use whatever instructions copy the biggest chunks of memory when performing copies to or from variables which have types you have aligned this way_ – underscore_d, May 26 '17 at 22:01
@centaurian_slug FWIW, as of now, `libstdc++` does **not** put `attribute(aligned)` on `std::array`. – underscore_d, May 26 '17 at 22:06

score 2 · Answer 1 · edited Jun 17 '22 at 07:54

The standard mandates that the elements "are stored contiguously, meaning that if a is an array<T, N>, then it obeys the identity &a[n] == &a[0] + n for all 0 <= n < N." (23.3.2.1 [array.overview] paragraph 1)

As far as I know, there is no guarantee that sizeof (std::array<T, N>) == sizeof (T) * N, but the contiguity statement asserts that the values are stored just like in a regular C array. If you only have one array of values that need to be contiguous, you can use sizeof (T) * std::array<T, N>::size() as the size and std::array<T, N>::data() as the starting address of the array.
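
A minimal sketch of that approach, in case it helps. The helper names `payload_bytes`, `copy_payload` and `is_contiguous` are made up for illustration and are not part of the standard library; the sketch also assumes `T` is trivially copyable and `N > 0`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Number of bytes occupied by the elements themselves, ignoring any
// padding the std::array object may carry after them.
template <typename T, std::size_t N>
std::size_t payload_bytes(const std::array<T, N>& a) {
    return sizeof(T) * a.size();
}

// Copies only the element bytes into a raw destination buffer.
// Assumption: dst has room for payload_bytes(a) bytes and T is
// trivially copyable, so memcpy is a valid way to copy it.
template <typename T, std::size_t N>
void copy_payload(const std::array<T, N>& a, void* dst) {
    std::memcpy(dst, a.data(), payload_bytes(a));
}

// Checks the contiguity identity quoted from the standard:
// &a[n] == &a[0] + n for all 0 <= n < N.
template <typename T, std::size_t N>
bool is_contiguous(const std::array<T, N>& a) {
    for (std::size_t n = 0; n < N; ++n) {
        if (&a[n] != &a[0] + n) return false;
    }
    return true;
}
```

With this, a padded `std::array<int, 3>` can still be written into a tightly packed buffer, because only `sizeof(int) * 3` bytes are copied starting at `data()`.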

edited Jun 17 '22 at 07:54

Jarod42

answered Nov 26 '11 at 12:48

Ernavar

I guess I can warn myself with a static_assert on this for portability. – centaurian_slug Nov 26 '11 at 20:12

score 1 · Answer 2 · answered May 23 '13 at 15:37

std::array is actually not an array, but a struct that contains an array. It's the struct that is padded, not the array. Compilers are allowed to add padding to the end of a struct whenever they want.
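
To illustrate the difference, here is a small sketch with two hypothetical wrapper structs (neither is taken from any library header). It uses C++11 `alignas(16)` as a stand-in for an alignment attribute on the member, and the first `static_assert` reflects what mainstream compilers do rather than a guarantee from the standard.

```cpp
#include <cassert>
#include <cstddef>

// A plain wrapper: the struct needs no stricter alignment than int.
struct Plain {
    int data[3];
};

// Hypothetical wrapper whose member requests 16-byte alignment,
// similar in effect to an aligned attribute in a library header.
struct OverAligned {
    alignas(16) int data[3];
};

// Expected on mainstream compilers, though not guaranteed by the standard.
static_assert(sizeof(Plain) == 3 * sizeof(int), "Plain is padded");

// The inner array keeps its natural size in both cases.
static_assert(sizeof(OverAligned::data) == 3 * sizeof(int),
              "inner array is not padded");

// sizeof is always a multiple of the alignment, so the struct itself
// is rounded up with padding after the array.
static_assert(alignof(OverAligned) == 16, "member alignment raises struct alignment");
static_assert(sizeof(OverAligned) % 16 == 0, "struct size is a multiple of 16");

// Bytes of padding that follow the array inside each wrapper.
constexpr std::size_t plain_padding = sizeof(Plain) - sizeof(Plain::data);
constexpr std::size_t over_padding = sizeof(OverAligned) - sizeof(OverAligned::data);
```

On a platform with 4-byte `int`, this corresponds to the 12 versus 16 bytes seen in the question.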

answered May 23 '13 at 15:37

score 0 · Answer 3 · answered Nov 25 '11 at 06:47

It looks, from what little data you've given, like it rounds the allocation up to the nearest power of two. Knowing very little about CPU architecture, I would guess that power-of-two sizes are faster to work with than unpadded ones, at least for small amounts. Perhaps you should see what happens when you try to allocate something much larger?

Is there any reason you absolutely positively need to skim those extra bytes off the top?

answered Nov 25 '11 at 06:47

semisight

>>"Is there any reason you absolutely positively need to skim those extra bytes off the top?" I usually make my own "template struct FixedArray .. " but was looking for the "standard" ways of doing things. Zero padding is an absolute necessity (e.g., vertex data). – centaurian_slug Nov 25 '11 at 23:49

I've just tried using it composed in another class, and sure enough the individual components get individual alignment and padding too. Of course that's the expected behavior with `__attribute__((__aligned__))` in the library header. – centaurian_slug Nov 26 '11 at 00:25
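
For reference, a rough sketch of what such a hand-rolled wrapper might look like. Only the name `FixedArray` comes from the comment above; its members and the `sum` helper are illustrative assumptions, not the commenter's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal wrapper around a built-in array.
template <typename T, std::size_t N>
struct FixedArray {
    T elems[N];  // sole data member, no alignment attribute

    T& operator[](std::size_t i) { assert(i < N); return elems[i]; }
    const T& operator[](std::size_t i) const { assert(i < N); return elems[i]; }

    // begin/end make the type usable in a range-based for loop.
    T* begin() { return elems; }
    T* end() { return elems + N; }
    const T* begin() const { return elems; }
    const T* end() const { return elems + N; }

    static constexpr std::size_t size() { return N; }
};

// These fail to compile on an implementation that pads the wrapper.
static_assert(sizeof(FixedArray<int, 3>) == sizeof(int) * 3,
              "FixedArray<int, 3> is padded");
static_assert(sizeof(FixedArray<float, 5>) == sizeof(float) * 5,
              "FixedArray<float, 5> is padded");

// Example use: sum the elements with a range-based for loop.
template <typename T, std::size_t N>
T sum(const FixedArray<T, N>& a) {
    T total = T();
    for (const T& x : a) total += x;
    return total;
}
```

The `static_assert` lines are the portability check: the standard does not promise the absence of tail padding, so the build stops if a compiler ever adds some.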

score 0 · Answer 4 · edited Nov 26 '11 at 13:23

Since posting, it turns out I get what I want by switching the IDE setting from the default to:

LLVM compiler 3.0 Language:
  LLVM C++ standard library:
    =libc++ (LLVM standard library with c++0x support.)

( CLANG_CXX_LIBRARY = libc++ )

Previously the setting was "libstdc++ (gcc c++ standard library)", which appears to have the padding. The new setting also lets me include <array> instead of <tr1/array>, and now

sizeof(array<T,N>)==sizeof(T)*N

This is all in Xcode 4.2 on Mac OS X Lion. I'm hoping one is simply deprecated and that this behavior is what I'll get on other platforms?

edited Nov 26 '11 at 13:23

jalf

answered Nov 26 '11 at 06:27

centaurian_slug

"I'm hoping one is simply deprecated and that this behavior is what i'll get on other platforms?" No, it's not. Compilers are given latitude to align most objects flexibly. They could even have different alignment between debug and release. I don't know why you need such specific alignment requirements. – Nicol Bolas Nov 27 '11 at 00:41
It's for 3D rendering vertex data - definitely a *size* critical case. The problem is the *extra* alignment requested deliberately in the tr1 header, over and above the compiler default. I would assume the compiler default would most frequently be the most compact data structure, with each type aligned to its own size. I can poke the data into raw memory byte by byte if need be, and I have my own template, but if there's something in the STL which does the same job then I can simplify my source, make it readily understandable by others and more likely to interoperate with other libraries, which is a win. – centaurian_slug Nov 27 '11 at 01:33
For buffers of vertex data, it's almost always better to handle them as raw memory. Not just because you'll be in control, but because you *want* that control. With that control, you can play tricks like using 3-float positions with 4-byte colors, all packed into 16-bytes-per-vertex. Or using that combined with normals in 10/10/10-bit format, and texture coordinates that are in 16-bit shorts, for a total of 24-bytes per vertex. – Nicol Bolas Nov 27 '11 at 01:38
I have done all the raw memory tricks, then created templates to make them more readable, automating conversions to and from packed formats and intermediate setup calculations expanded as full floats. I've created a fixed-point class that can be instantiated in 8 bits for colors or 16 for vertices, and a 'half' float class. As the number of permutations increases, the versatility of templates is very welcome; raw #defines and helper functions become unwieldy to name. Where speed doesn't matter, I like the most compact source :) – centaurian_slug Nov 27 '11 at 02:18
I could do something completely data-driven, dodging the class issue completely, but it's nice to have the most natural C++ representation any time I want to manipulate vertex data within the language (translating formats, decompressing, procedural mesh generation..) – centaurian_slug Nov 27 '11 at 02:46
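
As a concrete illustration of the 16-bytes-per-vertex layout mentioned in the comments above, here is a hedged sketch. The struct `PackedVertex` and the helper `make_vertex` are hypothetical names, and the sketch assumes a 32-bit `float`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical vertex layout: three floats for position plus a
// 4-byte RGBA color, 16 bytes in total.
struct PackedVertex {
    float position[3];      // 12 bytes
    std::uint8_t color[4];  // 4 bytes
};

// Compile-time checks so an unexpected layout is caught at build time.
static_assert(sizeof(float) == 4, "sketch assumes 32-bit float");
static_assert(offsetof(PackedVertex, color) == 12, "color must follow position directly");
static_assert(sizeof(PackedVertex) == 16, "vertex must be exactly 16 bytes");

// Builds one vertex from its components.
inline PackedVertex make_vertex(float x, float y, float z,
                                std::uint8_t r, std::uint8_t g,
                                std::uint8_t b, std::uint8_t a) {
    PackedVertex v = {{x, y, z}, {r, g, b, a}};
    return v;
}
```

Because the struct's strictest member alignment is 4 bytes and 12 + 4 is already a multiple of 4, no tail padding is needed here; the `static_assert` lines guard against a compiler or header that decides otherwise.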
