8

In my answer to the question "I need C++ array class template, which is fixed-size, stack-based and doesn't require default constructor", I posted a piece of code that uses placement new with a char array. To me, this is something absolutely normal. But according to the comments, this code is wrong.

Can anyone explain in more detail?

Specifically, what can go wrong with the array? What I understand from the comments is that T x[size]; might not fit into char x[size*sizeof(T)];. I don't believe this is true.

EDIT:

I'm just getting more and more confused. I know what alignment is in the case of structures. Yes, when you have a structure, the members may start at different offsets than you might think.

OK, now we are back to arrays. You are telling me that T x[size]; is the same size as char x[size*sizeof(T)];, yet I cannot access the char array as a T array because there might be some alignment. How can there be alignment when the arrays have the same size?

EDIT 2:

OK, I finally get it: it may start at a wrong address.

EDIT 3:

Thx everyone, you can stop posting :-) Phew, this totally blew my mind. I just never realized this was possible.

Šimon Tóth
  • You misunderstood the comments. It *fits*, but the char-array might be mis-aligned. – Steve Jessop Oct 06 '10 at 16:19
  • @Steve But what the hell does that mean, please explain. Like the char array will be indexed in inverse order or what. Give me some example of what can happen. I only understand alignment in context of structures and in context of little/big-endian. – Šimon Tóth Oct 06 '10 at 16:22
  • @Let_Me_Be: take a look here http://en.wikipedia.org/wiki/Segmentation_fault#Bus_error for an explanation. – Eugen Constantin Dinca Oct 06 '10 at 16:25
  • Like in all things C++ prefer the std::vector over an array. Because the memory is dynamically allocated it is guaranteed to be correctly aligned. – Martin York Oct 06 '10 at 18:11
  • @Martin Yeah vector is nice when you can use it. But creating an array of objects that don't have default constructor just jumped into impossible area for me. – Šimon Tóth Oct 06 '10 at 20:47
  • What you mean like std::vector ? I mean you use a vector instead of an array of char (as in your example). Then the allocated area is guaranteed to be correctly aligned for use with placement new. – Martin York Oct 07 '10 at 03:19
  • @Martin: the background you may be missing is that Let_Me_Be was responding to a question which asked for the data to be "stack-based", which I interpreted to mean that the questioner is using an implementation that has a stack, and wants the data to be on the stack. If that's never a legitimate requirement then you would have to take it up with user467799, since Let_Me_Be can't account for a restriction that someone else invented. A decent, stack-based allocator for `vector` would solve the problem, but ultimately falls back to this same issue of alignment of automatic variables. – Steve Jessop Oct 07 '10 at 11:14

5 Answers

13

A T x[size] array will always fit exactly into size * sizeof(T) bytes, meaning that char buffer[size*sizeof(T)] is always precisely enough to store such an array.

The problem in that answer, as I understood it, was that your char array is not guaranteed to be properly aligned for storing the object of type T. Only malloc-ed/new-ed buffers are guaranteed to be aligned properly to store any standard data type of smaller or equal size (or data type composed of standard data types), but if you just explicitly declare a char array (as a local object or member subobject), there's no such guarantee.

Alignment means that on some platforms it might be strictly (or not so strictly) required to allocate, say, all int objects on, say, a 4-byte boundary. E.g. you can place an int object at the address 0x1000 or 0x1004, but you cannot place an int object at the address 0x1001. Or, more precisely, you can, but on a platform with strict requirements any attempt to access this memory location as an object of type int will result in a crash.

When you create an arbitrary char array, the compiler does not know what you are planning to use it for. It can decide to place that array at the address 0x1001. For the above reason, a naive attempt to create an int array in such an unaligned buffer will fail.

The alignment requirements on some platforms are strict, meaning that any attempt to work with misaligned data will result in a run-time failure. On some other platforms they are less strict: the code will work, but the performance will suffer.

The need for the proper alignment sometimes means that when you want to create an int array in an arbitrary char array, you might have to shift the beginning of an int array forward from the beginning of the char array. For example, if the char array resides at 0x1001, you have no choice but to start your constructed-in-place int array from the address 0x1004 (which is the char element with the index 3). In order to accommodate the tail portion of the shifted int array, the char array would have to be 3 bytes larger than what the size * sizeof(T) evaluates to. This is why the original size might not be enough.

Generally, if your char array is not aligned in any way, you will really need an array of size * sizeof(T) + A - 1 bytes to accommodate an aligned (i.e. possibly shifted) array of objects of type T that must be aligned at A-byte boundary.
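The shift described above can be sketched in code. This is only an illustration, assuming a C++11 or later compiler (for `alignof` and `std::align`); the helper names `is_aligned` and `aligned_start` are made up for this example and do not come from the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>   // std::align

// Hypothetical helper: true if the address in p is a multiple of `alignment`.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical helper: find the first address inside [buffer, buffer + bytes)
// that is suitably aligned for an array of `count` objects of type T.
// Returns nullptr if the (possibly shifted) array would not fit.
template <typename T>
T* aligned_start(unsigned char* buffer, std::size_t bytes, std::size_t count) {
    void* p = buffer;
    std::size_t space = bytes;
    // std::align moves p forward to the next alignof(T) boundary and reduces
    // space by the shift, or returns nullptr if count * sizeof(T) bytes
    // no longer fit after the shift.
    return static_cast<T*>(std::align(alignof(T), count * sizeof(T), p, space));
}
```

If the buffer happens to start one byte past an int boundary, a buffer of exactly `size * sizeof(int)` bytes is rejected on a platform where int must be 4-aligned, while a buffer with the extra `A - 1` bytes is accepted and the returned pointer is shifted forward.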

AnT stands with Russia
  • "char array is not guaranteed to be properly aligned for storing the object of type T" what does this mean? – Šimon Tóth Oct 06 '10 at 16:23
  • @Let_Me_Be: It refers to the location of the array in memory. The array has a starting address which may not be a valid starting address to store some other object in it via placement new. For example, if you try to construct an int in there and your machine doesn't support storing ints at odd addresses, you're screwed in case your array starts at an odd address. – sellibitze Oct 06 '10 at 16:32
  • @sellibitze Ooooh, finally that makes sense. So basically alignment is added before the variable not after? – Šimon Tóth Oct 06 '10 at 16:37
  • @Let_Me_Be: "Before" and "after" are relative terms. Padding bytes are added for the purpose of alignment *before* the object being aligned, but *after* the previous object. – AnT stands with Russia Oct 06 '10 at 16:47
  • @AndreyT yes, that was what I meant :-) – Šimon Tóth Oct 06 '10 at 16:49
  • @Let_Me_Be: I think AndreyT's answer should answer all that. In addition, let me point out that the upcoming C++ standard will offer tools to deal with this problem (see `std::aligned_storage` and the `sizeof`-, `alignof`- operators) – sellibitze Oct 06 '10 at 16:53
  • However, OP isn't asking about an arbitrary char array, but about one that is the first member of a class (also containing a `size_t`). Doesn't the class and therefore also its first member have to be aligned at least to a 4-byte boundary? – UncleBens Oct 06 '10 at 19:33
  • @UncleBens: Yes, in such specific cases the array may end up properly aligned. I was talking about more general case. – AnT stands with Russia Oct 06 '10 at 22:10
  • @UncleBens: However 4-byte boundary may not be sufficient, if you require a type that is 8-aligned. – Matthieu M. Oct 07 '10 at 06:13
0

T may be aligned differently from a char.

Also, the Itanium ABI (for example) specifies cookies for non-POD arrays, so that it knows how many elements to walk across at deletion (to call the destructors). The allocation via new is laid out like this (iirc):

size_t elementCount;
// padding to get native alignment for 1st element
T elements[elementCount];

so the allocation for a 16 byte aligned object is:

size_t elementCount; // 4
char padding[16 - sizeof(elementCount)];
T elements[elementCount]; // naturally aligned

char can be aligned to 1 on some systems, so you can see where the misalignment and size issues fit in. Built-in types don't need their destructors called, but everything else does.

justin
  • "T may be aligned different from a char." what does this mean? – Šimon Tóth Oct 06 '10 at 16:25
  • @Let_Me_Be the alignment for a type is determined by the compiler, based on the object's size and contents - it's generally a well hidden implementation detail for the platform you're targeting. it means that all objects are created on this byte boundary. for a char, (although implementation defined) it may (hypothetically) be placed on a 1 byte boundary. objects/classes typically have larger alignment values, such as 4 - so that means that a `T` on the stack may waste 3 bytes if a char precedes it (e.g. in a function). the compiler assumes all arguments are passed/created at natural... – justin Oct 06 '10 at 16:42
  • (continued...) alignment. if an object is not passed at natural alignment then the assumptions for addresses the compiler made will cause your program to operate in unusual ways because your program will be reading and writing from an address which may be a few bytes off. – justin Oct 06 '10 at 16:46
0

§5.3.4/10:

A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array. For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the most stringent alignment requirement (3.9) of any object type whose size is no greater than the size of the array being created.

This allows using char arrays allocated with new for placement-construction of appropriately sized objects of other types. The pre-allocated buffer has to be allocated on the heap. Otherwise you might run into alignment problems.

0

On some systems, memory access must be "aligned". For the sake of simplicity, this means that the address must be a multiple of some integer called the "alignment requirement" of the type (see 3.9/5 of the C++ standard).

So, for example, suppose sizeof(int) == 4:

int *intarray = new int[2];        // 8 bytes
char *charptr = (char *)intarray;  // legal reinterpret_cast
charptr += 1;                      // still 7 bytes available
*((int*)charptr) = 1;              // BAD!

The address held in charptr is not a multiple of 4, so if int has an alignment requirement of 4 on your platform, then the program has undefined behavior.

Similarly:

char ra[8];
int *intptr = reinterpret_cast<int*>(ra);
intptr[0] = 1;  // BAD!

The address of ra is not guaranteed to be a multiple of 4.

This is OK, though:

char *ra = new char[8];
int *intptr = reinterpret_cast<int*>(ra);
intptr[0] = 1;  // NOT BAD!

because new guarantees that char array allocations are aligned for any type that is small enough to fit in the allocation (5.3.4/10).
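As a rough sketch of the heap-based case, the following constructs ints with placement new inside a buffer obtained from `new char[]`. The function names `make_ints` and `release_ints` are hypothetical, and the code assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>      // placement new

// Hypothetical helper: build `count` ints, each initialised to `value`,
// inside a char buffer allocated with new[]. Such a buffer is aligned for
// any type small enough to fit in it, so no shifting is needed.
// The caller must hand the result to release_ints.
inline int* make_ints(std::size_t count, int value) {
    char* raw = new char[count * sizeof(int)];
    for (std::size_t i = 0; i < count; ++i) {
        // Construct each int in place at its slot in the raw buffer.
        ::new (static_cast<void*>(raw + i * sizeof(int))) int(value);
    }
    return reinterpret_cast<int*>(raw);
}

// Hypothetical helper: int has a trivial destructor, so it is enough to
// free the buffer with the same form of delete that matches new char[].
inline void release_ints(int* p) {
    delete[] reinterpret_cast<char*>(p);
}
```

For a type with a non-trivial destructor, each element would also have to be destroyed explicitly before the buffer is freed.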

Moving away from automatics, it's easy to see why the compiler is free not to align data members. Consider:

struct foo {
    char first[1];
    char second[8];
    char third[3];
};

If the standard guaranteed that second was 4-aligned (still assuming that int is 4-aligned), then the size of this struct would have to be at least 16 (and its alignment requirement at least 4). As the standard is actually written, a compiler is permitted to give this struct size 12, with no padding and no alignment requirement.
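The layout argument can be checked on a given compiler with a small sketch like the one below. The struct `foo_aligned`, the `layout` record, and the two `describe_*` functions are hypothetical additions for illustration; `alignas` assumes C++11 or later, and the exact numbers are what common implementations produce when int is 4 bytes and 4-aligned, not something the standard mandates.

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

struct foo {
    char first[1];
    char second[8];
    char third[3];
};

// Hypothetical variant in which `second` is forced onto an int boundary,
// as if the standard had guaranteed that alignment.
struct foo_aligned {
    char first[1];
    alignas(int) char second[8];
    char third[3];
};

// Hypothetical record of the layout facts discussed above.
struct layout {
    std::size_t size;           // sizeof the struct
    std::size_t alignment;      // alignof the struct
    std::size_t second_offset;  // offset of the `second` member
};

inline layout describe_foo() {
    return { sizeof(foo), alignof(foo), offsetof(foo, second) };
}

inline layout describe_foo_aligned() {
    return { sizeof(foo_aligned), alignof(foo_aligned),
             offsetof(foo_aligned, second) };
}
```

On such an implementation, `foo` is typically packed into 12 bytes with `second` at offset 1, whereas `foo_aligned` needs padding after `first` and grows to 16 bytes.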

Steve Jessop
  • I totally get the first one, but I have no idea why the second one is bad. – Šimon Tóth Oct 06 '10 at 16:31
  • Well, in the first example `charptr` is guaranteed not to be a multiple of 4, I made sure of that by adding 1 to an aligned value. In the second example, `ra` might be at an address that is a multiple of 4, but then again it might not. The standard doesn't care either way. – Steve Jessop Oct 06 '10 at 16:33
  • Yeah, what I did not realize is that first[1] can start on a non-aligned address. I didn't think that was possible. – Šimon Tóth Oct 06 '10 at 16:46
  • @Let_Me_Be: as I said over in the other question, I'm not aware of anything in the standard which forbids it. Which doesn't necessarily mean there isn't anything, but until someone produces it I'm sticking to my story :-) – Steve Jessop Oct 06 '10 at 16:54
0

char x[size*sizeof(T)]; might not take alignment into account, whereas T x[size]; will. Alignment can also be very important when working with SSE types, which require 16-byte alignment.
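One way to make a char buffer take alignment into account is the `alignas` specifier from C++11, which was not yet available when this thread was written. The sketch below is an assumption-laden illustration, not the code from the linked answer: `Point` and `RawArray` are hypothetical names, and the class only shows the alignment and placement-new idea.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>       // placement new
#include <utility>   // std::forward

// Hypothetical element type without a default constructor.
struct Point {
    int x, y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Hypothetical fixed-size storage. The raw buffer is a member, so it lives
// on the stack for an automatic variable, and alignas(T) makes it suitable
// for objects of type T.
template <typename T, std::size_t Size>
class RawArray {
public:
    RawArray() = default;
    RawArray(const RawArray&) = delete;
    RawArray& operator=(const RawArray&) = delete;

    // Destroy the constructed elements in reverse order.
    ~RawArray() {
        while (count_ > 0) {
            --count_;
            (*this)[count_].~T();
        }
    }

    // Construct the next element in place; at most Size elements fit.
    template <typename... Args>
    T& emplace_back(Args&&... args) {
        assert(count_ < Size);
        T* p = ::new (static_cast<void*>(slot(count_)))
            T(std::forward<Args>(args)...);
        ++count_;
        return *p;
    }

    // Access a constructed element (strict C++17 code would add std::launder).
    T& operator[](std::size_t i) { return *reinterpret_cast<T*>(slot(i)); }
    std::size_t size() const { return count_; }

private:
    unsigned char* slot(std::size_t i) { return buffer_ + i * sizeof(T); }

    alignas(T) unsigned char buffer_[Size * sizeof(T)];
    std::size_t count_ = 0;
};
```

For the SSE case, the same idea would apply with a 16-byte alignment on the buffer instead of `alignof(T)`.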

Necrolis