Can sizeof of struct be affected by array metadata?

Question

There have been a number of questions about the sizeof of a struct (not) being equal to the sum of the sizeofs of its members. Usually this is due to data alignment. This question is not concerned with data alignment, so please assume that the sizes of all types are multiples of the alignment (say 4 B).

As explained here, allocating an array will result in some metadata being stored about the size of the allocated array. Let's say we had the following code:

const int size = 10;

struct X {
    int someInt;
    int array[size];
};

struct Y {
    int someInt;
    T array[size];
};

Since the size is known at compile time, the compiler should be smart enough to determine that there is no need to store any metadata in the case of X. The compiler could be smart enough to follow this reasoning even in the case of Y (there could be a difference between C and C++ here, since in C++ there is the additional requirement of calling destructors for the individual instances of T).

My question is: am I guaranteed that sizeof(X) == (size + 1) * sizeof(int) or is it compiler-specific? Or more generally, is sizeof(Y) == size * sizeof(T) + sizeof(int)?

EDIT: To hopefully clarify things a bit: the question is about both C and C++. Also the original motivation for asking this question is this. If I run

X *foo = new X[100];

or its C equivalent somewhere in the code, will it create a continuous block of memory of size 100 * (size + 1) * sizeof(int)?

You aren't "allocating" any arrays in a way that would necessitate any meta-data. — juanchopanza, Apr 04 '17 at 19:55
There is no *metadata*. These are arrays and not std::vectors. — Ajay Brahmakshatriya, Apr 04 '17 at 19:56
Your link explanation is for dynamic allocation, there is no metadata in your case. — Tony J, Apr 04 '17 at 19:59
The text in your link is talking about dynamic allocation with `new []`. — molbdnilo, Apr 04 '17 at 20:00
C: Tossing aside alignment/packing issues, `sizeof(Y) == size * sizeof(T) + sizeof(int)` is true. If some form of _metadata_ existed, it would not contribute to the `sizeof` nor would it exist between `someInt` and `array`. Yet _why_ is this important? What is code attempting that relies on this feature? — chux - Reinstate Monica, Apr 04 '17 at 20:01
Arrays are constructs with constant size fixed at compile time - why would there be a need for any metadata? Sure, during compilation the *compiler* probably keeps some metadata around, but in the final binary it's just a chunk of memory with no metadata needed since the compiler will have already resolved any accesses into it into fixed values. — Jesper Juhl, Apr 04 '17 at 20:14
@AjayBrahmakshatriya C implementation commonly uses metadata to manage the memory space. Lots of object files embed metadata for debugging purposes. C specifies more of what is and not what is not. The point is that the meta data is not available in standard C code. C does specify the contiguousness of objects which seems to be OP's concern. IOW, does OP's yet to be stated need for contiguousness, get messed up by potential metadata? — chux - Reinstate Monica, Apr 04 '17 at 20:19
@chux I see, OP means the implementation metadata and not something in the language. That as you mentioned won't interfere with the struct packing since C enforces how structs need to be packed. Also most implementation metadata (like md for heap allocated chunks) would always be out of bounds of the accessible memory and in general should not interfere with any data structures, be it structs or arrays or anything else. — Ajay Brahmakshatriya, Apr 04 '17 at 20:24
@AjayBrahmakshatriya Agreed. Access to some MD like [`malloc_usable_size`](https://linux.die.net/man/3/malloc_usable_size) is certainly implementation defined. — chux - Reinstate Monica, Apr 04 '17 at 20:28
@chux I think I was incorrect in saying *C enforces how structs need to be packed*. I recall that is implementation defined. So in theory it is possible that there is some field where md is stored. But in practice I am sure no compiler would do that. — Ajay Brahmakshatriya, Apr 04 '17 at 20:30
`X *foo = new X[100];` might/might not create "a continuous block of memory of size `100 * (size + 1) * sizeof(int)`". Very likely yes, but it depends on things. Knowing why this is important to code may help. — chux - Reinstate Monica, Apr 05 '17 at 16:18
@chux Basically I have an array of `int`s that can be thought of as partitioned into smaller chunks. Something like treating an array of `m * n` elements as a representation of an `m x n` matrix. What I wanted to know is whether I can create an "alias" for these chunks and allocate an array of those without losing control over memory (alignment, ...). I asked more out of curiosity, to know C and C++ better. Nothing serious depends on this. — insert_name_here, Apr 06 '17 at 07:42
@insert_name_here How does knowing if `int someInt` and `int array[size];` are packed next to each other relate to your [`m*n` goal](http://stackoverflow.com/questions/43216468/can-sizeof-of-struct-be-affected-by-array-metadata?noredirect=1#comment73566679_43216468)? — chux - Reinstate Monica, Apr 06 '17 at 13:37
@chux If they weren't (there was some "metadata" containing the size of `array` between them or whatever), then there wouldn't be "just consecutive ints" in memory, but rather that plus some stuff in between. Therefore the allocated memory wouldn't be **identical** and it wouldn't be just an "*alias*" for the same underlying structure. If so, you can't use this approach (and save yourself from calculating a bunch of offsets) without (possibly) changing how the data is stored in memory. — insert_name_here, Apr 06 '17 at 14:23
@insert_name_here Sounds like you agree with the limitation of the posted code as commented [here](http://stackoverflow.com/questions/43216468/can-sizeof-of-struct-be-affected-by-array-metadata?noredirect=1#comment73543243_43216468). The "... save yourself from calculating a bunch of offsets" is amiss as it supposes that the alternative to the posted approach is tedious. As with many problems, clearly identifying the coding goals, rather than assessing the pros/cons of a single candidate approach, is useful for getting a better answer. — chux - Reinstate Monica, Apr 06 '17 at 14:34

Petr Skocik · Accepted Answer · 2017-04-04T20:32:47.480

3

C arrays in common implementations don't store any metadata around them. However, padding may be added to structs so that a_struct_ptr + 1 has sufficient alignment for a_struct.

In the case of the first struct ({ int someInt; int array[size]; }), no padding should be required, so

sizeof(X) == (size + 1) * sizeof(int)

should hold (though, I don't think compilers are obligated to guarantee it).
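
If you want the build to enforce this rather than rely on it, a pair of static_asserts will do. This is only a minimal sketch, assuming C++11 or later; `check_heap_array` is a made-up helper name, and `size` is declared `constexpr` here instead of `const int` as in the question:

```cpp
#include <cassert>
#include <cstddef>

constexpr int size = 10;

struct X {
    int someInt;
    int array[size];
};

// Compile-time checks: the build fails if this implementation pads X.
static_assert(sizeof(X) == (size + 1) * sizeof(int),
              "X contains padding on this implementation");
static_assert(offsetof(X, array) == sizeof(int),
              "there is a gap between someInt and array");

// An array of 100 X objects occupies exactly 100 * sizeof(X) bytes.
static_assert(sizeof(X[100]) == 100 * sizeof(X),
              "array elements are not tightly packed");

// Hypothetical helper: allocates n objects with new[] and checks that
// element i starts exactly i * sizeof(X) bytes after element 0.
void check_heap_array(std::size_t n) {
    X* foo = new X[n];
    const char* base = reinterpret_cast<const char*>(foo);
    for (std::size_t i = 0; i < n; ++i) {
        assert(reinterpret_cast<const char*>(&foo[i]) == base + i * sizeof(X));
    }
    delete[] foo;
}
```

Any bookkeeping the allocator keeps for `new X[n]` lives outside the `n * sizeof(X)` bytes that `foo` points to, so it does not show up in these checks.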

In the case of the second struct, the alignment requirements of T and int may cause padding to be added to the struct, which would invalidate your equation.
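
To see how much padding a given T costs, you can measure it. A rough sketch, assuming C++11; turning Y into a template and the helper `padding_in_Y` are my own additions for illustration, not something from the question:

```cpp
#include <cstddef>

constexpr int size = 10;

// Y from the question, made a template so different T can be tried.
template <typename T>
struct Y {
    int someInt;
    T array[size];
};

// Bytes of Y<T> that belong to neither member, i.e. padding
// (before the array, after it, or both).
template <typename T>
constexpr std::size_t padding_in_Y() {
    return sizeof(Y<T>) - (sizeof(int) + size * sizeof(T));
}

// The array member always starts at an offset suitable for T.
static_assert(offsetof(Y<double>, array) % alignof(double) == 0,
              "array member is misaligned");
```

On a typical x86-64 ABI, where alignof(double) is 8, `padding_in_Y<double>()` should come out as 4 (the gap after `someInt`), while `padding_in_Y<int>()` should be 0. Other ABIs may give different numbers.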

edited Apr 04 '17 at 20:32

answered Apr 04 '17 at 20:03

Petr Skocik


"C arrays don't store any metadata" Hmmm. Certainly C does not specify any metadata. A compiler could store meta-data in some fashion though with each object. The issue is: is such added data allowed to be between members of a `struct`? IMO: no. – chux - Reinstate Monica Apr 04 '17 at 20:08
I think that's an empirical observation that compilers "don't store metadata". If they did, it would be associated with the type rather than variables of the type — a bit like v-tables in C++. At most, there might be a pointer to somewhere in the tail of the structure in compiler-applied padding. But it is pretty unlikely to be an actual problem. Yes, in theory, a compiler could add information into a structure. No, in practice, compilers don't do it because there's no need/benefit. – Jonathan Leffler Apr 04 '17 at 20:17
Padding can very well be added for e.g 64 bit alignment or even wider cache-line alignment. – too honest for this site Apr 04 '17 at 20:21
@JonathanLeffler Hmm, I suppose if a compiler wanted to add some MD between structure members, it could make `max_align_t` very wide and hide it in padding, except for adjacent `max_align_t` members. Still, unclear why this is important to OP. – chux - Reinstate Monica Apr 04 '17 at 20:24
I have a feeling, OP was referring to metadata which is part of the language. This is because he writes that it should not be there because size of the array is known. And also understands that some things can be compiler specific (last line of the question). And the fact that he asks if this is compile specific clearly implies he is not talking about implementation specific md. I think we are reading too much into the question. – Ajay Brahmakshatriya Apr 04 '17 at 20:37
@AjayBrahmakshatriya I suspect you are correct, yet OP has been silent since posting about 1 hour ago. – chux - Reinstate Monica Apr 04 '17 at 20:47
@chux I have edited the question to try to clarify things. Sorry for not communicating, this is my first SO question and I always lived under the impression that response time is in span of days, not minutes. As noted [here](http://stackoverflow.com/questions/3575458/does-new-call-default-constructor-in-c), array initialization in C++ calls constructor for non-primitive types. Therefore I would *assume*, that there has to be some "metadata" to indicate the number of times destructor is called for array members when leaving the scope. – insert_name_here Apr 05 '17 at 06:02
@insert_name_here If a type needs to keep track of the number of times a destructor is called, the type needs record that as its own data. Code will only call the destructor once, by C++ design, unless code has additional explicit destructor calls. – chux - Reinstate Monica Apr 05 '17 at 12:41
@chux can I interpret it in the way that compiler will provide destructor `~Y()` where it will call destructors in a for loop for every instance of `T` inside of `Y`? – insert_name_here Apr 05 '17 at 13:05
@PSkocik thank you. Could you possibly include some thoughts on the edited part of the question in your answer or in here? – insert_name_here Apr 05 '17 at 13:21
@insert_name_here Arrays are guaranteed to be contiguous. `new X[100];` will allocate a contiguous block of X, size = sizeof(X)*100, initialized with the X::X() ctor, if it exists. The dtors (if any) will be run upon `delete []foo;`. – Petr Skocik Apr 05 '17 at 13:27
@PSkocik So if I make an array of `100 * sizeof(int) * (size + 1)` ints or `100` `X`s, the allocated memory will be completely identical? (Sorry, last question.) – insert_name_here Apr 05 '17 at 13:34
As I said in my answer, it's very likely that `struct X` will be sized `sizeof(int)*(size+1)`. Sane compilers won't pad it. It's conceivable, though, that an insane and still conformant compiler would pad it. I don't think the standard prohibits compilers from adding arbitrary padding. – Petr Skocik Apr 05 '17 at 13:40

score 0 · Answer 2 · edited Jun 20 '20 at 09:12

Meta data can hide between struct members. I have never seen this used for meta-data, yet a compiler could play with the alignments of types, up to a point, to provide discontinuous memory between two int members by obliging the alignment of an int to be greater than the width of an int, such as 4 bytes. Whether the compiler uses this for meta-data, performance padding or spite is irrelevant. The point is that it may exist.

C11 provides max_align_t, so types will not natively exceed that alignment per the C spec.

A fundamental alignment is represented by an alignment less than or equal to the greatest alignment supported by the implementation in all contexts, which is equal to _Alignof (max_align_t). C11dr §6.2.8 2

max_align_t which is an object type whose alignment is as great as is supported by the implementation in all contexts; §7.19 2

So any structure member that is smaller than max_align_t is subject to padding/meta data in a potential context.
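
Mainstream compilers have alignof(int) equal to sizeof(int), so the effect cannot be shown with a plain int there. It can be simulated, though. In the sketch below, `WideInt`, `Pair` and `gap_between_members` are hypothetical names, and alignas stands in for a compiler that chooses a wide alignment by itself (C++11 assumed):

```cpp
#include <cstddef>

// Hypothetical stand-in for an int whose alignment exceeds its width.
struct alignas(alignof(std::max_align_t)) WideInt {
    int value;
};

struct Pair {
    WideInt first;
    WideInt second;
};

// The requested alignment is honoured, and sizeof is a multiple of it.
static_assert(alignof(WideInt) == alignof(std::max_align_t),
              "alignas was not applied");
static_assert(sizeof(WideInt) % alignof(WideInt) == 0,
              "size must be a multiple of alignment");

// Bytes between the end of first.value and the start of second.
// This is the space where padding (or anything else) could live.
constexpr std::size_t gap_between_members() {
    return offsetof(Pair, second) - sizeof(int);
}
```

Whenever alignof(max_align_t) is larger than sizeof(int), `gap_between_members()` is nonzero: the two int values are no longer adjacent in memory, which is exactly the discontinuity described above.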
