Variable Length Array overhead in C++?

Question

Looking at this question: Why does a C/C++ compiler need know the size of an array at compile time ? it came to me that compiler implementers should have had some times to get their feet wet now (it's part of C99 standard, that's 10 years ago) and provide efficient implementations.

However it still seems (from the answers) to be considered costly.

This somehow surprises me.

Of course, I understand that a static offset is much better than a dynamic one in terms of performance, and unlike one suggestion I would not actually have the compiler perform a heap allocation of the array since this would probably cost even more [this has not been measured ;)]

But I am still surprised at the supposed cost:

if there is no VLA in a function, then there would not be any cost, as far I can see.
if there is one single VLA, then one can either put it before or after all the variables, and therefore get a static offset for most of the stack frame (or so it seems to me, but I am not well-versed in stack management)

The question arise of multiple VLAs of course, and I was wondering if having a dedicated VLA stack would work. This means than a VLA would be represented by a count and a pointer (of known sizes therefore) and the actual memory taken in an secondary stack only used for this purpose (and thus really a stack too).

[rephrasing]

How VLAs are implemented in gcc / VC++ ?

Is the cost really that impressive ?

[end rephrasing]

It seems to me it can only be better than using, say, a vector, even with present implementations, since you do not incur the cost of a dynamic allocation (at the cost of not being resizable).

EDIT:

There is a partial response here, however comparing VLAs to traditional arrays seem unfair. If we knew the size beforehand, then we would not need a VLA. In the same question AndreyT gave some pointers regarding the implementation, but it's not as precise as I would like.

@Matthieu: you're thinking seems sound to me... VLA only suggests an overhead when there are more than 1 (simply by putting it "after" known-size elements, and then there could be a pointer or extra adjustment in the known-offset stack to indicate where subsequent VLAs start. Can't see a second stack helping at all though. — Tony Delroy, Dec 03 '10 at 09:19
@Tony: I was wondering how is the stack implement, if the implementation means that only the current top of the stack is known, then you have a dynamic offset computation, it seems, unless you use a second stack to store the VLA elements. If you know both the top and bottom of the current frame, then for the single VLA element it's easy. Anyway I'd just like to know how it is done (currently) and what is the "cost". — Matthieu M., Dec 03 '10 at 11:51
Asking "why feature A instead of feature B?" in the design of a particular language is not normally useful. The answers are usually subjective and uninformed. A question about how VLAs are implemented in C would be objective, and could be satisfactorily answered. — David Thornley, Dec 03 '10 at 16:59
@David: Quoting myself "I'd really like to know how VLAs are implemented in gcc / VC++, and if the cost is really that impressive.", sorry it wasn't obvious I'll make it more question-like. — Matthieu M., Dec 03 '10 at 17:44
To the best of my knowledge, Visual C++ does not support VLAs. — James McNellis, Dec 03 '10 at 18:56
VC++ does not support VLAs- you will have to _alloca() them, which prevents any kind of useful vla encapsulation as the user must always allocate the memory manually off their stack frame. — Puppy, Dec 03 '10 at 20:54
VLA shouldn't be a problem to implement if you have a frame pointer together with the stack pointer, am I right...? — Kos, Dec 03 '10 at 23:00
@Matthieu M.:I doubt it's because it's too costly to implement, and more because C++ already has a much safer dynamic array mechanism (`std::Vector`), while C99's variable length arrays can easily cause stack overflows and must be carefully controlled whenever you care about doing things securely (in particular, you must **never** use a user supplied value for a VLA, because that could blow the stack). Plus, in C++ land, you have things like destructors which could potentially need to be called on those VLAs, and that could be **very** complicated ;) — Billy ONeal, Dec 03 '10 at 23:01
As an interesting note: std::vector<> is compiled to exactly the same as arrays in some scenarios (read: when the compiler knows enough and the dynamics are constrained) — Sebastian Mach, Jul 04 '11 at 13:50
[How does GCC implement variable-length arrays?](https://stackoverflow.com/q/21182307/995714) — phuclv, Oct 16 '17 at 01:30

Yakov Galka · Accepted Answer · 2010-12-06T09:58:24.557

4

How VLAs are implemented in gcc / VC++ ?

AFAIK VC++ doesn't implement VLA. It's a C++ compiler and it supports only C89 (no VLA, no restrict). I don't know how gcc implements VLAs but the fastest possible way is to store the pointer to the VLA and its size in the static portion of the stack-frame. This way you can access one of the VLAs with performance of a constant-sized array (it's the last VLA if the stack grows downwards like in x86 (dereference [stack pointer + index*element size + the size of last temporary pushes]), and the first VLA if it grows upwards (dereference [stackframe pointer + offset from stackframe + index*element size])). All the other VLAs will need one more indirection to get their base address from the static portion of the stack.

[ Edit: Also when using VLA the compiler can't omit stack-frame-base pointer, which is redundant otherwise, because all the offsets from the stack pointer can be calculated during compile time. So you have one less free register. — end edit ]

Is the cost really that impressive ?

Not really. Moreover, if you don't use it, you don't pay for it.

[ Edit: Probably a more correct answer would be: Compared to what? Compared to a heap allocated vector, the access time will be the same but the allocation and deallocation will be faster. — end edit ]

edited Dec 06 '10 at 09:58

answered Dec 03 '10 at 22:55

Yakov Galka

70,775
16
139
220

2

MSVC++ does support restrict (although it is spelled `__restrict`). – Crashworks Dec 03 '10 at 23:49
thanks, I thought VC++ supported them, my mistake. wrt to the proposed implementation, it seems there would be a cost for accessing local variables as well (ie dynamic computation of the offset), or am I mistaken ? – Matthieu M. Dec 06 '10 at 07:29
@Matthieu: Of course there is, it costs you one indirection: `[stack-frame-base-pointer + local-variable-offset]`. But it's not connected to VLAs. The point is that for VLA you need **two** indirection. – Yakov Galka Dec 06 '10 at 09:54
I should have clarified, sorry, I am not really interested in the cost of accessing VLAs elements, I guess it to be somewhere between static array and vector. I am more interested in the increase of cost in accessing the other local variables because of the presence of the VLA(s). The question is then: would it happen that using a non-resizable vector could be faster because then the other local variables have a "fixed" position ? Or is it such a degenerate case that if I have the choice, I'd better use VLA (which would be an argument to implement them in compilers). – Matthieu M. Dec 06 '10 at 12:23
@Matthieu: No, because the compiler can put them in the static portion of the stack-frame. – Yakov Galka Dec 06 '10 at 12:41

score 0 · Answer 2 · answered Dec 03 '10 at 23:40

If it were to be implemented in VC++, I would assume the compiler team would use some variant of _alloca(size). And I think the cost is equivalent to using variables with greater than 8-byte alignment on the stack (such as __m128); the compiler has to store the original stack pointer somewhere, and aligning the stack requires an extra register to store the unaligned stack.

So the overhead is basically an extra indirection (you have to store the address of VLA somewhere) and register pressure due to storing the original stack range somewhere as well.

Variable Length Array overhead in C++?

2 Answers2

Linked

Related