Understanding the handling Variable-Length Data with special focus on variable length arrays in C(99)

Question

Below is an excerpt from the red dragon book. It deals with the handling of variable length data items in the activation record of the procedure.

Pascal is almost unique among languages in requiring that arrays local to a procedure have a length that can be determined at compile time. More often, the size of a local array may depend on the value of a parameter passed to the procedure. In that case, the size of all the data local to the procedure cannot be determined until the procedure is called.

A common strategy for handling variable-length data is suggested in Fig. 7.15, where procedure p has three local arrays. The storage for these arrays is not part of the activation record for p; only a pointer to the beginning of each array appears in the activation record. The relative addresses of these pointers are known at compile time, so the target code can access array elements through the pointers.

Also shown in Fig. 7.15 is a procedure q called by p. The activation record for q begins after the arrays of p, and the variable-length arrays of q begin beyond that.

In the first portion of excerpt above the text talks about the feature of the Pascal programming language and then they talk about a possible implementation of the same. Now I am not acquainted with Pascal and thought of understanding the way the situation is handled in C.

I know that arrays can be created dynamically in C using the malloc and its sister functions, which causes allocation of memory on heap and the the pointer to the first byte is returned to us. Not an issue at all.

If we create arrays in C where the size of the array is a constant as in:

int function() {
   int a[100];
  }

then the array is placed in the local data section of the activation record shown below:

In the above example, the size of the array a is know at compile time. No issues with that.

Now the situation : "More often, the size of a local array may depend on the value of a parameter passed to the procedure. In that case, the size of all the data local to the procedure cannot be determined until the procedure is called."

Case 1:

Now let us consider the code below:

  int function(int n){
           int a[n];
  }
  int main() {
        function(10);
  }

Now in the situation above the size of the parameter n to the function can be known at compile time and hence the size of the array a though variable can be known at compile time. With this logic does C allocate the above array a in the manner as shown in fig 7.15 in the dragon book?

Case 2:

  int function(int n){
           int a[n];
  }
  int main() {
        int x;
        scanf("%d",&x);
        function(x);
  }

Now in the situation above the size of the parameter n to the function can be known only at runtime (?) or is it still the situation as above, i.e. known at compile time ? Now in this code, where is the array a allocated. Heap or stack?

I went here, here and here but did not find the explanation which I am looking for...

Here I am talking about C99

Your function in case 1 has external linkage (and therefore could be called from some other translation unit), so the compiler doesn't know what the value of `n` is, and it must therefore include code in the object file equivalent to the code in case 2. However, it is not obliged to *use* that code in case 1; it could inline the visible call, and of course in the inline code the value of `n` could be substituted with its known value. But since there is no longer a function call, I don't think the inline'd case is relevant to your question. — rici, Mar 31 '21 at 20:06
My rather dog-eared copy of the Dragon Book (first edition, but not the first printing and I couldn't tell you for sure when it was printed) does not have the sentence you quoted about Pascal being almost unique. What my copy says is "Languages like Pascal require arrays local to a procedure...". Also, that paragraph is a couple of pages earlier than the second paragraph in your quote. Certainly, when the book was written (in the early 1980s, I guess) C also required arrays to have a size fixed at compile-time. VLAs weren't standardized until 1999. — rici, Mar 31 '21 at 20:28
@rici I too don't know about the printing of the exact US edition as mine is a low price edition made for students in our country and that too is quite old (I have picked a cheap used copy). And yes the first paragraph was few pages earlier than the second paragraph. In between there were other stuffs about variable no.of parameter passing. But one thing could you please explain this part :`it must therefore include code in the object file equivalent to the code in case 2.` which `code` are you referring to, could you please say me...? — Abhishek Ghosh, Mar 31 '21 at 20:53
I'm not in the US either (I bought my copy in Europe). The code I'm referring to is the code generated by the compiler. That is to say, the compiler cannot make assumptions about what arguments will be given to an externally-linked function, so it must generate the same code for the function regardless of how it appears to be called in the translation unit being compiled. — rici, Mar 31 '21 at 20:56

score 2 · Accepted Answer · answered Mar 31 '21 at 18:21

In general, C implementations allocate VLAs on the stack, not in a separate heap data structure, but that is not actually required by the standard. Since the size of the VLA is only known at runtime, that means you get a variable-sized activation record for such a function. The details of how an activation record is allocated and laid out tends to be dependent on the target machine and its ABI, so trying break it down (as the diagram you have above) is going to be very different for different compilers and architectures.

Understanding the handling Variable-Length Data with special focus on variable length arrays in C(99)

1 Answers1