Is there a good reason why VLA are not permitted in pointers in structs?

Question

Here is a way to define a Matrix type

typedef struct {
    int nr, nc;
    double *elem;
} Matrix;

I would like to define this

typedef struct {
    int nr, nc;
    double elem[nr][nc];
} Matrix;

It would be nice, because I would not have to worry about indexes. That's the reason VLA are useful in the first place, since they only do transparently what would be easy with index arithmetic.

Of course, the above is not possible, if only because the size of the struct would not be well defined. Then, I would still be happy with:

typedef struct {
    int nr, nc;
    double (*elem)[nc];
} Matrix;

Now, the matrix data is stored as a pointer, like in the non-VLA case. But the arithmetic could still be done by the compiler. The definition only tells it's some kind of pointer to double data, with the doubles arranged in an array of width nc.

It seems that it's not permitted either by the standard, and I wonder why, since it's easy to do the same by transtyping. For example, using the first definition (with double *), I could do

double get(Matrix *a, int i, int j) {
    int nc = a->nc;
    double (*p)[nc] = (double (*)[nc])a->elem;
    return p[i][j];
}

Of course, it's not very interesting here, since there is only one access to elem, but it could be if there are many.

So, my question, with the hope that it's on topic: what's the very reason of prohibiting the third definition?

I could imagine that it's dangerous since it's not guaranteed that nc handles the correct value, but this is dangerous anyway with pointers, so it does not look like a good reason.

why not use an array with one element? and linearize your matrix? — Alexander Oh, Sep 14 '15 at 13:12
@Olaf and Alex. I'm not sure I understand. How do you write this to solve the problem, exactly? — , Sep 14 '15 at 13:15
The problem is that, apart from structures containing a flexible array member, the size of a structure is fixed at compile time, so that the compiler knows how to lay out an array of the structure, for example. — Jonathan Leffler, Sep 14 '15 at 13:19
@JonathanLeffler That's why the second definition is wrong. But the third uses a pointer, so the size is known (the doubles are not part of the struct). It's just adding some information to a `double *` so that the arithmetic is done transparently. — , Sep 14 '15 at 13:22
The size of the type in the flexible array member has to be known at compile time, and is not in when you try to build the `nc` in to type. — Jonathan Leffler, Sep 14 '15 at 13:26
side-note: `(*elem)[nc];` <-- you don't need the brackets. C will first look right (`[nc]`), then left `double *` for the type — Elias Van Ootegem, Sep 14 '15 at 13:27
@EliasVanOotegem: The parentheses are crucial. It's the difference between an array of `double *` and a pointer to an array of `double`. — Jonathan Leffler, Sep 14 '15 at 13:29
@EliasVanOotegem Unless I'm badly wrong, `double *a[nc]` is an array of pointers to double, whereas `double (*a)[nc]` is a pointer to a two dimensional array whose second dimension is known to be nc. — , Sep 14 '15 at 13:29
@Jean-ClaudeArbaut: `double (*a)[nc]` is a pointer to a 1-dimensional array, but you're correct that the parentheses are crucial. Your problem is that the `nc` cannot be used as the dimension of an array in a structure, even though it could be used with a local variable or dynamically allocated variable (array) outside a structure. — Jonathan Leffler, Sep 14 '15 at 13:33
@JonathanLeffler Yes, sorry, it's a pointer to a 1d array, but it's a bit tricky since usually I use double* for 1d arrays, and double(*)[n2][...][nm] for m-dim array with n1 not known (that is, though it's a pointer to an array of dim m-1, you can work as if it were a m-dim array). I hope I'm clear :) — , Sep 14 '15 at 13:37
@JonathanLeffler So, as I view it, I can't "bound" the array dimension to a struct field, but I still can do it whenever I use the struct in a function, by transtyping to a VLA type, using the struct field (with the struct passed as an argument). It's sad, because I don't see what's wrong with telling the compiler I want this trick to be done everywhere, and that's what the third definition would really do. I just wanted to know if there is a good reason for that. — , Sep 14 '15 at 13:42
With a VLA, currently the size is evaluated when the declaration executes. For a struct, when is that? — user253751, Sep 14 '15 at 13:44
@Jean-ClaudeArbaut: Nips, wasn't paying attention (I thought you wanted an array of pointers). But `double (*elem)[nc]` is a pointer to to a 1D array. The pointer _can_ point to an allocated block of memory that you can use as an array, but it needn't be — Elias Van Ootegem, Sep 14 '15 at 13:50
@immibis That's only important if VLA are only possible inside a function, thus you need to enter the function to know the size (either of an argument, using another argument, or a local variable, which will be allocated upon function entry). But VLA are really about computing indexes transparently (otherwise, it's the same data as a simple double*, and you could allocate "by hand"). Then, it's perfectly wise to encapsulate the size in a struct, instead of passing as a separate argument. That's like Fortran77 arrays vs Fortran90 arrays: the latter are passed together with dimensions. — , Sep 14 '15 at 13:50
@Jean-ClaudeArbaut Mainly it introduces all sorts of weird cases. Imagine `printf("%p\n", sizeof(*s->elem)); s->nc++; printf("%p\n", sizeof(*s->elem));` - did we just change an object's type at runtime?! — user253751, Sep 14 '15 at 13:52
@immibis Well, VLA type do change at runtime, between calls. Here it could change during the same call, by (wrongly) changing `nc`, though the data would not change. Maybe it's too weird. — , Sep 14 '15 at 14:10
@Jean-ClaudeArbaut Even with normal VLAs an object has a fixed size (and a pointer points to an object of fixed size) after it's created, so this is even weirder. — user253751, Sep 14 '15 at 14:11

score 4 · Accepted Answer · answered Sep 14 '15 at 14:00

Does this meet your requirements? It stores a void * in the structure, and the access functions cast that to a pointer to a 2D VLA and use that. GCC 5.2.0 on Mac OS X 10.10.5 compiles it cleanly, and valgrind (3.11.0-SVN from November 2014 or thereabouts) gives it a clean bill of health.

#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    int nr, nc;
    void *data;     // Actually double a[nr][nc]
} Matrix;

static double get(Matrix *a, int i, int j)
{
    double (*array)[a->nr][a->nc] = a->data;
    return (*array)[i][j];
}

static void set(Matrix *a, int i, int j, double v)
{
    double (*array)[a->nr][a->nc] = a->data;
    (*array)[i][j] = v;
}

static Matrix *mat_alloc(int nr, int nc)
{
    Matrix *m = malloc(sizeof(*m));
    if (m != 0)
    {
        m->nr = nr;
        m->nc = nc;
        m->data = malloc(nr * nc * sizeof(double));
        if (m->data == 0)
        {
            free(m);
            m = 0;
        }
    }
    return m;
}

static void mat_free(Matrix *m)
{
    free(m->data);
    free(m);
}

int main(void)
{
    int nr = 3;
    int nc = 5;

    Matrix *m = mat_alloc(nr, nc);
    if (m == 0)
    {
        fprintf(stderr, "Matrix allocation for %dx%d matrix failed\n", nr, nc);
        exit(1);
    }

    for (int i = 0; i < nr; i++)
    {
        for (int j = 0; j < nc; j++)
        {
            double v = (i * (nc + 1)) + j + 1;
            set(m, i, j, v);
            printf("Set: [%d,%d] = %4.1f\n", i, j, v);
        }
    }

    for (int j = 0; j < nc; j++)
    {
        for (int i = 0; i < nr; i++)
            printf("Get: [%d,%d] = %4.1f\n", i, j, get(m, i, j));
    }

    mat_free(m);
    return 0;
}

I'm not sure whether there's a neat way to lose the (*array) part of the notation in the access functions. I'd prefer it if there was one (other than using array[0][i][j], that is).

Example run

Set: [0,0] =  1.0
Set: [0,1] =  2.0
Set: [0,2] =  3.0
Set: [0,3] =  4.0
Set: [0,4] =  5.0
Set: [1,0] =  7.0
Set: [1,1] =  8.0
Set: [1,2] =  9.0
Set: [1,3] = 10.0
Set: [1,4] = 11.0
Set: [2,0] = 13.0
Set: [2,1] = 14.0
Set: [2,2] = 15.0
Set: [2,3] = 16.0
Set: [2,4] = 17.0
Get: [0,0] =  1.0
Get: [1,0] =  7.0
Get: [2,0] = 13.0
Get: [0,1] =  2.0
Get: [1,1] =  8.0
Get: [2,1] = 14.0
Get: [0,2] =  3.0
Get: [1,2] =  9.0
Get: [2,2] = 15.0
Get: [0,3] =  4.0
Get: [1,3] = 10.0
Get: [2,3] = 16.0
Get: [0,4] =  5.0
Get: [1,4] = 11.0
Get: [2,4] = 17.0

I have tried something like this, but thought it was not very clean: the `array[0][i][j]` makes me view it as a 3d array. Is there something wrong in using a pointer as the first index? It does not allow bound checks though. — , Sep 14 '15 at 14:07
I agree that `array[0][i][j]` is 'misleading'; that's why the code is written using `(*array)[i][j]`. I'm not clear what you mean by 'it does not allow bound checks'. It would be trivial to validate `i` and `j` against `a->nr` and `a->nc` — in fact, I originally had `#include ` in the headers with the intent to verify. I agree that it is not entirely desirable. However, it would be perfectly feasible to write a matrix multiplication function, for example, that took in two matrices and returned an appropriately sized result matrix, and the body would just use matrix notation. — Jonathan Leffler, Sep 14 '15 at 14:11
What I mean is that a clever compiler could check bounds either at compile time or at runtime, using known bounds (`nr` and `nc`). However, when the first dimension is a pointer, you know nothing about it: it's like an infinite array of width `nc`. That's because, in C, a[n] is valid both when a is an array and when it's a pointer (btw, n[a] is equally valid). — , Sep 14 '15 at 14:23

score 1 · Answer 2 · answered Sep 14 '15 at 15:41

I believe that within a function in which a local variable nc is defined, you could use a typedef to create a local type double (*arr)[nc], and then cast a *double to that type. I believe such a cast would be legitimate for any *double that identifies a sufficiently-long sequence of double values, without regard for whether it was created using the same type as is defined within the function [if multiple functions each define their own array type, the compiler wouldn't recognize those types as equivalent, but it shouldn't matter]. I'm not 100% sure there wouldn't be Strict Aliasing issues, but I don't think there should be.

Otherwise, a fundamental difficulty is that a typedef involving a VLA creates a type using values that exist at a specific moment in time, and that can only occur for typedefs which are evaluated as executable statements, which in turn can only happen when typedefs are embedded within functions. Further, any identifiers used within array dimensions will be evaluated in the context of the enclosing function, rather than in the context of the partially-defined type.

Is there a good reason why VLA are not permitted in pointers in structs?

2 Answers2

Example run

Linked