
I am trying to create an array with the following requirements:

  • internal linkage (static global)
  • size only known at runtime
  • elements accessed using the [][] syntax
  • stored on the heap

I have been using the following code to create a VLA which meets almost all of my requirements, but this array is limited to the current scope rather than having internal linkage.

int (*array_name)[columns] = malloc( sizeof(int[rows][columns]) );

Is there any way to create an array which meets all of my needs?


Edit - "static global" is an incorrect term for this type of variable scope, "internal linkage" is correct. Please look at the comments to this question for an explanation.

JShorthouse
  • You don't know rows when globals are created, so no. – stark Jun 15 '20 at 17:27
  • This is generally regarded as poor design, and you should avoid it. If it were necessary, you could achieve the desired effect by defining `int *pointer`, initializing `pointer` by allocating memory when the size is known, and defining a macro `#define array_name ((int (*)[columns]) pointer)`. That will cause `array_name` to be substituted with a conversion of `pointer` to the desired type, which can be accessed as `array_name[i][j]`. `columns` will also have to be published externally. – Eric Postpischil Jun 15 '20 at 17:35
  • @EricPostpischil Which part is poor design, the requirements of the array itself or the fact that it is global? I'm new to C and I think I may have used the wrong word, I think "static" is what I meant. I don't want the array to be visible to the entire program, just visible within one source file so that multiple functions can use it. – JShorthouse Jun 15 '20 at 17:41
  • @EricPostpischil I think I will go with the solution you have described here. Would you be able to write it up as an answer so I can accept it? – JShorthouse Jun 15 '20 at 17:54
  • Visible to the entire program is the bad part. Variable length arrays are fine. Static (internal linkage, limited to one translation unit) is better than external. It should also be avoided, but there are more reasonable uses of static than external. However, both static and external present the same problems with setting the type of the variable length array; you will have to use either manual indexing or casting. – Eric Postpischil Jun 15 '20 at 17:54
  • @EricPostpischil Alright, thanks for the clarification and advice – JShorthouse Jun 15 '20 at 17:55
  • @EricPostpischil "*Visible to the entire program is the bad part.*" - But it still keeps a requirement. No place for personal opinions if your employer wants it that way or you need to have the array visible in another TLU. – RobertS supports Monica Cellio Jun 16 '20 at 09:11
  • @RobertSsupportsMonicaCellio: Identifiers being visible to the entire program is objectively bad, not a personal opinion, as it manifestly increases the opportunity for mistakes, as well as for name conflicts. – Eric Postpischil Jun 16 '20 at 10:44
  • @JShorthouse I noticed that you re-edited the question with "static global". "Global" means it has external linkage (global scope), visible throughout the whole program; "static" means it has internal linkage (file scope), visible only in the file where it was defined. You can't use both, they contradict each other. No variable or function can be static and global at the same time. That's why a lot of confusion came up when answering this question as well. Don't constantly intermix those. – RobertS supports Monica Cellio Jun 16 '20 at 21:45
  • @RobertSsupportsMonicaCellio Is "internal linkage" the only way to refer to this type of variable then? I searched "static global" and found many questions and answers on this site using this exact phrasing so I assumed it was a common thing. I assume "static" by itself is not synonymous with "internal linkage" as obviously static variables can be declared only within a certain function. – JShorthouse Jun 16 '20 at 21:51
  • Also from a "searchability" perspective, as a new C user I would not know to search for "internal linkage" to find this question. I was not even aware of this term before asking this question. I think "static global" will make it easier for others to find this question, as those are the terms that I used to try and find existing questions before creating this one. – JShorthouse Jun 16 '20 at 21:58
  • @JShorthouse I see now what you mean. Well, let's go into depth. What "static global" refers to is a variable or function that is static (visible in its file only) but is defined at what we call global scope (outside of any function), where identifiers usually have external linkage. The term "static global" is IMHO totally misplaced and only creates ambiguity about the meaning of global scope (inside of a file). But from a technical standpoint, something can't be both global and static (local) at the same time. This makes sense, doesn't it? – RobertS supports Monica Cellio Jun 16 '20 at 22:12
  • "*I assume "static" by itself is not synonymous with "internal linkage" as obviously static variables can be declared only within a certain function.*" - The `static` keyword has two uses in C. Local variable persistence (storage class) - inside of a function - **or** clarify linkage (dependent on the use case). It's a little confusing. – RobertS supports Monica Cellio Jun 16 '20 at 22:23
  • Take a look here as well: https://stackoverflow.com/questions/572547/what-does-static-mean-in-c – RobertS supports Monica Cellio Jun 16 '20 at 22:26
  • @JShorthouse: As much as novices might search for “static global” rather than “internal linkage,” it is wrong. [“Global” means visible throughout the program.](https://en.wikipedia.org/wiki/Glossary_of_computer_science#G) “Static” has several meanings both as an English word and as a technical computer science term. Primarily, it means unchanging, persistent, or not needing to be refreshed, and this alludes to the fact that a `static` object in a function maintains its value between calls, whereas a non-`static` object is not specified to maintain its value… – Eric Postpischil Jun 16 '20 at 23:23
  • … The secondary effect of the `static` keyword as causing visibility only within the translation unit is an accident of history, a consequence of how the language developed. Were we designing a language freshly with today’s knowledge, we would use different keywords for objects inside functions that persist (`static`) and identifiers in translation units that are not published outside them (perhaps `internal`). The term “internal linkage” is defined in the C standard, so it is preferred both because of that and because it is more accurately descriptive of the meaning. – Eric Postpischil Jun 16 '20 at 23:26
  • @RobertSsupportsMonicaCellio @EricPostpischil Thank you both for your great explanations, I understand clearly now. I am torn because as much as technical correctness is important I also think a beginner question is useless if it is not phrased in a way that a beginner would ask / search for. I have therefore left the incorrect phrasing in the title but have added a couple of pointers to the correct term in the question itself, I think that this is a good compromise. Thanks both again. – JShorthouse Jun 17 '20 at 13:06
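
The two uses of `static` discussed in the comments above can be illustrated with a minimal sketch. The names `call_count`, `next_id`, and `calls_made` are hypothetical and chosen only for this illustration.

```c
/* File scope: `static` gives the identifier internal linkage, so
   `call_count` can be named only inside this translation unit. */
static int call_count = 0;

/* Block scope: `static` gives the object static storage duration, so
   `next` keeps its value between calls. The identifier has no linkage. */
int next_id(void)
{
    static int next = 1;
    call_count++;
    return next++;
}

/* Any function in the same source file can use the file-scope object. */
int calls_made(void)
{
    return call_count;
}
```

Both objects live for the whole run of the program; what differs is where each name can be used.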

3 Answers


The requested properties can be accomplished as described below. (This is not a recommendation to do so.)

Define a base pointer and an array size:

static void *MyArrayPointer;
static size_t Columns;

When the array size is known, initialize them:

Columns = some value;
MyArrayPointer = malloc(Rows * Columns * sizeof(int));
if (!MyArrayPointer) { /* handle error */ }

Define a macro to serve as the array:

#define MyArray ((int (*)[Columns]) MyArrayPointer)

Once the above is complete, the array may be accessed as MyArray[i][j].

Note that variable length array support is optional in C11 and C17 (it was mandatory in C99); an implementation that lacks it defines __STDC_NO_VLA__. GCC and Clang support them. Given the example shown in the question, we presume variable length array support is available.

Also, I would be tempted to write the malloc call as:

MyArrayPointer = malloc(Rows * sizeof *MyArray);

This has the advantage of automatically adjusting the allocation in case the type used in MyArray ever changes.
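
Putting the pieces together, a minimal sketch of the whole approach might look like the following. The helper functions `MyArrayCreate`, `MyArrayFill`, `MyArrayGet`, and `MyArrayDestroy`, the `Rows` variable, and the overflow check are assumptions added for illustration; the sketch also assumes the compiler supports variably modified types.

```c
#include <stdint.h>
#include <stdlib.h>

static void  *MyArrayPointer;   /* base pointer, internal linkage */
static size_t Rows, Columns;    /* dimensions, set at run time */

/* Expands to the base pointer converted to "pointer to array of Columns int". */
#define MyArray ((int (*)[Columns]) MyArrayPointer)

/* Hypothetical helper: allocate the array. Returns 0 on success, -1 on failure.
   Calling it twice without MyArrayDestroy in between would leak the first block. */
int MyArrayCreate(size_t rows, size_t columns)
{
    /* Reject empty dimensions and sizes that would overflow size_t. */
    if (rows == 0 || columns == 0 || rows > SIZE_MAX / sizeof(int) / columns)
        return -1;
    void *p = malloc(rows * columns * sizeof(int));
    if (!p)
        return -1;
    MyArrayPointer = p;
    Rows = rows;
    Columns = columns;
    return 0;
}

/* Any function in this file can now use the [][] syntax. */
void MyArrayFill(void)
{
    for (size_t i = 0; i < Rows; i++)
        for (size_t j = 0; j < Columns; j++)
            MyArray[i][j] = (int) (i * Columns + j);
}

int MyArrayGet(size_t i, size_t j)
{
    return MyArray[i][j];
}

void MyArrayDestroy(void)
{
    free(MyArrayPointer);
    MyArrayPointer = NULL;
}
```

A caller would run `MyArrayCreate(rows, columns)` once the dimensions are known, check the result, and call `MyArrayDestroy()` when the array is no longer needed.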

Eric Postpischil

You can define the array as follows:

int **array;

And allocate it like this:

array = malloc(rows * sizeof (int *));
for (size_t i = 0; i < rows; i++) {  /* declare the loop index */
    array[i] = malloc(cols * sizeof(int));
}
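
For completeness, here is a sketch of the same approach with error checking and matching cleanup. The function names `alloc_rows` and `free_rows` are hypothetical, not part of any library.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate `rows` separate blocks of `cols` ints each.
   Returns NULL if any allocation fails, after releasing what was obtained. */
int **alloc_rows(size_t rows, size_t cols)
{
    int **array = malloc(rows * sizeof *array);
    if (!array)
        return NULL;
    for (size_t i = 0; i < rows; i++) {
        array[i] = malloc(cols * sizeof *array[i]);
        if (!array[i]) {
            while (i > 0)            /* undo the rows allocated so far */
                free(array[--i]);
            free(array);
            return NULL;
        }
    }
    return array;
}

/* Every row must be freed before the array of row pointers. */
void free_rows(int **array, size_t rows)
{
    if (!array)
        return;
    for (size_t i = 0; i < rows; i++)
        free(array[i]);
    free(array);
}
```

Note that this layout needs `rows + 1` allocations and the same number of `free` calls, which is part of the setup and teardown cost raised in the comments on this answer.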
dbush
  • Can this be accessed with `[][]` or does it require manual pointer maths? – JShorthouse Jun 15 '20 at 17:28
  • @JShorthouse You can use that syntax. What you technically have is an array of `int *`, each of which points to an array of `int`. – dbush Jun 15 '20 at 17:29
  • Nobody competent does this “double pointer” nonsense in production code. Pretty much nobody competent does this in personal/debugging/scratch code. It is bad for performance, wasteful, and unnecessary. While OP’s desire for both variable length and external linkage is not well satisfied by C, good code either uses manual indexing (`i*size+j`) or casts (publish a base pointer externally, convert to `int (*)[size]` everywhere it is used). Although OP ought to be dissuaded from using externally defined objects at all. – Eric Postpischil Jun 15 '20 at 17:30
  • You should use the `sizeof` operation in the argument first to ensure `size_t` arithmetic. – RobertS supports Monica Cellio Jun 15 '20 at 17:46
  • @EricPostpischil I thought that was a commonly used trick to at least accomplish 2D notation. Not pretty, but it does what it should. – RobertS supports Monica Cellio Jun 15 '20 at 17:52
  • @RobertSsupportsMonicaCellio: It is unfortunately common on Stack Overflow, which is a sad testament, but I have never seen it used in production code. It does not do what it should: Code should be reasonably efficient, and “double pointers” are not. They force pointer lookups, break processor lookahead, interfere with cache patterns, waste memory, and waste time setting up and tearing down. There is no reason to use them for regular arrays except laziness or ignorance. – Eric Postpischil Jun 15 '20 at 17:57
  • @EricPostpischil Does that mean you personally don't use pointers to pointers - I use this term to avoid confusion with `double*` - at all (except maybe as a parameter to change a pointer in a caller from a called function)? – RobertS supports Monica Cellio Jun 15 '20 at 18:21
  • @RobertSsupportsMonicaCellio: I use a pointer to a pointer where I need a pointer to a pointer, such as when a routine needs to change a pointer, so it needs a reference to that pointer. I do not use pointers to pointers to implement two-dimensional arrays. – Eric Postpischil Jun 15 '20 at 18:24
  • @EricPostpischil Good to know. Thank you. – RobertS supports Monica Cellio Jun 15 '20 at 20:19
  • @EricPostpischil Treating an `int *` as an `int (*)[size]` seems fishy. Given `int a[3][3]`, accessing `a[1][3]` is undefined behavior because you run past the bounds of one of the inner arrays even though you're still inside the larger array. A cast like that seems like something an aggressive optimizer could wreak havoc with. – dbush Jun 16 '20 at 22:37
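
The manual indexing alternative mentioned in the comments above (`i*size+j` over a single allocation) could be sketched roughly as follows. The names `grid`, `grid_cols`, `grid_create`, `grid_at`, and `grid_destroy` are hypothetical, and the sketch gives up the `[][]` syntax in exchange for needing neither a cast nor variable length array support.

```c
#include <stdint.h>
#include <stdlib.h>

static int   *grid;        /* one contiguous block, internal linkage */
static size_t grid_cols;   /* row length, needed to compute offsets */

/* Returns 0 on success, -1 on failure. Elements start out as zero. */
int grid_create(size_t rows, size_t cols)
{
    /* Reject empty dimensions and sizes that would overflow size_t. */
    if (rows == 0 || cols == 0 || rows > SIZE_MAX / sizeof(int) / cols)
        return -1;
    int *p = calloc(rows * cols, sizeof *p);
    if (!p)
        return -1;
    grid = p;
    grid_cols = cols;
    return 0;
}

/* Element (i, j) lives at offset i * grid_cols + j in row-major order. */
int *grid_at(size_t i, size_t j)
{
    return &grid[i * grid_cols + j];
}

void grid_destroy(void)
{
    free(grid);
    grid = NULL;
    grid_cols = 0;
}
```

An element is then read or written as `*grid_at(i, j)`.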

No, it's not possible to create an array like that. You cannot create VLAs at file scope, because objects defined there have static storage duration, and such objects need to have their sizes known at compile time.

What you can do is to use something like this:

int **arr;

int foo(void) // Or main, or any function
{
    arr = malloc(sizeof(*arr) * rows);
    for(int i=0; i<rows; i++)
        arr[i] = malloc(sizeof(*arr[0]) * cols);
    return 0;
}

Of course, rows and cols need to be declared and initialized, and you should also check whether the allocation failed. I'm omitting that here.

And yes, you can use []. The bracket operator is just syntactic sugar for pointer arithmetic. a[i] is the same as *(a+i).
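
As a small illustration of that equivalence, the sketch below compares the address produced by the bracket syntax with the one produced by explicit pointer arithmetic. The function names are hypothetical, and the second function assumes variable length array support because its parameter is a pointer to a row of `cols` ints.

```c
#include <stddef.h>

/* Pointer-to-pointer case: a[i] is *(a + i), so a[i][j] is *(*(a + i) + j).
   Returns 1 when both spellings name the same element. */
int same_element(int **a, size_t i, size_t j)
{
    return &a[i][j] == *(a + i) + j;
}

/* Pointer-to-array case: the identity is the same, but a + i now advances
   by a whole row of `cols` ints instead of by one pointer. */
int same_element_2d(size_t cols, int (*a)[cols], size_t i, size_t j)
{
    return &a[i][j] == *(a + i) + j;
}
```

Both functions return 1 for any valid indices; only the amount by which `a + i` moves differs between the two layouts.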

klutt