Do multi-dimensional arrays cause any problems in C and/or C++?

Question

I know that this question seems a little bit hilarious at the first sight. But as I came across this question I´ve found a comment, of @BasileStarynkevitch, a C and C++ high-level user, in which he claimed that multidimensional arrays shall not be preferable to use, neither in C nor in C++:

Don't use multi-dimensional arrays in C++ (or in C).

Why? Why shouldn´t I use multi-dimensional arrays in C++ nor in C?

What does he meant with this statement?

Thereafter, another user replied on this comment:

Basile is right. It's possible to declare a 3D array in C/C++ but causes too many problems.

Which problems?

I use multi-dimensional arrays a lot and see no disadvantages in using them. In opposite, I think it has only advantages.

Are there any issues about using mulit-dimensional arrays I do not know about?
Can anyone explain me what they meant?

Ignore the comments. There is nothing wrong with multi-dimensional arrays. — Eric Postpischil, Feb 03 '20 at 15:12
They may talk about performance issues. If I remember correctly, doing computations on multidimensionnal array is generally slower than doing those on one dimension array — Missu, Feb 03 '20 at 15:14
It's in general easy to make mistakes when using C-style arrays. Now you take this "easy to make mistakes" to power of n-dimensions. All the typos when iterating over inner array (`i` for `j`). Much less obvious syntax when passing it to function (one size must be passed explicitly, the other as part of the array type...) when passing such array to function. In C++, you want `std::vector>` for easy and *safer* usage. Or `std::vector>` for better performance. Anyway, this question seems opinion-based. — Yksisarvinen, Feb 03 '20 at 15:16
Since any dynamic multi dimensional data structures will require a single allocation for performance (as opposed to nested dynamic data structures) I tend to prefer to use single dimensional arrays typically even in the static case, just so the code I work is always just consistent under the same assumptions of how the memory is allocated and passed around as pointers (manually indexing into it). If you rely something like multidimensional indexing on arrays like `a[4][3][2]` and later wish to swap this out for something dynamic, you'll be left with more refactoring work as a result. — Lemon Drop, Feb 03 '20 at 15:22
@Bob__ But, at least my thought of this statement, is that the container of `std::array` shall be incorporated in this "array" meaning because we can make also multi-dimensional arrays (`std::array` container) from `std::array` by using for example something like this: `std::array, 3> arr = {{{5, 8, 2}, {8, 3, 1}, {5, 3, 9}}};` — RobertS supports Monica Cellio, Feb 03 '20 at 15:36
Voting to close as opinion based. Any such question should ask "why prefer X over Y" rather than "why X is bad". If you ban multi-dimensional arrays (`int x[6][7][8]`) you should ask yourself what will you be using instead; `int x[6*7*8]`? `int ***x;`? `valarray x`? ... And a lot of the alternatives will have the same problems or worse. — Yakov Galka, Feb 03 '20 at 15:37
@ybungalobill So, in turn, you would classify these statements as incorrect and opinion-based? By the way, I personally do not ban them at all nor see a reason to do so. My question is especially because to find out whether there are any issues about using multi-dimensional arrays I did not know about or not. — RobertS supports Monica Cellio, Feb 03 '20 at 15:45
@Yksisarvinen But isn´t `std::vector>` or `std::vector>` at the end a kind of multi-dimensional array, too? My thought about these statements was that they include any kind of multi-dimensional cluster of objects of a certain type. I didn´t meant only C-style arrays. — RobertS supports Monica Cellio, Feb 04 '20 at 11:17

mfnx · Answer 1 · 2020-02-04T07:55:20.913

5

This is quite a broad (and interesting) performance related topic. We could discuss cache misses, cost of initialization of multi-dimensional arrays, vectorization, allocation of multidimensional std::array on the stack, allocation of multidimensional std::vector on the heap, access to the latter two, and so on... .

That said, if your program works fine with your multidimensional arrays, leave it the way it is, especially if your multidimensional arrays allow for more readability.

A performance related example:

Consider a std::vector which holds many std::vector<double>:

std::vector<std::vector<double>> v;

We know that each std::vector object inside v is allocated contiguously. Also, all the elements in a std::vector<double> in v are allocated contiguously. However, not all the double's present in v are in contiguous memory. So, depending on how you access those elements (how many times, in what order, ...), a std::vector of std::vector's can be very slow compared to a single std::vector<double> containing all the double's in contiguous memory.

Matrix libraries will typically store a 5x5 matrix in a plain array of size 25.

edited Feb 04 '20 at 07:55

answered Feb 03 '20 at 15:29

mfnx

2,894
1
12
28

2

*That said, if your program works fine with your multidimensional arrays, leave it the way it is, especially if your multidimensional arrays allow for more readability.* That probably should be the first sentence of this answer. It's almost never a good idea to sacrifice readability for anything. – Andrew Henle Feb 03 '20 at 16:18
@mfnx *a std::vector of std::vector's can be very slow compared to a single std::vector containing all the double's in contiguous memory.* - What would be your preferred alternative? How would you make it more efficient? What is your preferred alternative? – RobertS supports Monica Cellio Feb 03 '20 at 17:49
@RobertSupportsMonicaCellio Depends. Imagine a vector of N triplets of doubles vs a vector of 3*N doubles (typical example in scientific computing). Loop over the triplets and use them. Repeat it 1M times and benchmark it. – mfnx Feb 03 '20 at 18:27
1

@RobertSupportsMonicaCellio My preference goes to readability. If performance becomes an issue, then you may look for optimizations. But how to go abou this depends on your problem which you didn't explain. Therefore, imo, this question is too broad (not "opinion-based" though as their are many aspects which do not depend on opinions, imo ;)). – mfnx Feb 04 '20 at 07:38

score 4 · Accepted Answer · edited Jun 20 '20 at 09:12

4

You cannot answer this question for C and C++ at once, because there is a fundamental difference between these two languages and their handling of multidimensional arrays. So this answer contains two parts:

C++

Multidimensional arrays are pretty useless in C++ because you cannot allocate them with dynamic sizes. The sizes of all dimensions except the outermost one must be compile time constants. In virtually all the usecases for multidimensional arrays I have encountered, the size parameters are simply not known at compile time. Because they come from the dimensions of an image file, or some simulation parameter, etc.

There might be some special cases where the dimensions are actually known at compile time, and in these cases, there is no issue with using multidimensional arrays in C++. In all the other cases, you'll need to either use pointer arrays (tedious to set up), nested std::vector<std::vector<std::vector<...>>>, or a 1D array with manual index computation (error prone).

C

C allows for true multidimensional arrays with dynamic sizes since C99. This is called VLA, and it allows you to create fully dynamically sized multidimensional arrays both on the stack and the heap.

However, there are two catches:

You can pass a multidimensional VLA to a function, but you can't return it. If you want to pass multidimensional data out of a function, you must return it by reference.
```
void foo(int width, int height, int (*data)[width]);  //works
//int (*bar(int width, int height))[width];  //does not work
```
You can have pointers to multidimensional arrays in variables, and you can pass them to functions, but you cannot store them in structs.
```
struct foo {
    int width, height;
    //int (*data)[width];  //does not work
};
```

Both problems can be worked around (pass by reference to return a multidimensional array, and storing the pointer as a void* in the struct), but it's not trivial. And since its not a heavily used feature, only very few people know how to do it right.

Compile time array sizes

Both C and C++ allow you to use multidimensional arrays with dimensions known at compile time. These do not have the drawbacks listed above.

But their usefulness is reduced greatly: There are just so many cases where you would want to use a multidimensional array, and where you do not have the ghost of a chance to know the involved sizes at compile time. An example is image processing: You don't know the dimensions of the image before you have opened the image file. Likewise with any physics simulation: You do not know how large your working domain is until your program has loaded its configuration files. Etc.

So, in order to be useful, multidimensional arrays must support dynamic sizes imho.

edited Jun 20 '20 at 09:12

Community

1
1

answered Feb 03 '20 at 15:26

cmaster - reinstate monica

38,891
9
62
106

4

`You cannot use multidimensional arrays because you cannot allocate them with dynamic sizes.` - Not sure if this is misworded or not, but multidimensional arrays are absolutely real in C++. Maybe change to `You *should not* use`? – alteredinstance Feb 03 '20 at 15:40
@alteredinstance The simple declaration `int width = 7; int (*data)[width] = nullptr;` is not valid C++ of any standard. To be valid C++, `width` must be a compile time constant. – cmaster - reinstate monica Feb 03 '20 at 16:00
@cmaster-reinstatemonica For C++: *You cannot use multidimensional arrays because you cannot allocate them with dynamic sizes. The sizes of all dimensions except the outermost one must be compile time constants. This makes the multidimensional arrays syntax quite worthless in C++.* What about instantiating multidimensional arrays with constant sizes? I can do so with, f.e.: `std::array, 3> arr = {{{5, 8, 2}, {8, 3, 1}, {5, 3, 9}}};`. Why is it worthless to create such multidimensional arrays in C++? – RobertS supports Monica Cellio Feb 03 '20 at 16:08
@RobertSsupportsMonicaCellio You may want to clarify if you are interested in dinamically allocated arrays or compile time constant sized ones, the question is already enough broad as it is, this could help reducing the scope. Also, again, `std::array` is not a plain c-array. – Bob__ Feb 03 '20 at 16:12
@cmaster-reinstatemonica For C: Why dynamically sized here either? Does the problem which the qouted comment point to rely on dynamic sized multidimensional arrays only? I ask this just to follow and understand your answer. – RobertS supports Monica Cellio Feb 03 '20 at 16:16
2

@RobertSsupportsMonicaCellio That's worthless because in virtually all relevant cases, the required dimensions are simply not known at compile time. Imagine doing image processing: Do you really know the sizes of the images you'll be processing at compile time? For me that screams "DYNAMIC SIZES!". I can count the number of cases where I could use a buffer of compile time size on one hand, and most certainly there was not a single case where that compile time sized buffer was two dimensional. – cmaster - reinstate monica Feb 03 '20 at 16:22
@Bob__ I do not know where to dig. All informations I got is that multidimensional arrays shall not be used for any reason(s). I´m looking to find this/these reason(s). I know this is inprecise, broad and I feel a little ashamed to do so but what shall I do else? I´m only looking for the reason behind this statement, nothing more, nothing less. – RobertS supports Monica Cellio Feb 03 '20 at 16:26
@cmaster-reinstatemonica Thank you very much for the explanation. – RobertS supports Monica Cellio Feb 03 '20 at 16:31
1

@RobertSsupportsMonicaCellio I've added a paragraph on the compile time array size case now, which contains my (subjective) view on why they are useless. Hope that serves to clarify my thoughts on the matter. – cmaster - reinstate monica Feb 03 '20 at 16:32
@cmaster-reinstatemonica Could you change "*You **cannot** use multidimensional arrays because you cannot allocate them with dynamic sizes*" to "*You **shall not** use multidimensional arrays because you cannot allocate them with dynamic sizes*"- And what would be your recommended alternatives to a multi-dimensional array for each language, C and C++? Just a huge array, a structure or simply just single objects on their own? – RobertS supports Monica Cellio Feb 03 '20 at 18:07
2

@RobertSsupportsMonicaCellio I'm actually more happy with "cannot": "Shall not" would sound like I'm commanding someone not to use multidimensional arrays. Which I absolutely don't want to do. "Cannot", on the other hand, means that there is a fundamental, unresolveable problem that stands in the way. That problem is the compile time constant array sizes. You could argue that I should be clearer in saying that there are some corner cases where this problem does not entirely spoil the use of the construct, but that's already implicit in "because you cannot allocate them with dynamic sizes". – cmaster - reinstate monica Feb 03 '20 at 22:19
1

@RobertSsupportsMonicaCellio As to what I would use instead: In C, I would indeed use multidimensional arrays and work around the two problems I outlined. In C++, I would use 1D arrays and do the index computation by hand in the style `data[(z*depth + y)*height + x]`. That will compile down to the same code as the `data[z][y][x]` expression in C, it's just a bit harder to read and to get right. I would generally not resort to pointer arrays because they are really tedious to set up and tear down, and are not as performant as the index computation approach anyways. – cmaster - reinstate monica Feb 03 '20 at 22:27
@RobertSsupportsMonicaCellio Ups, that should have been `data[(z*height + y)*width + x]`. Which only goes to show that doing this index computation by hand *is* error prone... – cmaster - reinstate monica Feb 04 '20 at 10:28
1

@cmaster-reinstatemonica I would like to accept this answer but the sentence of `You cannot use multidimensional arrays because you cannot allocate them with dynamic sizes.` is still wrong; doesn´t matter how you think of it. **1.** Although you later state: *Both C and C++ allow you to use multidimensional arrays with dimensions known at compile* and this is kind of contrary to each other; this sentence on its own keeps wrong. I **can** use **static** multi-dimensional arrays stored on the stack in C++, even if its *quite worthless* with values known at compile-time. – RobertS supports Monica Cellio Feb 05 '20 at 10:00
1

**2.** I`m be able to also create dynamic multi-dimensional arrays on the heap with values known at compile-time; doesn´t matter of that would be even more "worthless" or idiotic to do, but I **can** do that without a problem. - I **cannot** - I feel the irony - accept this answer with that sentence, simply because it is not true. Only if you reform, omit the sentence or reform the whole C++ part, I can and want to accept this answer because apart from this sentence and the dedicated C++ part, in my opinion this answer is the most useful one because it shows actual problems with MD arrays. – RobertS supports Monica Cellio Feb 05 '20 at 10:18
3

@RobertSsupportsMonicaCellio Ok, I've rephrased the C++ part now. I hope that it's now acceptable for everyone. (Sorry for being so stiff-necked.) – cmaster - reinstate monica Feb 06 '20 at 09:22
@ggb667 In C++, when your sizes are dynamic, so must your allocation be. – Caleth Feb 20 '20 at 13:35

ggb667 · Answer 3 · 2020-02-20T13:34:44.533

2

Well the "problems" referred to are not using the structure properly, walking off the end of one or another of the dimensions of the array. If you know what you are doing and code carefully it will work perfectly.

I have often used multidimensional arrays for complex matrix manipulations in C and C++. It comes up very frequently in signals analysis and signal detection as well as high performance libraries for analyzing geometries in simulations. I did not even consider dynamic array allocation as part of the question. Even then typically sized arrays for certain bound problems with a reset function could save memory and speed performance for complex analysis. One could use a cache for smaller matrix manipulations in a library and a more complex C++ OO treatment for larger dynamic allocations on a per-problem basis.

edited Feb 20 '20 at 13:34

answered Feb 03 '20 at 15:12

ggb667

1,881
2
20
44

"Harder to use wrong" is a sensible reason to choose something over a multidimensional array – Caleth Feb 03 '20 at 17:37
Overly complex and memory hog are also things however. Learning to be precise and code carefully are also worthy goals, and depending on the application might be required. A blanket "don't do it" for something simple like array manipulations seems wrongheaded to me as a mentor and teacher of new programmers. – ggb667 Feb 20 '20 at 13:26

alteredinstance · Answer 4 · 2020-02-03T15:31:55.057

2

As with most data structures, there is a "right" time to use them, and a "wrong" time. This is largely subjective, but for the purposes of this question let's just assume you're using a 2D array in a place where it wouldn't make sense.

That said, I think there are two notable reasons to avoid using multidimensional arrays in C++, and they mainly arise based on the use cases of the array. Namely:

1. Slow(er) Memory Traversal

A 2-Dimensional array such as i[j][k] can be accessed contiguously, but the computer must spend extra time computing the address of each element - more than it would spend on a 1D array. More importantly, iterators lose their usability in multidimensional arrays, forcing you to use the [j][k] notation, which is slower. One main advantage of simple arrays is their ability to sequentially access all members. This is partially lost with a 2+D array.

2. Inflexible size

This is just an issue with arrays in general, but resizing a multidimensional array becomes much more complex with 2, 3, or more dimensions. If one dimension needs to change size, the entire structure has to be copied over. If your application needs to be resized, its best to use some structure besides a multidimensional array.

Again these are use-case based, but those are both significant issues that could arise by using multidimensional arrays. In both cases above, there are other solutions available that would be better choices than a multi-dimensional array.

edited Feb 03 '20 at 15:31

answered Feb 03 '20 at 15:26

alteredinstance

587
3
17

4

AFAICT with optimizations enabled, the compiler is clever enough to compile a nested-for-loop without having to do any per-item multiplications; so I doubt the [j][k] notation will be measurable slower than a 1D-array's [j] notation. You do need to be careful to nest the for-loops in the correct order so that subsequent memory accesses are adjacent to each other in RAM, though, or you risk introducing a lot of unnecessary cache misses. – Jeremy Friesner Feb 03 '20 at 15:37
@JeremyFriesner For a full-array traversal, you're probably correct. I'd think the real issues arise when the program wants to do random accesses, for instance where [j][k] are not compile-time constants. If j and k are not known at compile time, I'd assume the program will have to perform the address calculation on the fly. – alteredinstance Feb 03 '20 at 15:44
agreed, but that is also true when using a 1D array (you'd just have to do the calculations yourself explicitly: `float val = the1DArray[rowIdx*rowSize+colIdx];` – Jeremy Friesner Feb 03 '20 at 16:46
@JeremyFriesner I certainly agree, that's why it's considered best practice to use iterators with arrays (and not i[j][k]) - however - doing iterators with multidimensional arrays becomes an impracticality or even an impossibility, which is my main reasoning for it being a "slower" data structure – alteredinstance Feb 03 '20 at 17:07
@alteredinstance *In both cases above, there are other solutions available that would be better choices than a multi-dimensional array.* How would you do it else better? What is your recommend alternative to an multi-dimensional array? – RobertS supports Monica Cellio Feb 03 '20 at 17:51

score 1 · Answer 5 · answered Feb 03 '20 at 17:31

The statements are widely applicable, but not universal. If you have static bounds, it's fine.

In C++, if you want dynamic bounds, you can't have a single contiguous allocation, because the dimensions are part of the type. Even if you don't care for a contiguous allocation, you have to be extra careful, especially if you wish to resize a dimension.

Much simpler is to have a single dimension in some container that will manage the allocation, and a multidimensional view

Given:

std::size_t N, M, L;
std::cin >> N >> M >> L;

Compare:

int *** arr = new int**[N];
std::generate_n(arr, N, [M, L]()
{ 
    int ** sub = new int*[M];
    std::generate_n(sub, M, [L](){ return new int[L]; });
    return sub;
});

// use arr

std::for_each_n(arr, N, [M](int** sub)
{ 
    std::for_each_n(sub, M, [](int* subsub){ delete[] subsub; });
    delete[] sub;
});
delete[] arr;

With:

std::vector<int> vec(N * M * L);
gsl::multi_span arr(vec.data(), gsl::strided_bounds<3>({ N, M, L }));

// use arr

Do multi-dimensional arrays cause any problems in C and/or C++?

5 Answers5

C++

C

Compile time array sizes