0

I want to initialize a large 2-dimensional array (say 1000x1000, though I'd like to go even larger) to all -1 in C++.

If my array were 1-dimensional, I know I could do:

int my_array[1000];
memset(my_array, -1, sizeof(my_array));

However, memset does not allow for initializing all the elements of an array to be another array. I know I could just make a 1-dimensional array of length 1000000, but for readability's sake I would prefer a 2-dimensional array. I could also just loop through the 2-dimensional array to set the values after initializing it to all 0, but this bit of code will be run many times in my program and I'm not sure how fast that would be. What's the best way of achieving this?

Edited to add minimal reproducible example:

int my_array[1000][1000];
// I want my_array[i][j] to be -1 for each i, j
kccu
  • 283
  • 2
  • 10
  • Please provide [mcve] - the declared 2D array and how would you like to initialize it. If you are concerned about speed, the 2D array can be stored inside the 1D and indexed appropriately. – Quimby Jul 16 '19 at 15:23
  • Loop throu two dimensions. As long as you don't do anything fancy, your optimizer should optimize this code just fine. – Radosław Cybulski Jul 16 '19 at 15:23
  • Note that 1,000,000 `int`s will exceed the available Automatic Storage on a lot of systems. – user4581301 Jul 16 '19 at 15:28

3 Answers3

1

With GNU GCC you can:

int my_array[1000][1000] = { [0 .. 999] = { [0 .. 999] = -1, }, };

With any other compiler you need to:

int my_array[1000][1000] = { { -1, -1, -1, .. repeat -1 1000 times }, ... repeat { } 1000 times ... };

Side note: The following is doing assignment, not initialization:

int my_array[1000][1000];
for (auto&& i : my_array) 
    for (auto&& j : i)
        j = -1;

Is there any real difference between doing what you wrote and doing for(int i=0; i<1000; i++){ for(int j=0; j<1000; j++){ my_array[i][j]=-1; } }?

It depends. If you have a bad compiler, you compile without optimization, etc., then yes. Most probably, no. Anyway, don't use indexes. I believe the range based for loop in this case roughly translates to something like this:

for (int (*i)[1000]  = my_array; i < my_array + 1000; ++i)
   for (int *j = *i; j < *i + 1000; ++j)
        *j = -1;

Side note: Ach! It hurts to calculate my_array + 1000 and *i + 1000 each loop. That's like 3 operations done each loop. This cpu time wasted! It can be easily optimized to:

for (int (*i)[1000]  = my_array, (*max_i)[1000] = my_array + 10000; i < max_i; ++i)
   for (int *j = *i, *max_j = *i + 1000; j < max_j; ++j)
        *j = -1;

The my_array[i][j] used in your loop, translates into *(*(my_array + i) + j) (see aarray subscript operator). That from pointer arithmetics is equal to *(*((uintptr_t)my_array + i * sizeof(int**)) + j * sizeof(int*)). Counting operations, my_array[i][j] is behind the scenes doing multiplication, addition, dereference, multiplication, addition, derefence - like six operations. (When using bad or non-optimizing compiler), your version could be way slower.

That said, a good compiler should optimize each version to the same code, as shown here.

And are either of these significantly slower than just initializing it explicitly by typing a million -1's?

I believe assigning each array element (in this particular case of elements having the easy to optimize type int) will be as fast or slower then initialization. It really depends on your particular compiler and on your architecture. A bad compiler can do very slow version of iterating over array elements, so it would take forever. On the other hand a static initialization can embed the values in your program, so your program size will increase by sizeof(int) * 1000 * 1000, and during program startup is will do plain memcpy when initializing static regions for your program. So, when compared to a properly optimized loop with assignment, you will not gain nothing in terms of speed and loose tons of read-only memory.

KamilCuk
  • 120,984
  • 8
  • 59
  • 111
  • Thanks. I've never seen `auto` before. Is there any real difference between doing what you wrote and doing `for(int i=0; i<1000; i++){ for(int j=0; j<1000; j++){ my_array[i][j]=-1; } }`? And are either of these significantly slower than just initializing it explicitly by typing a million `-1`'s? – kccu Jul 16 '19 at 15:39
1

I am a little bit surprised.

And I know, it is C++. And, I would never use plain C-Style arrays.

And therefore the accepted answer is maybe the best.

But, if we come back to the question

int my_array[1000];
memset(my_array, -1, sizeof(my_array));

and

int my_array[1000][1000];
// I want my_array[i][j] to be -1 for each i, j

Then the easiest and fastest solution is the same as the original assumption:

int my_array[1000][1000];
memset(my_array, -1, sizeof(my_array));

There is no difference. The compiler will even optimze this away and use fast assembler loop instructions.

Sizeof is smart enough. It will do the trick. And the memory is contiguous: So, it will work. Fast. Easy.

(A good compiler will do the same optimizations for the other solutions).

Please consider.

A M
  • 14,694
  • 5
  • 19
  • 44
0

If the array is static, it's placed in sequential memory (check this question). So char [1000][1000] is equal to char [1000000] (if your stack can hold that much).

If the array has been created with multidimensional new (say char(*x)[5] = new char[5][5]) then it's also contiguous.

If it's not (if you create it with dynamic allocations), then you can use the solutions found in my question to map a n-dimension array to a single one after you have memsetd it.

Michael Chourdakis
  • 10,345
  • 3
  • 42
  • 78
  • 1
    What do you mean by *If the array is static, it's placed in sequential memory.*? All arrays are in contiguous memory. – NathanOliver Jul 16 '19 at 15:21
  • He possibly meant possibility of something like this `int *array[1000]`, where every value is initialized to another array. – Radosław Cybulski Jul 16 '19 at 15:22
  • @NathanOliver, when you have a dynamic multidimensional array, you can't have it in contiguous memory. `char** a; a = new char[10] ; for(int i = 0 ; i < 10 ; i++) a[i] = new char[100];` - Not continuous memory. – Michael Chourdakis Jul 16 '19 at 15:22
  • Is it really guaranteed that `char[10][10]` is continuous? – Quimby Jul 16 '19 at 15:26
  • @Quimby according to [this](https://stackoverflow.com/questions/2565039/how-are-multi-dimensional-arrays-formatted-in-memory) – Michael Chourdakis Jul 16 '19 at 15:27
  • @MichaelChourdakis That is not a dynamic multidimensional array. That is just an array of pointers. A dynamic multidimensional array looks like [this](http://coliru.stacked-crooked.com/a/29d5a3ad9ad34b56) and is contiguous. – NathanOliver Jul 16 '19 at 15:28
  • @MichaelChourdakis Not really a guarantee, I did not find anything about this in the standard. I don't really expect a compiler to do otherwise, but relying on whether an array is static or not seems dangerous. – Quimby Jul 16 '19 at 15:34
  • `An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type. The element type shall be complete whenever the array type is specified.` C Standard, 6.2.5, paragraph 20 – Michael Chourdakis Jul 16 '19 at 15:35
  • @Quimby See [this](https://timsong-cpp.github.io/cppwp/dcl.array#1.sentence-6). That guarantees an array is contiguous and since it is than an array of arrays has all of the sub arrays in contiguous memory. – NathanOliver Jul 16 '19 at 15:36
  • 1
    @MichaelChourdakis Regarding your update: `char** x = new char[5][5]` is not legal. `new char[5][5]` returns a `char(*)[5]`, not a `char**`. – NathanOliver Jul 16 '19 at 15:38