Is there any way to initialize a whole dynamic array in O(1) time?
Is there something similar to `bool a[10] = {false}` in the case of a static array?

-
The initialization of your `a` takes O(N) time. – Jonathan Leffler Jun 16 '17 at 01:53
-
What do you mean by "dynamic"? `malloc` or `new`? GCC extension like `bool a[var];`? Furthermore, pick C or C++, as obviously the answer will differ. And finally, O(1) is impossible without statically preallocating enough memory for your dynamic array to point to. – Ken Y-N Jun 16 '17 at 01:53
-
`bool a[123456] = false` is "O(log (n)) syntax". :) – Kaz Jun 16 '17 at 02:11
-
@Kaz: `bool a[123456] = false` is a `nop`, since it generates a compilation error. – Michaël Roy Jun 16 '17 at 02:35
-
@KenYN `bool a[var];` is standard C and not a gcc extension. – Ajay Brahmakshatriya Jun 16 '17 at 03:03
-
Bottom line: if the number of bytes to initialize is larger than the processor can hold in one operation, it takes time to initialize. – Andre Jun 16 '17 at 08:22
5 Answers
For a dynamic array, each element you want to set must be individually considered by the CPU, so the time complexity is O(N).
Or is it?
But that's not really how big-oh works. It is also not really how CPUs work.
Big-oh notation concerns itself with the asymptotic behavior of a system as the number of elements grows towards infinity. Which is to say: we should take it with a grain of salt when we apply it to small numbers of elements!
CPUs also behave differently for small versus large numbers of elements because they have different cache levels which imply different access latencies. And you see this all the time in designing high-performance algorithms. See, for instance, this page describing how to optimize matrix multiplication: the section on "Blocking to maintain performance" is all about rearranging computations so they stay in the CPU's cache.
Now, let's talk about your question more specifically.
Consider the handy chart of Latency Numbers Every Computer Scientist Should Know (source).
The salient point here is that a (random) main memory reference costs about 100ns, whereas references to the L1 cache cost 0.5ns and to the L2 cache cost 7ns. Due to caching effects, reading sequentially from RAM gives a significant speed boost.
This means that, all else being equal, you can read 200 numbers from L1, or 14 numbers from L2, in the time it takes to read a single number from RAM.
And this gets us into cost models.
Consider initializing two dynamic arrays like so:
std::vector<int> a(20,1);
std::vector<int> b(21,1);
From the foregoing, we expect that the additional element takes ~0.5ns to deal with (since the whole array fits into the cache) whereas storing the array to memory takes 100ns. That is, the marginal cost of adding an additional element is negligible. In fact, the marginal cost of adding even many elements would be negligible (how many depends on the processor).
Let's apply these ideas. Consider these two statements.
int m = array1[20];
std::vector<int> a(900,1);
If you're thinking about this from an O(1) vs. O(N) perspective, you might think something silly, like "the second statement will take 900x longer than the first". A more sophisticated thought you might have is that "the hidden coefficients of the O(N) second statement may be small, so it is difficult to know which will be faster".
With some knowledge of caching you might instead say to yourself "for these small N values an asymptotic analysis is not appropriate". You might further think "these statements may take the same time" or "the second statement may run faster than the first due to caching effects".
An Experiment
We can use a simple experiment to demonstrate this.
The following program initializes arrays of different lengths:
#include <vector>
#include <iostream>
#include <chrono>
#include <string>

int main(int argc, char **argv){
  const int len = std::stoi(std::string(argv[1]));
  auto begin = std::chrono::high_resolution_clock::now();
  for(int i=0;i<10000;i++){
    std::vector<int> a(len,1); //Construct and destroy the array 10,000 times
    (void)a;                   //Suppress unused-variable warnings
  }
  auto end = std::chrono::high_resolution_clock::now();
  std::cout << std::chrono::duration_cast<std::chrono::nanoseconds>(end-begin).count() << std::endl; //Nanoseconds
}
The program runs many initializations of the dynamic array to average over fluctuations due to other programs running on the machine.
We run this program many times:
for n in {1..100}; do ./a.out 1; done | awk '{print "1 "$1}' | xclip
This averages out differences in the program's start-up and the machine's memory state due to other programs running.
On my Intel(R) Core(TM) i5 CPU M480 @ 2.67GHz with a 32K L1 cache, 256K L2 cache, and 3072K L3 cache, the result is this:
Note that the increase in time for small arrays is sublinear (when combined with the later behavior on larger arrays) through about 1000 elements. This is not O(N) behavior. After this, adding 10x more elements leads to about 10x more time consumed. This is O(N) behavior.
Trying the same test on my Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz with a 32K L1 cache, 256K L2 cache, and 30720K L3 cache yielded very similar results.
Bottom Line
Array initialization is O(N), since the work grows with the number of elements. However, in practice, due to caching, the cost does not rise linearly for small N. Therefore, an "O(1)" command may take the same time as an "O(N)" command, depending on the magnitude of N.

-
In C++, constructors for the initialized data segment are run before entering main, so the initialization of static arrays is not really done by the compiler. – Michaël Roy Jun 16 '17 at 03:34
-
Remember, the O(N) — Big O — notation is for asymptotic time as the size of N grows towards infinity. – Jonathan Leffler Jun 16 '17 at 06:48
-
@JonathanLeffler: Noted. I tried to make it clear that I was considering smaller arrays, and now make this more explicit. Thank you for stressing this. – Richard Jun 16 '17 at 07:58
No, there is not, not in C++, nor in C, nor in any language or CPU I know of. But it can be done with one line of code.
In C, for an array of char:
char a[N];
memset(a, '1', N); // sets every byte; only works for element types 1 byte long
In C++:
std::vector<T> v(N);                    // the vector must already hold N elements
std::fill(v.begin(), v.end(), value);
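As a small, compilable illustration of both one-liners (the sizes and fill values below are my own example choices, not from the answer):

#include <cstring>   // std::memset
#include <algorithm> // std::fill
#include <vector>
#include <iostream>

int main(){
  const std::size_t N = 8;

  char a[N];
  std::memset(a, '1', N);                // every byte becomes the character '1'

  std::vector<int> v(N);                 // v must already hold N elements
  std::fill(v.begin(), v.end(), 42);     // overwrite them all with 42

  std::cout << a[0] << ' ' << v[0] << std::endl; // prints "1 42"
  return 0;
}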

-
I have to implement a function in O(n) (to fulfill a specification in a university project), where n is the number of elements of a BST. This function needed to use a priority queue (heap) that had previously been implemented; the maximum number of different priorities was bounded by n, but the values those priorities could take were greater than n. When creating the queue, an array of booleans was set to false (with the max priority value as the size) to determine in O(1) whether there was an element associated with priority x in the queue (constrained by the project specification). – Donato Jun 16 '17 at 02:53
-
My problem was this: when creating the queue, since the max priority value was greater than n, initializing the boolean array exceeded the time limit that was set and the tests failed. I do not know why, but when using memset instead of a for loop to initialize the array everything worked perfectly. What could be the cause? – Donato Jun 16 '17 at 02:53
-
Is this in C++ or in plain C? If in C++, it could be that the bug is still there. memset famously does not check bounds, and if your array is not of the correct size, you may have overwritten some other data, and a worse bug may pop up at any random time. – Michaël Roy Jun 16 '17 at 03:10
-
It's C++. What do you mean, the array is not of the correct length? The parameter passed to memset is the length of the boolean array. I already compiled and tested it and everything worked well; I passed all the tests. But if we assume that memset is O(n) like the for loop, why does it work? – Donato Jun 16 '17 at 03:15
-
It's very odd and unusual that 2 different ways of doing the same thing produce different results. I have to point out that memset is considered more unsafe than using a loop in C++. Have you tried std::fill? You can use the notation `std::fill(&a[0], &a[N], true);` if you are using an array that was created with new or malloc; it's defined in `<algorithm>`. Use standard begin() and end() if it's a standard container. – Michaël Roy Jun 16 '17 at 03:24
-
In my experience, this can only mean that memset is hiding the bug, not fixing it. I do not recommend you use it until you find the bug. What kind of error do you have? An access error? An out-of-bounds exception? – Michaël Roy Jun 16 '17 at 03:41
-
The error is that the maximum execution time set in the makefile is exceeded. – Donato Jun 16 '17 at 03:43
-
It's difficult for me to help you, since I've never seen the code... What kind of containers do you use? std::vector, or a list of some sort? Are you certain that your code is thread-safe? I have a feeling that your bools may not be 1 byte long, or that they are not in a contiguous vector. From what I understand, your algorithm is proven good at least in the case where max priority <= n, is that right? Were you holding the size of the bool array in a different variable than in the memset case? These are questions I'd check. Sorry I can't really help you more than that... – Michaël Roy Jun 16 '17 at 04:10
There is no way to initialize an array in less than O(n) time, by the very definition of big-O: the CPU needs to fill the array of n elements with n zeros.
When you declare an array `bool a[10] = {false}` in a static or global context, the CPU still spends O(n) time filling it with zeros, but it happens at the time the program is loaded. Similarly, declaring this array in the automatic context (as a local array variable) results in CPU cycles being spent upon entering the function where the array is declared.
Finally, you can value-initialize a dynamically allocated array using this syntax:
int *array = new int[ARRAY_SIZE](); // Note the parentheses
Again, this is done in O(n) time upon allocation.
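As a small illustrative snippet (my own, not from the answer) contrasting leaving a dynamically allocated array uninitialized with value-initializing it:

#include <iostream>

int main(){
  const int ARRAY_SIZE = 5;

  int *raw  = new int[ARRAY_SIZE];    // default-initialized: element values are indeterminate
  int *zero = new int[ARRAY_SIZE]();  // value-initialized: every element is 0, at O(n) cost

  std::cout << zero[3] << std::endl;  // guaranteed to print 0
  // raw[3] is indeterminate here; it must be written before it is read

  delete[] raw;
  delete[] zero;
  return 0;
}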

-
Fixed-size arrays may be allocated and initialised at compile time, so load-time cost could also be argued to be O(1) as you are effectively just setting a pointer to loaded data. – Ken Y-N Jun 16 '17 at 01:59
-
Nothing in the initialized data segment is initialized at compile time. Constructors of static objects need to execute before entering main(). Similarly, the uninitialized data segment is filled with 0. Nothing is really free. – Michaël Roy Jun 16 '17 at 10:15
Define: `char a[100];` tells the compiler there is an array of type char and size 100 and reserves space for it, but (as a local variable) the value of each element is indeterminate until you write to it.
Initialize: `char b[100] = {};` or `char b[100] = {0};` reserves the space and sets every element to zero.
If you want to set the array to a starting value yourself, you need to write it manually: `char c[100]; for(int i = 0; i < 100; i++){ c[i] = 0; }`.
So reserving the space is O(1); setting every element to a value takes longer.
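A minimal, compilable sketch of the same point (my own illustration, not part of the original answer): the `= {}` form guarantees zeros, while a plain definition does not.

#include <cassert>

int main(){
  char a[100];        // space reserved; contents indeterminate until written
  char b[100] = {};   // space reserved and every element zero-initialized

  for(int i = 0; i < 100; i++){
    assert(b[i] == 0);   // always holds for b
  }
  // a[i] must be assigned before it is read
  a[0] = 'x';
  assert(a[0] == 'x');
  return 0;
}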

You can actually do it in O(1) time
I'll explain the simpler solution, while you should read about the better one here,
and see my cpp implementation of the latter.
The next (simpler) algorithm can also be found here.
You want to initialize an array A of size n. You'll need to allocate two more arrays - from and to (arrays of unsigned ints, of size n), and use two more variables - one to save the "default value", let's call it def, and another one called b.
initialization (or the fill operation) is just setting b=0, and changing def if needed.
The cell i is considered initialized if j = from[i] satisfies j < b and to[j] == i.
This is called a chain (from[i] == j and to[j] == i, while j < b).
Read by checking whether the cell is chained: if so, return A[i]; otherwise return the default def.
Write by first checking whether the cell is chained. If it isn't, chain it:
from[i] = b;
to[b] = i;
b++;
Then just write to A[i].
Why does it work?
After fill, all cells are uninitialized, because b = 0 (from[i] >= b for every i, so no cell can be chained).
An uninitialized cell can't accidentally form a chain: to[j] (for j < b) would have to equal i, but all of those to[j] entries already point to cells that were deliberately written, i.e. to already-initialized cells.
Implementation of that can be found in the link above.
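For illustration only, here is a minimal C++ sketch of the simpler from/to/b scheme described above (the class name and structure are my own, not the linked implementation; for brevity it uses std::vector for the backing storage, whose construction is itself O(n), whereas a true O(1) fill would allocate raw, uninitialized memory):

#include <cstddef>
#include <iostream>
#include <vector>

// Sketch of the simpler scheme: O(1) fill, O(1) read, O(1) write,
// at the cost of two extra bookkeeping arrays of size n.
template <typename T>
class FillArray {
public:
  explicit FillArray(std::size_t n) : A(n), from(n), to(n), b(0), def() {}

  // fill: every cell becomes "uninitialized"; reads now return the new default.
  void fill(const T& value) { def = value; b = 0; }

  // A cell i is initialized iff it is chained: from[i] < b and to[from[i]] == i.
  bool chained(std::size_t i) const { return from[i] < b && to[from[i]] == i; }

  T read(std::size_t i) const { return chained(i) ? A[i] : def; }

  void write(std::size_t i, const T& value) {
    if (!chained(i)) {   // first write to this cell since the last fill: chain it
      from[i] = b;
      to[b] = i;
      ++b;
    }
    A[i] = value;
  }

private:
  std::vector<T> A;                   // the payload array
  std::vector<std::size_t> from, to;  // the two extra arrays
  std::size_t b;                      // number of chained (initialized) cells
  T def;                              // the default value set by fill
};

int main() {
  FillArray<int> a(1000000);
  a.fill(7);                          // O(1): every cell now reads as 7
  a.write(42, 5);
  std::cout << a.read(42) << ' ' << a.read(43) << std::endl; // prints "5 7"
  return 0;
}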
This solution has a big flaw - the 2n extra memory words.
It's a major flaw since we only use this algorithm on big arrays (small ones might be initialized faster by memset or a for loop).
There is an improved solution that does the same O(1) fill/read/write with only O(1) extra space (explained in the article above), and can also be improved to using only 1 extra bit (as my implementation does).
