Optimal way to append to numpy array

Question

I have a numpy array and I can simply append an item to it using append, like this:

numpy.append(myarray, 1)

In this case I just appended the integer 1.

But is this the quickest way to append to an array? My array is very long, with tens of thousands of elements.

Or is it better to index the array and assign it directly? Like this:

myarray[123] = 1

Does this answer your question? [Fastest way to grow a numpy numeric array](https://stackoverflow.com/questions/7133885/fastest-way-to-grow-a-numpy-numeric-array) — user202729, Oct 25 '21 at 00:55

Roger Fan · Accepted Answer · 2014-09-03T17:12:25.050

Appending to numpy arrays is very inefficient. This is because np.append never grows an array in place: every call allocates memory for a new array and copies all of the existing elements into it. Depending on the application, there are much better strategies.
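A quick way to see this copying behavior for yourself is the minimal sketch below. The variable names are only illustrative.

```python
import numpy as np

original = np.array([1, 2, 3])
appended = np.append(original, 4)  # returns a new array; original is untouched

print(original.shape)                        # (3,)
print(appended.shape)                        # (4,)
print(np.shares_memory(original, appended))  # False: the data was copied
```

Because the input array is never modified, the return value has to be assigned to a name if you want to keep it.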

If you know the length in advance, it is best to pre-allocate the array using a function like np.ones, np.zeros, or np.empty.

import numpy as np

desired_length = 500
results = np.empty(desired_length)  # uninitialized; every slot is filled below
for i in range(desired_length):
    results[i] = i**2

If you don't know the length, it's probably more efficient to keep your results in a regular list and convert it to an array afterwards.

results = []
while condition:
    a = do_stuff()
    results.append(a)
results = np.array(results)

Here are some timings on my computer.

def pre_allocate():
    results = np.empty(5000)
    for i in range(5000):
        results[i] = i**2
    return results

def list_append():
    results = []
    for i in range(5000):
        results.append(i**2)
    return np.array(results)

def numpy_append():
    results = np.array([])
    for i in range(5000):
        results = np.append(results, i**2)  # np.append returns a new array
    return results

%timeit pre_allocate()
# 100 loops, best of 3: 2.42 ms per loop

%timeit list_append()
# 100 loops, best of 3: 2.5 ms per loop

%timeit numpy_append()
# 10 loops, best of 3: 48.4 ms per loop

So you can see that both pre-allocating and using a list then converting are much faster than np.append, by roughly a factor of 20 in these timings.
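If you want to reproduce the comparison outside IPython, where %timeit is not available, something like the following sketch with the standard timeit module should work. The n parameter and the repeat counts are my own choices, not part of the timings above. Note that np.append returns a new array, so this version assigns the result back to results on every iteration. Absolute numbers will differ by machine and numpy version.

```python
import timeit

import numpy as np

def pre_allocate(n=5000):
    results = np.empty(n)
    for i in range(n):
        results[i] = i**2
    return results

def list_append(n=5000):
    results = []
    for i in range(n):
        results.append(i**2)
    return np.array(results)

def numpy_append(n=5000):
    results = np.array([])
    for i in range(n):
        results = np.append(results, i**2)  # allocates and copies every time
    return results

for func in (pre_allocate, list_append, numpy_append):
    # Best of 3 repeats, 3 calls each, reported as time per call.
    best = min(timeit.repeat(func, number=3, repeat=3)) / 3
    print(f"{func.__name__}: {best * 1e3:.2f} ms per call")
```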

score 2 · Answer 2 · answered Sep 03 '14 at 17:00

If you know the size of the array at the end of the run, then it is going to be much faster to pre-allocate an array of the appropriate size and then set the values. If you do need to append on the fly, it's probably better not to do this one element at a time, and instead to append as few times as possible so that you avoid generating many copies over and over again. You might also want to profile the difference in timings between np.append, np.hstack, np.concatenate, etc.
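One way to "append as few times as possible" is to collect the pieces in a list and concatenate once at the end. This is a minimal sketch, assuming the data arrives as 1-D chunks. The helper collect_in_chunks and the batches list are hypothetical names used only for illustration.

```python
import numpy as np

def collect_in_chunks(chunks):
    """Join an iterable of 1-D arrays with a single copy at the end."""
    parts = []
    for chunk in chunks:
        parts.append(np.asarray(chunk))  # cheap: only a reference is stored
    if not parts:
        return np.empty(0)
    return np.concatenate(parts)  # one allocation, one pass of copying

# Hypothetical data source: three batches of different sizes.
batches = [np.arange(3), np.arange(3, 5), np.arange(5, 10)]
combined = collect_in_chunks(batches)
print(combined)
```

Each chunk is copied only once, in the final np.concatenate call, instead of being recopied every time the array grows.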
