I have a problem where I use a lot of one dimensional arrays and time complexity is really important. My fist thought was that Numpy would be the best option in every case but after testing to append and assign to a Numpy array compared to the built in list function with the following code:
import time
import numpy as np
list_size = 10000
iterations = 1000
start_time = time.time()
for i in range(iterations):
numpy_arr_append = np.array([])
for i in range(list_size):
numpy_arr_append = np.append(numpy_arr_append, i)
avg_np_append = (time.time()-start_time)/iterations
print('Average time for numpy array append:', avg_np_append)
start_time = time.time()
for i in range(iterations):
arr_append = []
for i in range(list_size):
arr_append.append(i)
avg_list_append = (time.time()-start_time)/iterations
print('Average time for regular list append:', avg_list_append)
print('Regular list is:', avg_np_append/avg_list_append, 'times faster.')
print('---------------------------------------------------')
start_time = time.time()
for i in range(iterations):
numpy_arr_assign = np.zeros(list_size)
for i in range(list_size):
numpy_arr_assign[i] = i
avg_np_assign = (time.time()-start_time)/iterations
print('Average time for numpy array assign:', avg_np_assign)
start_time = time.time()
for i in range(iterations):
list_assign = [0] * list_size
for i in range(list_size):
list_assign[i] = i
avg_list_assign = (time.time()-start_time)/iterations
print('Average time for regular list assign:', avg_list_assign)
print('Regular list is:', avg_np_assign/avg_list_assign, 'times faster.')
I get the following result:
Average time for numpy array append: 0.06636584520339966
Average time for regular list append: 0.0010602495670318604
Regular list is: 62.59455062942306 times faster.
---------------------------------------------------
Average time for numpy array assign: 0.0017054696083068847
Average time for regular list assign: 0.0009176504611968995
Regular list is: 1.8585176823018494 times faster.
So my question is, what is the best practice when using large arrays and adding elements multiple times. From my testing the built-in function was always better and there was hardly any difference between appending to it or initializing it and then assigning values. Is there something I am missing or is there another better/faster option?
Best regards, Jakob