
I have a routine that fetches elements on every iteration and appends them to a list. After a certain number of iterations (say 1 million), I want to append the list to a numpy array, then empty the list and continue the process.

I have declared an empty numpy array as

a= np.array([], dtype="int32")

b = [1,2,3,4] is my list after the first 1 million iterations,

b = [5,4,3,2] is the list after the second 1 million iterations.

How do I keep appending the list b to the numpy array a after every 1 million iterations?

I need an output as given below

array([[1, 2, 3, 4],
       [5, 4, 3, 2]])

I have tried "concatenate" and "vstack", but the dimensions of a (when empty) and b don't match, so the code raises an error.

The list is going to hold as many as 1 million elements, so I need a cost-effective way to do the append. I could also "vstack" the elements on each iteration, but that would copy the huge array every time, which would not be efficient. I tried the code below, which works fine, but I want to avoid the check at every iteration.

if any(a):
    a=np.vstack((a,b))
else:
    a=np.append(a,b, axis=0)

Is there any way I can append a list to a numpy array without performing the check?

Sam
  • You may find this interesting: http://stackoverflow.com/a/5068182/1716866. – leekaiinthesky May 17 '15 at 09:06
  • Note that appending to a numpy array is never efficient. Numpy arrays cannot grow dynamically in size, so every concatenation will create a whole new array. If you know the maximum number of iterations you could first initialize the whole matrix and just fill it row by row. A slightly uglier approach would be to store a list of numpy arrays and create the big array in the end. – cel May 17 '15 at 09:06
  • If you want to append to an empty array, you have to specify its dimensions. For instance, `np.empty((0,4))`. – jasaarim May 17 '15 at 12:55
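The two suggestions in the comments above could be sketched roughly as follows. This assumes every batch has exactly four elements, as in the question; `batches` and `n_batches` are illustrative names standing in for the lists the question's loop would produce.

```python
import numpy as np

# Stand-ins for the lists built in the question (assumed fixed length 4).
batches = [[1, 2, 3, 4], [5, 4, 3, 2]]

# Option 1: start from an empty array of shape (0, 4), so vstack needs no
# special case for the first batch. Each call still copies the whole array.
a = np.empty((0, 4), dtype="int32")
for b in batches:
    a = np.vstack((a, np.asarray(b, dtype="int32")))

# Option 2: preallocate when the number of batches is known in advance,
# then fill the array row by row without any copying.
n_batches = len(batches)
pre = np.empty((n_batches, 4), dtype="int32")
for row, b in enumerate(batches):
    pre[row] = b
```

Option 1 removes the emptiness check but keeps the repeated copying; option 2 avoids both, at the cost of needing the final size up front.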

1 Answer


I would recommend not appending to the array, as that can be very inefficient. Instead, you could use a deque to collect the lists, and build an array from it only when you need it. Here is an example:

from collections import deque
import numpy as np

lists = deque()
for i in range(1, 13, 4):
    lists.append(list(range(i, i + 4)))

result = np.array(lists)

Now we have

>>> result
array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In CPython, a deque is implemented as a doubly linked list of blocks, so appending a new element never requires reallocating memory for the whole container.
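Applied to the two batches from the question, the same idea might look like the sketch below. `fetch_batch` is a hypothetical stand-in for whatever builds the list `b` over 1 million iterations; it is not part of the original code.

```python
from collections import deque
import numpy as np

def fetch_batch(k):
    """Hypothetical stand-in for the question's per-batch list building."""
    return [[1, 2, 3, 4], [5, 4, 3, 2]][k]

rows = deque()
for k in range(2):
    b = fetch_batch(k)  # a fresh list object for every batch
    rows.append(b)      # constant-time append, no array copy, no emptiness check

# Convert once, at the end, when the array is actually needed.
a = np.array(rows, dtype="int32")
```

One caveat with this sketch: the deque stores a reference to each list, so if `b` were emptied in place (for example with `b.clear()`), the stored row would be emptied too. Rebinding `b` to a new list, or appending a copy, avoids that.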

jasaarim