180

Is there a way to initialize a numpy array of a given shape and add to it? I will explain what I need with a list example. If I want to create a list of objects generated in a loop, I can do:

a = []
for i in range(5):
    a.append(i)

I want to do something similar with a numpy array. I know about vstack, concatenate etc. However, it seems these require two numpy arrays as inputs. What I need is:

big_array # Initially empty. This is where I don't know what to specify
for i in range(5):
    array i of shape = (2,4) created.
    add to big_array

big_array should end up with shape (10,4). How can I do this?


EDIT:

I want to add the following clarification. I am aware that I can define big_array = numpy.zeros((10,4)) and then fill it up. However, this requires specifying the size of big_array in advance. I know the size in this case, but what if I do not? When we use the .append function for extending the list in python, we don't need to know its final size in advance. I am wondering if something similar exists for creating a bigger array from smaller arrays, starting with an empty array.

Blaszard
Curious2learn
  • Incidentally your first code sample can be written neatly and succinctly as a list comprehension: `[i for i in range(5)]`. (Equivalently: `list(range(5))`, though this is a contrived example.) – Katriel Dec 26 '10 at 21:06
  • 1
    what solution worked for you? i'm trying to do something similar like `x = numpy.array()` just the way we would do to a list like `y = []` ; but it didn't work – kRazzy R Feb 14 '18 at 20:14

14 Answers

210

numpy.zeros

Return a new array of given shape and type, filled with zeros.

or

numpy.ones

Return a new array of given shape and type, filled with ones.

or

numpy.empty

Return a new array of given shape and type, without initializing entries.


However, the list-style approach of building an array by appending elements one at a time is rarely used in numpy, because it is less efficient (numpy arrays are much closer to the underlying fixed-size C arrays). Instead, you should preallocate the array to the size you need and then fill in the rows. You can use numpy.append if you must, though.
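A minimal sketch of the preallocate-and-fill pattern, assuming the final shape (10, 4) from the question is known in advance. The block contents (`i * np.ones(...)`) and the variable names are only illustrative.

```python
import numpy as np

n_blocks, block_rows, n_cols = 5, 2, 4  # sizes taken from the question

# Preallocate the full result once.
big_array = np.zeros((n_blocks * block_rows, n_cols))

for i in range(n_blocks):
    block = i * np.ones((block_rows, n_cols))  # illustrative (2, 4) block
    start = i * block_rows
    big_array[start:start + block_rows] = block  # fill rows in place

print(big_array.shape)  # (10, 4)

# np.append, by contrast, returns a new array and leaves its input untouched.
a = np.zeros(3)
b = np.append(a, 1.0)
print(a.shape, b.shape)  # (3,) (4,)
```

Because `np.append` builds a new array on every call, using it inside a loop repeats the copy each iteration, which is why preallocation is usually preferred when the size is known.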

ryanjdillon
Katriel
  • 2
    I know that I can set big_array = numpy.zeros and then fill it with the small arrays created. This, however, requires me to specify the size of big_array in advance. Is there nothing like the list's .append, where I don't have to specify the size in advance? Thanks! – Curious2learn Dec 26 '10 at 21:34
  • 2
    @Curious2learn. No, there is nothing quite like append in Numpy. There are functions that concatenate arrays or stack them by making new arrays, but they do not do so by appending. This is because of the way that the data structures are set up. Numpy arrays are made to be fast by virtue of being able to store values more compactly, but they need to have a fixed size to obtain this speed. Python lists are designed to be more flexible at the cost of speed and size. – Justin Peel Dec 26 '10 at 22:10
  • 3
    @Curious: well, there is an `append` in numpy. It's just that it's less efficient not to preallocate (in this case, much less efficient, since `append`ing copies the entire array each time), so it's not a standard technique. – Katriel Dec 26 '10 at 22:50
  • 1
    What if only part of the `np.empty` array is filled by values? What about the remaining "empty" items? – wsdzbm Jul 26 '16 at 16:36
  • 1
    If you only know the width (e.g. as needed for `np.concatenate()`), you can initialize with: `np.empty((0, some_width))`. The 0 means there are no rows, so your first array won't be garbage. – NumesSanguis Sep 01 '17 at 05:59
  • 1
    At least when I'm trying it, `np.empty` is not actually empty but holds some values, which I guess are whatever was in memory at the space where the array was allocated. – Yoav Vollansky May 05 '20 at 17:26
51

I usually do this by creating a regular list, appending my arrays to it, and finally converting the list to a numpy array as follows:

import numpy as np
big_array = [] #  empty regular list
for i in range(5):
    arr = i*np.ones((2,4)) # for instance
    big_array.append(arr)
big_np_array = np.array(big_array)  # transformed to a numpy array

Of course, your data takes twice the space in memory at the creation step, but appending to a Python list is very fast, and so is creating the array with np.array().
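One detail worth checking: calling `np.array` on a list of five (2, 4) arrays adds a new leading axis, giving shape (5, 2, 4). A small sketch, assuming the goal is the (10, 4) result from the question, joins the collected blocks along the first axis with `np.concatenate` instead. The names `blocks`, `stacked` and `joined` are illustrative.

```python
import numpy as np

blocks = []  # regular Python list, cheap to grow
for i in range(5):
    blocks.append(i * np.ones((2, 4)))  # for instance

stacked = np.array(blocks)       # adds a new leading axis
joined = np.concatenate(blocks)  # joins along the existing first axis

print(stacked.shape)  # (5, 2, 4)
print(joined.shape)   # (10, 4)
```

`np.concatenate` is called once here, after the loop, so the data is copied a single time rather than on every iteration.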

mad7777
    **This is not the way to go if you know the size of the array ahead of time**, however... I end up using this method frequently when I don't know how big the array will end up being. For example, when reading data from a file or another process. It isn't really as awful as it may seem at first since python and numpy are pretty clever. – travc Feb 09 '13 at 22:55
33

Introduced in numpy 1.8:

numpy.full

Return a new array of given shape and type, filled with fill_value.

Examples:

>>> import numpy as np
>>> np.full((2, 2), np.inf)
array([[ inf,  inf],
       [ inf,  inf]])
>>> np.full((2, 2), 10)
array([[10, 10],
       [10, 10]])
Franck Dernoncourt
17

The array analogue of Python's

a = []
for i in range(5):
    a.append(i)

is:

import numpy as np

a = np.empty((0))
for i in range(5):
    a = np.append(a, i)
Rosa Alejandra
Adobe
8

You do want to avoid explicit loops as much as possible when doing array computing, as that reduces the speed gain from that form of computing. There are multiple ways to initialize a numpy array. If you want it filled with zeros, do as katrielalex said:

big_array = numpy.zeros((10,4))

EDIT: What sort of sequence is it you're making? You should check out the different numpy functions that create arrays, like numpy.linspace(start, stop, num) (equally spaced numbers), or numpy.arange(start, stop, step). Where possible, these functions will make arrays substantially faster than doing the same work in explicit loops.
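As a short illustration of these two constructors, here is a sketch with arbitrarily chosen values. The explicit loop at the end is included only to show that it produces the same array.

```python
import numpy as np

# 5 equally spaced numbers from 0 to 1, endpoints included.
evenly_spaced = np.linspace(0.0, 1.0, 5)
print(evenly_spaced.tolist())  # [0.0, 0.25, 0.5, 0.75, 1.0]

# Integers from 0 up to (but not including) 10, in steps of 2.
stepped = np.arange(0, 10, 2)
print(stepped.tolist())  # [0, 2, 4, 6, 8]

# The same values built with an explicit Python loop, for comparison.
looped = np.array([2 * i for i in range(5)])
print(np.array_equal(stepped, looped))  # True
```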

Andreas Løve Selvik
8

To initialize a numpy array with a specific matrix:

import numpy as np

mat = np.array([[1, 1, 0, 0, 0],
                [0, 1, 0, 0, 1],
                [1, 0, 0, 1, 1],
                [0, 0, 0, 0, 0],
                [1, 0, 1, 0, 1]])

print(mat.shape)
print(mat)

output:

(5, 5)
[[1 1 0 0 0]
 [0 1 0 0 1]
 [1 0 0 1 1]
 [0 0 0 0 0]
 [1 0 1 0 1]]
JayS
7

numpy.fromiter() is what you are looking for:

big_array = numpy.fromiter(range(5), dtype="int")  # use xrange on Python 2

It also works with generator expressions, e.g.:

big_array = numpy.fromiter((i*(i+1)//2 for i in range(5)), dtype="int")  # // keeps the values integers

If you know the length of the array in advance, you can specify it with an optional 'count' argument.
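A minimal sketch of the `count` argument, written for Python 3 and assuming the length `n` is known up front. The name `triangular` is illustrative.

```python
import numpy as np

n = 5
# Triangular numbers; integer division keeps the generated values ints.
# count=n lets fromiter allocate the output once instead of growing it.
triangular = np.fromiter((i * (i + 1) // 2 for i in range(n)), dtype=int, count=n)

print(triangular.tolist())  # [0, 1, 3, 6, 10]
print(triangular.shape)     # (5,)
```

Note that the result here is one-dimensional, so for the (10, 4) case in the question the output would still need a reshape.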

Quant Metropolis
  • 3
    I actually ran timeit, and I think that np.fromiter() might be slower than np.array(). timeit("np.array(i for i in xrange(100))", setup="import numpy as np", number = 10000) -> 0.02539992332458496, versus timeit("np.fromiter((i for i in xrange(100)), dtype=int)", setup="import numpy as np", number = 10000) -> 0.13351011276245117 – hlin117 Oct 07 '14 at 17:50
7

For your first array example use,

a = numpy.arange(5)

To initialize big_array, use

big_array = numpy.zeros((10,4))

This assumes you want to initialize with zeros, which is pretty typical, but there are many other ways to initialize an array in numpy.

Edit: If you don't know the size of big_array in advance, it's generally best to first build a Python list using append, and when you have everything collected in the list, convert this list to a numpy array using numpy.array(mylist). The reason for this is that lists are meant to grow very efficiently and quickly, whereas numpy.concatenate would be very inefficient since numpy arrays don't change size easily. But once everything is collected in a list, and you know the final array size, a numpy array can be efficiently constructed.
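A small sketch of this list-first approach, assuming a loop whose number of iterations is not known in advance. The stopping condition and the row contents are made up for illustration.

```python
import numpy as np

rows = []   # Python list; append is cheap
value = 1
while value < 100:              # number of iterations not known up front
    rows.append([value, value ** 2])
    value *= 3

# Convert once, after everything has been collected.
result = np.array(rows)
print(result.shape)  # (5, 2)
```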

tom10
4

I realize that this is a bit late, but I did not notice any of the other answers mentioning indexing into the empty array:

import numpy

big_array = numpy.empty((10, 4))  # the shape must be passed as a single tuple
for i in range(5):
    array_i = numpy.random.random((2, 4))  # the size is also a tuple
    big_array[2 * i:2 * (i + 1), :] = array_i

This way, you preallocate the entire result array with numpy.empty and fill in the rows as you go using indexed assignment.

It is perfectly safe to preallocate with empty instead of zeros in the example you gave since you are guaranteeing that the entire array will be filled with the chunks you generate.

Mad Physicist
3

I'd suggest defining shape first. Then iterate over it to insert values.

big_array = np.zeros(shape=(6, 2))
for it in range(6):
    big_array[it] = (it, it)  # For example

>>> big_array

array([[ 0.,  0.],
       [ 1.,  1.],
       [ 2.,  2.],
       [ 3.,  3.],
       [ 4.,  4.],
       [ 5.,  5.]])
GT GT
3

Whenever you are in the following situation:

a = []
for i in range(5):
    a.append(i)

and you want something similar in numpy, several previous answers have pointed out ways to do it, but as @katrielalex pointed out these methods are not efficient. The efficient way to do this is to build a long list and then reshape it the way you want after you have a long list. For example, let's say I am reading some lines from a file and each row has a list of numbers and I want to build a numpy array of shape (number of lines read, length of vector in each row). Here is how I would do it more efficiently:

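import numpy as np

long_list = []
counter = 0
with open('filename', 'r') as f:
    for row in f:
        row_list = [float(x) for x in row.split()]  # parse the numbers in this row
        long_list.extend(row_list)
        counter += 1  # count the rows read
# now we have a long list and we are ready to reshape
# (this assumes every row has the same number of values)
result = np.array(long_list).reshape(counter, len(row_list))  # desired numpy array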
Heapify
2

Maybe something like this will fit your needs..

import numpy as np

N = 5
res = []

for i in range(N):
    res.append(np.cumsum(np.ones(shape=(2,4))))

res = np.array(res).reshape((10, 4))
print(res)

Which produces the following output

[[ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]
 [ 1.  2.  3.  4.]
 [ 5.  6.  7.  8.]]
0

If you want to add items to a multi-dimensional array, here is a solution.

import numpy as np
big_array = np.ndarray(shape=(0, 2, 4))  # Empty with height and width 2, 4 and length 0

for i in range(5):
    item = np.full((1, 2, 4), i)  # for instance; each item needs a leading axis of length 1
    big_array = np.concatenate((big_array, item))

Here is the official numpy documentation for reference

Brady Huang
0
# https://thispointer.com/create-an-empty-2d-numpy-array-matrix-and-append-rows-or-columns-in-python/

import numpy as np

# Create an empty Numpy array with 4 columns and 0 rows
empty_array = np.empty((0, 4), int)

# Append a row to the 2D Numpy array
empty_array = np.append(empty_array, np.array([[11, 21, 31, 41]]), axis=0)
# Append a second row to the 2D Numpy array
empty_array = np.append(empty_array, np.array([[15, 25, 35, 45]]), axis=0)
print('2D Numpy array:')
print(empty_array)

Note that each appended np.array must be 2-dimensional.

JeeyCi