4

I would like to replace the first x values in every row of my array a with ones and keep all the other values NaN. The number x, however, differs from row to row and is given by a list b.

Since I'm not very familiar with arrays, I thought I might do this with a for loop as shown below, but this doesn't work (I took the basics of replacement in arrays from "How to set first N elements of array to zero?").

In:
a = np.empty((3,4))
a.fill(np.nan)
b = [2,3,1]

for i in range(b):
    a[0:b[i]] = [1] * b[i] 
    a[i:] = np.ones((b[i]))
pass

Out:
line 7:
ValueError: could not broadcast input array from shape (2) into shape (2,4)

Result should be like:

Out:
[[1, 1, nan, nan], 
 [1, 1, 1, nan], 
 [1, nan, nan, nan]] 
S Verhoef
  • 1
    When you post code like this, you should say more than 'it doesn't work'. What was wrong? What kind of error? A minor tweak gets it to work: `for i in range(3): a[i,0:b[i]] = 1` You didn't fully apply the linked answer. – hpaulj Aug 11 '17 at 17:05
  • @hpaulj You're right, I've added it now for completeness. If I use your tweak, the last line raises an error: `(ValueError: could not broadcast input array from shape (2) into shape (3,4))`. If I delete that line, it all works fine! – S Verhoef Aug 14 '17 at 10:28
  • @hpaulj My bad, it doesn't work then. The whole array is then filled with ones. However, your solution with enumerate works perfectly. – S Verhoef Aug 14 '17 at 10:39

4 Answers

4

In the linked answer, "How to set first N elements of array to zero?", the solution for arrays is

y = numpy.array(x)
y[0:n] = 0

In other words if we are filling a slice (range of indices) with the same number we can specify a scalar. It could be an array of the same size, e.g. np.ones(n). But it doesn't have to be.

So we just need to iterate over the rows of a (and elements of b) and perform this indexed assignment

In [368]: a = np.ones((3,4))*np.nan
In [369]: for i in range(3):
     ...:     a[i,:b[i]] = 1
     ...:     
In [370]: a
Out[370]: 
array([[  1.,   1.,  nan,  nan],
       [  1.,   1.,   1.,  nan],
       [  1.,  nan,  nan,  nan]])

There are various ways of 'filling' the original array with nan. np.full does an np.empty followed by a copyto.
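
As a rough illustration of those 'filling' options, the sketch below builds the same 3x4 NaN array three ways (the names a1, a2 and a3 are only for this example) and checks that every element is NaN in each case:

```python
import numpy as np

# Three ways to build a 3x4 array of NaN; the names are illustrative only.
a1 = np.empty((3, 4))
a1.fill(np.nan)                 # the approach used in the question
a2 = np.ones((3, 4)) * np.nan   # the approach used in this answer
a3 = np.full((3, 4), np.nan)    # the one-call version

# NaN never compares equal to itself, so test with np.isnan instead of ==.
all_nan = all(np.isnan(x).all() for x in (a1, a2, a3))
print(all_nan)
```

Which one to use is mostly a matter of readability; the end result is the same array.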

A variation on the row iteration is with for i,n in enumerate(a):.

Another good way of iterating in a coordinated sense is with zip.

In [371]: for i,x in zip(b,a):
     ...:     x[:i] = 1

This takes advantage of the fact that iterating over a 2d array iterates over its rows. So x is a 1d view of a and can be changed in-place.
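
A small sketch of why the zip loop modifies a itself: each row handed out by the iteration shares memory with a, so assigning into a slice of the row writes through. The variable names n and row are chosen for this example only.

```python
import numpy as np

a = np.full((3, 4), np.nan)
b = [2, 3, 1]

for n, row in zip(b, a):
    # Iterating a 2d array yields 1d views of its rows, not copies.
    assert np.shares_memory(row, a)
    row[:n] = 1  # writes through to the corresponding row of a

print(a)
```

If the rows were copies instead (for example after `a.tolist()`), the same loop would leave a unchanged.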

But with a bit of indexing trickery, we don't even have to loop.

In [373]: a = np.full((3,4),np.nan)

In [375]: mask = np.array(b)[:,None]>np.arange(4)
In [376]: mask
Out[376]: 
array([[ True,  True, False, False],
       [ True,  True,  True, False],
       [ True, False, False, False]], dtype=bool)
In [377]: a[mask] = 1
In [378]: a
Out[378]: 
array([[  1.,   1.,  nan,  nan],
       [  1.,   1.,   1.,  nan],
       [  1.,  nan,  nan,  nan]])
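
The mask works through broadcasting: np.array(b)[:,None] has shape (3,1), np.arange(4) has shape (4,), and comparing them produces a (3,4) boolean array that is True wherever the column index is smaller than that row's count. A minimal sketch wrapping this in a function follows; leading_ones is a hypothetical helper name, not a NumPy function, and np.where is used here in place of the masked assignment.

```python
import numpy as np

def leading_ones(counts, width):
    """Row i gets counts[i] leading ones, NaN elsewhere (hypothetical helper)."""
    counts = np.asarray(counts)
    # (rows, 1) compared with (width,) broadcasts to a (rows, width) mask.
    mask = counts[:, None] > np.arange(width)
    # Pick 1.0 where the mask is True and NaN where it is False.
    return np.where(mask, 1.0, np.nan)

result = leading_ones([2, 3, 1], 4)
print(result)
```

The width is passed explicitly here, so the result keeps all 4 columns even when no count reaches 4.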

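This is a favorite of one of the top numpy posters, @Divakar; see "Numpy: Fix array with rows of different lengths by filling the empty elements with zeros".

It can be used to pad a list of lists. Speaking of padding, itertools has a handy tool, zip_longest (the Python 3 name; it is izip_longest in Python 2). Note that the result is only as wide as the longest row, so here it has 3 columns rather than 4: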

In [380]: np.array(list(itertools.zip_longest(*[np.ones(x).tolist() for x in b],fillvalue=np.nan))).T
Out[380]: 
array([[  1.,   1.,  nan],
       [  1.,   1.,   1.],
       [  1.,  nan,  nan]])

Your question should have specified what was wrong; what kinds of errors you got:

for i in w2:
    a[0:b[i]] = [1] * b[i] 
    a[i:] = np.ones((b[i]))

w2 is unspecified, but probably is range(3).

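a[0:b[i]] is wrong because it selects whole rows (rows 0 up to b[i], with all 4 columns each), whereas you are working on just one row at a time. a[i:] selects a range of rows as well. That is why the assignment fails to broadcast a length-2 sequence into a (2,4) block.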
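
To make the difference concrete, here is a short sketch that prints the shapes selected by each indexing expression for i = 0, using the a and b from the question:

```python
import numpy as np

a = np.full((3, 4), np.nan)
b = [2, 3, 1]
i = 0

print(a[0:b[i]].shape)     # rows 0 and 1, all columns: (2, 4)
print(a[i:].shape)         # rows i to the end, all columns: (3, 4)
print(a[i, 0:b[i]].shape)  # row i only, first b[i] columns: (2,)

# Only the last form matches the intent, so this is the assignment to use.
a[i, 0:b[i]] = 1
```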

hpaulj
  • Nice answer! I guess the solution involving masks is the fastest. The line `mask = np.array(b)[:,None] > np.arange(4)` is pure genius :) – Kruupös Aug 11 '17 at 17:54
  • Thanks a lot, nice explanation :) For completeness I'll replace w2 and write down the errors I got. – S Verhoef Aug 14 '17 at 10:14
2

You can do this via a loop. Initialize an array of nan values then loop through the list of first n's and set values to 1 according to the n for each row.

import numpy as np

a = np.full((3, 4), np.nan)  # start with every value NaN
b = [2, 3, 1]
for i, x in enumerate(b):
    a[i, :x] = 1  # first x values of row i become 1
vielkind
  • 3
    You can use [`numpy.full`](https://docs.scipy.org/doc/numpy/reference/generated/numpy.full.html#numpy-full) to init your matrix with `np.NAN` values rather than zeros. `a = np.full((3, 4), np.NAN)` Otherwise nice solution! – Kruupös Aug 11 '17 at 15:16
  • Does it matter if I start by making my array all NaN, and so leave out your last line, compared to making them zero first and then (with your last line) making all leftovers zero? Maybe speed? – S Verhoef Aug 11 '17 at 15:29
  • 1
    @SVerhoef, mostly **readability**, which imo is sometimes more important than speed. – Kruupös Aug 11 '17 at 15:35
  • Answer has been updated to use `numpy.full` as suggested by @OwlMax, as it's a much more readable solution. Thanks for the suggestion! – vielkind Aug 11 '17 at 15:48
  • The distinction between `full` and `fill` is nit picky. Look at the code for `np.full`. The real question is - can we avoid the iteration? So far all the answers are minor variations on the original. – hpaulj Aug 11 '17 at 17:00
1

You can initialise your matrix using a list comprehension:

>>> import numpy as np
>>> b = [2, 3, 1]
>>> max_len = 4
>>> gen_array = lambda i: [1] * i + [np.nan] * (max_len - i)
>>> np.matrix([gen_array(i) for i in b])

With detailed steps:

[1] * N will create a list of length N filled with 1:

>>> [1] * 3
[1, 1, 1]

You can concatenate lists using +:

>>> [1, 2] + [3, 4]
[1, 2, 3, 4]

You just have to combine both: [1] * X + [np.nan] * (N - X) will create a list of length N whose first X elements are 1 and whose remaining elements are NaN.

Last one, the list comprehension:

[i for i in b]

is a "shortcut" (not really, but it is easier to understand) for:

a = []
for i in b:
    a.append(i)
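
Putting those steps together, a minimal sketch could look like the following. It swaps in a named function (gen_row, a name chosen for this example) for the lambda and builds a regular ndarray with np.array rather than np.matrix; both choices are assumptions made for illustration, not part of the answer above.

```python
import numpy as np

b = [2, 3, 1]
max_len = 4

def gen_row(n, width):
    # n ones followed by (width - n) NaN values, as a plain Python list
    return [1.0] * n + [np.nan] * (width - n)

# One list per entry of b, then convert the list of lists to a 2-D array.
rows = [gen_row(n, max_len) for n in b]
a = np.array(rows)
print(a)
```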
Kruupös
0
import numpy as np
a = np.random.rand(3, 4)  # Create a matrix of random numbers (np.empty or anything else works too)
b = [1, 2, 3]  # Your 'b' list
for idr, row in enumerate(a):  # Loop through the matrix by row
    a[idr, :b[idr]] = 1  # idr is the row index; columns 0 up to (not including) b[idr] become 1
    a[idr, b[idr]:] = np.nan  # the remaining values in the row become NaN
print(a)  # Outputs the matrix
# [[  1.  nan  nan  nan]
#  [  1.   1.  nan  nan]
#  [  1.   1.   1.  nan]]
Gabriel Belini
  • This is an overly complex duplicate of an answer already posted – Brad Solomon Aug 11 '17 at 14:24
  • The answer was posted 13 minutes ago, imo it's ok to downvote duplicate answers of things that are already settled but when I started doing this the other answer wasn't even posted yet and I just saw it now. – Gabriel Belini Aug 11 '17 at 14:27