vectorize array: construct matrix with 1 in specified places and 0 elsewhere

Question

I have a (1-dimensional) numpy array a of length L, filled with numbers from 0 to N-1.
Now, I want to construct an NxL matrix such that, in each column c, the a[c]-th entry is 1 and all other entries are 0.

For example, if L=4, N=5 and

a = np.array([1,2,0,4])

then we'd want a matrix

m = np.array([[0,0,1,0],
              [1,0,0,0],
              [0,1,0,0],
              [0,0,0,0],
              [0,0,0,1]])

Now, I have the following code:

def vectorize(a, L, N):
    m = np.zeros((N, L))
    for (i,x) in enumerate(a):
        m[x][i] = 1.0

    return m

This works fine, but I'm sure there is a faster method using some numpy trick (that avoids looping over a).

The second linked duplicate is a good resource on one-hot encoding, I think `np.eye(a.max()+1)[a]` is a clean approach — user3483203, Sep 07 '19 at 17:55
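As a rough sketch of the `np.eye` idea from the comment above: indexing the identity matrix with `a` yields one row per element of `a`, i.e. an LxN array, so a transpose is needed to get the NxL layout the question asks for. The names `one_hot_rows` and `m_eye` are made up for this illustration, and N is passed explicitly on the assumption that the largest value N-1 may not actually occur in `a`.

```python
import numpy as np

a = np.array([1, 2, 0, 4])
N = 5  # passed explicitly; a.max() + 1 only equals N if N-1 occurs in a

# np.eye(N)[a] picks row a[c] of the identity for each c, giving an L x N array.
one_hot_rows = np.eye(N, dtype=int)[a]

# Transpose to get the N x L layout asked for in the question.
m_eye = one_hot_rows.T
print(m_eye)
```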

score 3 · Answer 1 · answered Sep 07 '19 at 16:15

When you use an array of integers as an index, you need other arrays that broadcast to the same shape to indicate the placement in the other dimensions. In your case, each element of a is a row index. The corresponding column is:

b = np.arange(L)

Now you can index directly into the matrix m:

m = np.zeros((N, L), dtype=bool)
m[a, b] = True

When you index a numpy array, you should use all the indices in a single bracket operator, rather than separate operators like m[a][b]. m[a] is a copy of the portion of m when a is an array of integers, but a view of the original data when a is a single integer, which is the only reason your example works.
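The copy-versus-view point can be checked with a small experiment. This is only a sketch using the sample data from the question; `m1` and `m2` are illustrative names.

```python
import numpy as np

a = np.array([1, 2, 0, 4])
b = np.arange(len(a))

# Chained indexing: m1[a] is fancy indexing and returns a copy,
# so the assignment lands in a temporary and m1 stays all False.
m1 = np.zeros((5, 4), dtype=bool)
m1[a][b] = True

# Single bracket: rows from a are paired with columns from b,
# and m2 itself is modified in place.
m2 = np.zeros((5, 4), dtype=bool)
m2[a, b] = True

print(m1.any())  # False
print(m2.sum())  # 4
```

With a single integer in place of `a`, `m[x]` is a view, which is why the loop in the question does write into `m`.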

score 2 · Accepted Answer · answered Sep 07 '19 at 16:14

You can use np.arange(...) to supply the column indices for the second axis:

def vectorize(a, L, N):
    m = np.zeros((N, L), int)
    m[a, np.arange(len(a))] = 1
    return m

So for the given sample input, we get:

>>> a = np.array([1,2,0,4])
>>> vectorize(a, 4, 5)
array([[0, 0, 1, 0],
       [1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 1]])
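For comparison, here is a different technique that is not part of this answer: a broadcast comparison instead of index assignment. It is a sketch only, and `vectorize_compare` is a hypothetical name. It assumes the values in `a` lie in the range 0 to N-1, as stated in the question.

```python
import numpy as np

def vectorize_compare(a, N):
    # A column of row numbers (N x 1) compared with a (length L)
    # broadcasts to an N x L boolean array; cast to int for 0/1 entries.
    return (np.arange(N)[:, None] == a).astype(int)

a = np.array([1, 2, 0, 4])
result = vectorize_compare(a, 5)
print(result)
```

This builds the whole array in one expression, but it performs N*L comparisons, whereas the index assignment above only touches L entries after the zero fill.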

score 2 · Answer 3 · answered Sep 07 '19 at 16:15

def vectorize(a, L, N):
    m = np.zeros((N, L))
    m[a, np.arange(L)] = 1
    return m
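One way to gain confidence that the indexed version matches the original loop is to compare both on random input. The sketch below assumes the hypothetical names `vectorize_loop` and `vectorize_indexed` and arbitrary sizes L=1000, N=50; it checks equality only and makes no claim about speed.

```python
import numpy as np

def vectorize_loop(a, L, N):
    # Loop version, as in the question.
    m = np.zeros((N, L))
    for i, x in enumerate(a):
        m[x, i] = 1.0
    return m

def vectorize_indexed(a, L, N):
    # Index-array version: row indices from a, column indices 0..L-1.
    m = np.zeros((N, L))
    m[a, np.arange(L)] = 1
    return m

rng = np.random.default_rng(0)
L, N = 1000, 50
a = rng.integers(0, N, size=L)  # values in 0..N-1

same = np.array_equal(vectorize_loop(a, L, N), vectorize_indexed(a, L, N))
print(same)
```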

