How does this one-hot vector conversion work?

Question

When I was working on my machine learning project, I was looking for a line of code to turn my labels into one-hot vectors. I came across this nifty line of code from u/benanne on Reddit.

np.eye(n_labels)[target_vector]

For example, for a target_vector = np.array([1, 4, 2, 1, 0, 1, 3, 2]), it returns the one-hot coded values:

np.eye(5)[target_vector]
Out: 
array([[ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.],
       [ 0.,  0.,  1.,  0.,  0.],
       ..., 
       [ 0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  1.,  0.,  0.]])

While it definitely does work, I'm not sure how or why it works.

It's the second answer on the [top-voted SO question](https://stackoverflow.com/questions/29831489/numpy-1-hot-array). It should probably be the accepted answer, since it handles n-d 1HA's so well. – Daniel F, Jul 13 '17 at 06:58

score 9 · Answer 1 · answered Jul 12 '17 at 23:10

It's rather simple. np.eye(n_labels) creates an identity matrix of size n_labels × n_labels. Indexing that matrix with target_vector then selects one row per target, namely the row whose index equals that target's value. Since each row of an identity matrix contains exactly one 1 and zeros everywhere else, each selected row is a valid one-hot encoding.
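The two steps can be sketched as follows, using the target_vector from the question. The names identity and one_hot are chosen here for illustration only.

```python
import numpy as np

target_vector = np.array([1, 4, 2, 1, 0, 1, 3, 2])
n_labels = 5

# Step 1: a 5x5 identity matrix; row i has a single 1 in column i.
identity = np.eye(n_labels)

# Step 2: indexing with an integer array picks row target_vector[k] for each k.
one_hot = identity[target_vector]

print(one_hot.shape)  # (8, 5)
print(one_hot[1])     # the row for label 4

# The position of the 1 in each row recovers the original label.
assert (one_hot.argmax(axis=1) == target_vector).all()
```

Note that np.eye returns floats by default; passing dtype=int gives integer one-hot rows if that is preferred.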

JohanL

This is indexing a NumPy array using another array as described here: https://docs.scipy.org/doc/numpy/user/basics.indexing.html#index-arrays. The first array is the Identity matrix of size _n_labels_. The second array selects the one-hot row corresponding to each target. – zardosht Oct 11 '18 at 10:04
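As a small sketch of the index-array behaviour this comment describes, and of the n-d case mentioned earlier in the thread: the result takes the shape of the index array, with one extra axis of length n_labels appended. The 2-D labels array below is made up for illustration.

```python
import numpy as np

n_labels = 3
# Hypothetical 2-D array of integer labels, each in the range 0..n_labels-1.
labels = np.array([[0, 2],
                   [1, 1]])

# Each label is replaced by the matching row of the identity matrix.
one_hot = np.eye(n_labels, dtype=int)[labels]

print(one_hot.shape)  # (2, 2, 3)
print(one_hot[0, 1])  # [0 0 1]
```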

score 0 · Answer 2 · answered Feb 13 '18 at 21:19

Indexing with a list, as in ndarray[[0]], selects the first row of the ndarray. In the example below, t[[1]] selects the second row:

import numpy as np

t = np.arange(9).reshape(3, 3)
print(t)
print(t[[1]])  # a list index selects row 1 and keeps the result 2-D

Output is

[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[3 4 5]]
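To connect this to the one-hot trick, here is a short sketch of how a list index differs from a plain integer index, and how it can select several rows at once. The names row, rows and picked are illustrative.

```python
import numpy as np

t = np.arange(9).reshape(3, 3)

row = t[1]             # integer index: drops an axis, shape (3,)
rows = t[[1]]          # list index: keeps the row axis, shape (1, 3)
picked = t[[2, 0, 2]]  # rows may repeat and appear in any order

print(row.shape, rows.shape, picked.shape)  # (3,) (1, 3) (3, 3)
print(picked)
```

Selecting repeated rows in an arbitrary order is exactly what np.eye(n_labels)[target_vector] does with the rows of the identity matrix.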

Jason
