Turn numeric vector into boolean matrix

Question

I have a column vector in a dataframe and would like to turn it into a binary matrix so I can do matrix multiplication with it later on.

y_labels
1
4
4
3

Desired output:

1 0 0 0
0 0 0 1
0 0 0 1
0 0 1 0

In Octave I would do something like `y_matrix = (y_labels == [1 2 3 4])`. However, I can't figure out how to do this in R. Does anybody know how?

Try `dummy::dummy(data.frame(myF = factor(df1$y_labels, levels = 1:4)))` — zx8754, Feb 05 '18 at 11:14
did you mean to write dummies::dummy(data.frame(myF = factor(df1$y_labels, levels = 1:4)))? — zipline86, Feb 08 '18 at 07:57
They are different packages, both should give similar results. — zx8754, Feb 08 '18 at 07:58

akrun · Answer 1 · 2018-02-05T10:32:32.633

3

We can use `model.matrix` to convert the column into a binary indicator matrix, with one column per level:

model.matrix(~ -1 + factor(y_labels, levels = 1:4), df1)

Or with `table`, cross-tabulating the row index against the factor:

with(df1, table(1:nrow(df1), factor(y_labels, levels = 1:4)))
#    1 2 3 4
#  1 1 0 0 0
#  2 0 0 0 1
#  3 0 0 0 1
#  4 0 0 1 0

Or more compactly

+(sapply(1:4, `==`, df1$y_labels))
#      [,1] [,2] [,3] [,4]
#[1,]    1    0    0    0
#[2,]    0    0    0    1
#[3,]    0    0    0    1
#[4,]    0    0    1    0
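For readers coming from Octave, a closer analogue of the broadcast `y_labels == [1 2 3 4]` may be `outer()`. The following is a minimal sketch rather than part of the original answer; it assumes the example data from the question and that the classes are known to be 1 through 4.

```r
# Example data taken from the question
df1 <- data.frame(y_labels = c(1, 4, 4, 3))

# outer() applies `==` to every (label, class) pair,
# giving a logical matrix with one row per label and one column per class
y_logical <- outer(df1$y_labels, 1:4, "==")

# Unary plus coerces the logical matrix to 0/1 integers
y_matrix <- +y_logical
y_matrix
```

The logical matrix can be used directly in arithmetic as well, since R treats `TRUE` as 1 and `FALSE` as 0.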

edited Feb 05 '18 at 10:32

answered Feb 05 '18 at 10:18

akrun


Great thanks! What does ~ -1 do? I got kinda confused there. – zipline86 Feb 07 '18 at 22:45
@zipline86 It is to remove the intercept column – akrun Feb 08 '18 at 02:32

great thanks for the help! – zipline86 Feb 08 '18 at 07:58
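The effect of `- 1` discussed in the comments above can be checked by comparing the column names with and without it. This is a small sketch, assuming the question's example data; the column name `f` is introduced here only for illustration.

```r
# Example data taken from the question, with the labels stored as a factor
df1 <- data.frame(y_labels = c(1, 4, 4, 3))
df1$f <- factor(df1$y_labels, levels = 1:4)

# Default formula: an "(Intercept)" column of 1s, and level 1 is absorbed
# into it as the reference level
with_intercept <- model.matrix(~ f, df1)

# "- 1" drops the intercept, so every level gets its own indicator column
without_intercept <- model.matrix(~ f - 1, df1)

colnames(with_intercept)     # "(Intercept)" "f2" "f3" "f4"
colnames(without_intercept)  # "f1" "f2" "f3" "f4"
```

Without the intercept, each row contains exactly one 1, which is the layout the question asks for.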

989 · Answer 2 · 2018-02-05T11:16:28.553

2

How about (where vec is your numeric vector):

m <- matrix(0, length(vec), max(vec))
m[cbind(seq_along(vec), vec)] <- 1

#    [,1] [,2] [,3] [,4]
#[1,]    1    0    0    0
#[2,]    0    0    0    1
#[3,]    0    0    0    1
#[4,]    0    0    1    0
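Because `max(vec)` determines the number of columns, the result shrinks if the largest class happens to be absent from the data. One possible way to guard against that is to pass the number of classes explicitly. The helper below is a hypothetical sketch: the names `one_hot` and `n_classes` are not from the answer, and it assumes the labels are positive whole numbers.

```r
# Hypothetical helper: indicator matrix with a fixed number of columns
one_hot <- function(labels, n_classes = max(labels)) {
  # Assumption: labels are whole numbers between 1 and n_classes
  stopifnot(all(labels == round(labels)),
            all(labels >= 1),
            all(labels <= n_classes))
  m <- matrix(0, nrow = length(labels), ncol = n_classes)
  # Each row of the two-column index matrix is one (row, column) cell to set
  m[cbind(seq_along(labels), labels)] <- 1
  m
}

one_hot(c(1, 4, 4, 3))              # 4 x 4, as in the question
one_hot(c(1, 2, 2), n_classes = 4)  # 3 x 4, columns 3 and 4 stay all zero
```

With a data frame, the call would be `one_hot(df$y_labels)`.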

edited Feb 05 '18 at 11:16

answered Feb 05 '18 at 10:23

989



Not sure if it's a use case for OP, but matrix might not be square. – moodymudskipper Feb 05 '18 at 11:10
just a side note, op states that their input is a data.frame, not an atomic vector. – talat Feb 05 '18 at 11:32
@docendodiscimus Yes, but they can use `df$y_labels` in place of `vec`. The title says a numeric vector, though. – 989 Feb 05 '18 at 12:17

score 2 · Accepted Answer · answered Feb 05 '18 at 10:27

2

Here's another option:

Start by creating a matrix of zeros:

m <- matrix(0, nrow = nrow(df), ncol = max(df$y_labels))

Then insert 1s at the correct positions:

m[col(m) == df$y_labels] <- 1

The result is:

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    0    0    1
[3,]    0    0    0    1
[4,]    0    0    1    0
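The comparison `col(m) == df$y_labels` works because R stores matrices column by column and recycles the shorter vector to match. The sketch below, which assumes the question's example data, writes that recycling out explicitly to show that row i is always compared with `df$y_labels[i]`.

```r
# Example data taken from the question
df <- data.frame(y_labels = c(1, 4, 4, 3))
m <- matrix(0, nrow = nrow(df), ncol = max(df$y_labels))

# col(m) holds the column number of every cell
col(m)

# The length-4 label vector is recycled down each column in turn
hit <- col(m) == df$y_labels

# Same comparison with the recycled labels built as a full matrix
labels_full <- matrix(df$y_labels, nrow = nrow(m), ncol = ncol(m))
explicit <- col(m) == labels_full

all(hit == explicit)  # TRUE
```

Note that this relies on the label vector having exactly one entry per row of `m`.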

answered Feb 05 '18 at 10:27

talat


ah ok cool, it's a little bit similar to the way I would do it in octave. Thanks! – zipline86 Feb 07 '18 at 22:44

moodymudskipper · Answer 4 · 2018-02-05T11:37:12.903

1

In base R:

df1 <- data.frame(y_labels = c(1,4,4,3))
t(sapply(df1$y_labels, function(x) c(rep(0, x - 1), 1, rep(0, max(df1$y_labels) - x))))

Or:

t(sapply(df1$y_labels, function(x) `[<-`(numeric(max(df1$y_labels)), x, 1)))

Output:

#      [,1] [,2] [,3] [,4]
# [1,]    1    0    0    0
# [2,]    0    0    0    1
# [3,]    0    0    0    1
# [4,]    0    0    1    0
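The backtick call in the second variant can look cryptic. As a rough illustration, with `n` and `x` chosen here only as example values, calling `[<-` as an ordinary function returns a modified copy of the vector, which is why it can be used inside an anonymous function:

```r
n <- 4  # number of classes (example value)
x <- 3  # one label (example value)

# Functional form of "v[x] <- 1": returns the modified copy
a <- `[<-`(numeric(n), x, 1)

# Equivalent step-by-step version
b <- numeric(n)
b[x] <- 1

identical(a, b)  # TRUE
```

`sapply()` binds these per-label vectors together as columns, hence the `t()` around it to get one row per label.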

edited Feb 05 '18 at 11:37

answered Feb 05 '18 at 11:08

moodymudskipper

