3

I have a column vector in a dataframe and would like to turn it into a binary matrix so I can do matrix multiplication with it later on.

y_labels
1
4
4
3

desired output

1 0 0 0
0 0 0 1
0 0 0 1
0 0 1 0

In Octave I would write y_matrix = (y_labels == [1 2 3 4]), but I can't figure out how to get the same result in R. Does anybody know how?

zipline86

4 Answers

3

We can use model.matrix to convert the column to a binary indicator matrix (the -1 drops the intercept so that every level gets its own column):

model.matrix(~ -1 + factor(y_labels, levels = 1:4), df1)

or with table

with(df1, table(1:nrow(df1), factor(y_labels, levels = 1:4)))
#    1 2 3 4
#  1 1 0 0 0
#  2 0 0 0 1
#  3 0 0 0 1
#  4 0 0 1 0

Or more compactly

+(sapply(1:4, `==`, df1$y_labels))
#      [,1] [,2] [,3] [,4]
#[1,]    1    0    0    0
#[2,]    0    0    0    1
#[3,]    0    0    0    1
#[4,]    0    0    1    0
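
For a self-contained run of the last variant, here is a minimal sketch. It assumes df1 is built as shown (the question only lists the column values), and the weight matrix w is purely hypothetical, there only to show that the result works in a matrix product. The outer() line is not part of the answer above; it is added as the closest base R analogue of the Octave broadcast comparison.

```r
# Assumed example data; the question only shows the column values.
df1 <- data.frame(y_labels = c(1, 4, 4, 3))

# sapply() compares the whole column against each class 1..4 and returns
# one logical column per class; unary + coerces TRUE/FALSE to 1/0.
y_matrix <- +(sapply(1:4, `==`, df1$y_labels))

# outer() builds the same comparison table in one call
# (rows = observations, columns = classes).
y_outer <- +(outer(df1$y_labels, 1:4, `==`))

# Hypothetical 4 x 2 weight matrix, just to show the product is defined.
w <- matrix(1:8, nrow = 4)
res <- y_matrix %*% w
res
```

Each row of res is simply the row of w selected by that observation's label.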
akrun
2

How about this, where vec is your numeric vector of labels:

m <- matrix(0, length(vec), max(vec))
m[cbind(seq_along(vec), vec)] <- 1

#    [,1] [,2] [,3] [,4]
#[1,]    1    0    0    0
#[2,]    0    0    0    1
#[3,]    0    0    0    1
#[4,]    0    0    1    0
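
To spell out what the assignment does, here is a commented sketch of the same two lines. The definition of vec is an assumption taken from the question's data, and idx is just an illustrative name for the intermediate index matrix.

```r
# Assumed input: the labels from the question as a plain numeric vector.
vec <- c(1, 4, 4, 3)

m <- matrix(0, nrow = length(vec), ncol = max(vec))

# Each row of idx is one (row, column) pair: observation i, class vec[i].
idx <- cbind(seq_along(vec), vec)

# Indexing a matrix with a two-column matrix addresses exactly those cells.
m[idx] <- 1
m
```

Note that ncol = max(vec) only covers classes up to the largest label present in the data; if the number of classes is known in advance, passing it explicitly is probably the safer choice.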
989
2

Here's another option:

Start by creating a matrix of zeros:

m <- matrix(0, nrow = nrow(df), ncol = max(df$y_labels))

Then insert 1s at the correct positions:

m[col(m) == df$y_labels] <- 1

The result is:

     [,1] [,2] [,3] [,4]
[1,]    1    0    0    0
[2,]    0    0    0    1
[3,]    0    0    0    1
[4,]    0    0    1    0
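
Why the comparison lines up can be seen in a small sketch. The data frame df is assumed to hold the question's values, and hit is an illustrative name for the logical mask.

```r
df <- data.frame(y_labels = c(1, 4, 4, 3))  # assumed example data
m <- matrix(0, nrow = nrow(df), ncol = max(df$y_labels))

# col(m) has the same shape as m and holds each cell's column number.
col(m)

# The label vector is recycled down each column, so cell (i, j) is
# compared with y_labels[i]; this relies on nrow(m) == length(y_labels).
hit <- col(m) == df$y_labels
m[hit] <- 1
m
```

Because the recycling runs column by column, the approach depends on the matrix having exactly one row per label, which the matrix() call above guarantees.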
talat
1

In base R:

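df1 <- data.frame(y_labels = c(1, 4, 4, 3))
# For each label x: x-1 zeros, a 1, then zeros up to the largest label.
# sapply() returns one column per observation, hence the t().
t(sapply(df1$y_labels, function(x) c(rep(0, x - 1), 1, rep(0, max(df1$y_labels) - x))))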

or

t(sapply(df1$y_labels,function(x) `[<-`(numeric(max(df1$y_labels)),x,1)))

output:

#      [,1] [,2] [,3] [,4]
# [1,]    1    0    0    0
# [2,]    0    0    0    1
# [3,]    0    0    0    1
# [4,]    0    0    1    0
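
The role of t() in both variants may be easier to see when the steps are separated. This is a sketch only: it swaps the `[<-` call for base R's replace(), which is not what the answer uses, and k and cols are illustrative names.

```r
df1 <- data.frame(y_labels = c(1, 4, 4, 3))
k <- max(df1$y_labels)  # number of columns, taken from the largest label

# One indicator vector of length k per label; sapply() binds these
# vectors together as columns, so cols[, i] belongs to observation i.
cols <- sapply(df1$y_labels, function(x) replace(numeric(k), x, 1))

# Transpose so that rows are observations and columns are classes.
res <- t(cols)
res
```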
moodymudskipper