I have a matrix with binary data representing whether each column field is relevant to each row element. I'm looking to create a two column dataframe identifying the name of each field associated with each row. How can I do this in R?

Here is an example of what I'm starting with:

   A B C
 W 1 1 0
 X 0 1 1
 Y 1 1 1
 Z 0 1 1

And I'm looking to end up with this:

Element | Relevant Field
       W|A
       W|B
       X|B
       X|C
       Y|A
       Y|B
       Y|C
       Z|B
       Z|C

Any hints? Thanks!

M--
user3786999
  • What type of object is your starting value? A named matrix? A proper [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) where we could paste the value into R would be helpful. – MrFlick Apr 03 '17 at 18:21

3 Answers

If your starting value is a matrix like this

mm <- matrix(c(1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L), 
  ncol=3, dimnames = list(c("W", "X", "Y", "Z"), c("A", "B", "C")))

You can treat it like a table and unroll the data fairly easily:

subset(as.data.frame(as.table(mm)), Freq>0)
#    Var1 Var2 Freq
# 1     W    A    1
# 3     Y    A    1
# 5     W    B    1
# 6     X    B    1
# 7     Y    B    1
# 8     Z    B    1
# 10    X    C    1
# 11    Y    C    1
# 12    Z    C    1
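
The result above still carries the `Freq` column and the default `Var1`/`Var2` names, and it is ordered column by column. Here is a minimal sketch of tidying it into the two-column layout asked for in the question. The names `long`, `res`, `Element` and `Relevant_Field` are picked here for illustration only.

```r
mm <- matrix(c(1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L),
  ncol = 3, dimnames = list(c("W", "X", "Y", "Z"), c("A", "B", "C")))

# unroll to long form and keep only the non-zero cells
long <- subset(as.data.frame(as.table(mm)), Freq > 0)

# drop Freq and rename the two remaining columns (illustrative names)
res <- setNames(long[, c("Var1", "Var2")], c("Element", "Relevant_Field"))

# sort by element, then by field, and reset the row names
res <- res[order(res$Element, res$Relevant_Field), ]
rownames(res) <- NULL
res
```

Sorting is optional. It only matters if you want the rows grouped by element, as in the question's expected output.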
MrFlick

We can use base R methods

data.frame(Element = rep(rownames(m1), each = ncol(m1)),
    Relevant_Field = rep(colnames(m1), nrow(m1)))[as.vector(t(m1))!=0,]
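
To make the one-liner above easier to follow, here is the same idea split into steps. This is a sketch that assumes `m1` is the example matrix from the question (the answer does not define it). `allPairs`, `keep` and `out` are illustrative names.

```r
# assumption: m1 holds the question's example data
m1 <- matrix(c(1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L),
  ncol = 3, dimnames = list(c("W", "X", "Y", "Z"), c("A", "B", "C")))

# every (row, column) pair, listed row by row: W-A, W-B, W-C, X-A, ...
allPairs <- data.frame(Element = rep(rownames(m1), each = ncol(m1)),
  Relevant_Field = rep(colnames(m1), nrow(m1)))

# as.vector() reads a matrix column by column, so transpose first
# to get the values in the same row-by-row order as allPairs
keep <- as.vector(t(m1)) != 0

out <- allPairs[keep, ]
out
```

The transpose is the key step: without `t()`, the logical vector would follow column order and select the wrong pairs.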

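Or with CJ from data.table (note that CJ sorts its result by default, so this indexing relies on the row and column names already being in sorted order, as they are here)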

library(data.table)
CJ(Element = row.names(m1), Relevant_Field = colnames(m1))[as.vector(t(m1)!=0)]
#    Element Relevant_Field
#1:       W              A
#2:       W              B
#3:       X              B
#4:       X              C
#5:       Y              A
#6:       Y              B
#7:       Y              C
#8:       Z              B
#9:       Z              C

Or as @Frank suggested, we can melt (using reshape2) to a three column dataset, convert to data.table and remove the 0 values

library(reshape2)
setDT(melt(m1))[ value == 1 ][, value := NULL][]
akrun
  • The data.table way would probably be to call reshape2::melt: `setDT(melt(mm))[ value == 1 ]` – Frank Apr 03 '17 at 18:36

Here is another base R method that uses which and subsetting.

# get the (row, column) positions of the 1s in the matrix
posMat <- which(mm==1, arr.ind=TRUE)

# build the data.frame
myDf <- data.frame(rowVals=rownames(mm)[posMat[, 1]],
                  colVals=colnames(mm)[posMat[, 2]])

or other structures...

# matrix
myMat <- cbind(rowVals=rownames(mm)[posMat[, 1]],
               colVals=colnames(mm)[posMat[, 2]])

# vector with pipe separator
myVec <- paste(rownames(mm)[posMat[, 1]], colnames(mm)[posMat[, 2]], sep="|")
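
Note that `which(..., arr.ind=TRUE)` reports positions column by column, so the structures above list all matches for A first, then B, then C. If you want the rows grouped by element as in the question, one option is to sort the positions first. This is a sketch; the column names `Element` and `Relevant_Field` are chosen here to match the question's header.

```r
mm <- matrix(c(1L, 0L, 1L, 0L, 1L, 1L, 1L, 1L, 0L, 1L, 1L, 1L),
  ncol = 3, dimnames = list(c("W", "X", "Y", "Z"), c("A", "B", "C")))

# positions of the 1s; the result has columns named "row" and "col"
posMat <- which(mm == 1, arr.ind = TRUE)

# reorder by row index first, then by column index
posMat <- posMat[order(posMat[, "row"], posMat[, "col"]), ]

# build the data.frame in row-by-row order
myDf <- data.frame(Element = rownames(mm)[posMat[, "row"]],
                   Relevant_Field = colnames(mm)[posMat[, "col"]])
myDf
```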
lmo