Collapse matrix to vector and replace values with column names

Question

There is a matrix like this:

m <- matrix(c(F,T,F,T,F,T,F,T,F, F,F,T,F,T,F,T,F,T), nrow=9, ncol=2,
           dimnames=list(c(), c('1x2x24','2x2x24')))

     1x2x24 2x2x24
[1,]  FALSE  FALSE
[2,]   TRUE  FALSE
[3,]  FALSE   TRUE
[4,]   TRUE  FALSE
[5,]  FALSE   TRUE
[6,]   TRUE  FALSE
[7,]  FALSE   TRUE
[8,]   TRUE  FALSE
[9,]  FALSE   TRUE

Presumably there is only a single TRUE per row. What would be the best way to get a vector like this:

NA, 1x2x24, 2x2x24, 1x2x24, 2x2x24, 1x2x24, 2x2x24, 1x2x24, 2x2x24

One way to get it is to replace each TRUE in each column with its column name and each FALSE with NA or "", and then merge the columns row by row with paste(). I'm not quite sure how to do this. Any help is highly appreciated.

[For each row return the column name of the largest value](https://stackoverflow.com/questions/17735859/for-each-row-return-the-column-name-of-the-largest-value). Use `rowSums` to find rows with two `FALSE`. — Henrik, Dec 12 '18 at 21:29
*"Presumably there is only a single TRUE per row."* Are you trying to reverse one-hotting a categorical or int? — smci, Dec 12 '18 at 21:54
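
Henrik's suggestion could be sketched in base R roughly as follows. This is only an illustration, assuming at most one TRUE per row; `res` is an illustrative name, and `m + 0` is used here simply to hand `max.col` a numeric matrix.

```r
m <- matrix(c(F,T,F,T,F,T,F,T,F, F,F,T,F,T,F,T,F,T), nrow = 9, ncol = 2,
            dimnames = list(NULL, c("1x2x24", "2x2x24")))

# Column holding the largest value in each row (TRUE counts as 1, FALSE as 0);
# ties.method = "first" makes all-FALSE rows pick column 1 deterministically
res <- colnames(m)[max.col(m + 0, ties.method = "first")]

# Rows without any TRUE have no valid answer, so set them to NA
res[rowSums(m) == 0] <- NA
res
```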

G. Grothendieck · Accepted Answer · 2018-12-12T22:16:19.097

Using the input matrix m defined reproducibly in the Note at the end, matrix multiply it by 1:2 (that is, take the inner product of each row with 1:2). This gives a vector with one element per row: 0 if there are no TRUE values in that row, 1 if there is a TRUE value in the first column of that row, and 2 if there is a TRUE value in the second column of that row. Add 1 to that and use it to index into a vector of length 3 whose first element is NA and whose remaining elements are the column names.

c(NA, colnames(m))[m %*% 1:2 + 1]
## [1] NA       "1x2x24" "2x2x24" "1x2x24" "2x2x24" "1x2x24" "2x2x24" "1x2x24"
## [9] "2x2x24"

Alternatively, we can use the same computation to define a factor, which we then convert to a character vector:

as.character(factor(m %*% 1:2 + 1, labels = c(NA, colnames(m))))

If it turns out that a row can have two TRUE values, then the computation gives 4 for such rows, so just replace c(NA, colnames(m)) with c(NA, colnames(m), "both"), say.
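
As a small sketch of the two-TRUE case, the example below appends a hypothetical tenth row with both columns TRUE. The names `m2`, `idx` and `res2` are only illustrative.

```r
m <- matrix(c(F,T,F,T,F,T,F,T,F, F,F,T,F,T,F,T,F,T), nrow = 9, ncol = 2,
            dimnames = list(NULL, c("1x2x24", "2x2x24")))

# Hypothetical extra row in which both columns are TRUE
m2 <- rbind(m, c(TRUE, TRUE))

# 1 = no TRUE, 2 = first column, 3 = second column, 4 = both columns
idx <- drop(m2 %*% 1:2) + 1

res2 <- c(NA, colnames(m2), "both")[idx]
res2
```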

Note

m <- structure(c(FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, 
FALSE, FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE
), .Dim = c(9L, 2L), .Dimnames = list(c("[1,]", "[2,]", "[3,]", 
"[4,]", "[5,]", "[6,]", "[7,]", "[8,]", "[9,]"), c("1x2x24", 
"2x2x24")))

Use `ncol(m)` instead of 2 if your input matrix does not have 2 columns. — G. Grothendieck, Dec 13 '18 at 22:26
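
A minimal sketch of that generalisation, assuming at most one TRUE per row (with more columns, several TRUE values in a row would give ambiguous sums). The third column `3x2x24` and the names `m3`, `k` and `res3` are made up for illustration.

```r
m <- matrix(c(F,T,F,T,F,T,F,T,F, F,F,T,F,T,F,T,F,T), nrow = 9, ncol = 2,
            dimnames = list(NULL, c("1x2x24", "2x2x24")))

# Hypothetical third column that is TRUE only in the first row
m3 <- cbind(m, "3x2x24" = c(TRUE, rep(FALSE, 8)))

# The trick relies on at most one TRUE per row, so check that first
stopifnot(rowSums(m3) <= 1)

# Weights 1, 2, ..., ncol(m3) replace the hard-coded 1:2
k <- seq_len(ncol(m3))
res3 <- c(NA, colnames(m3))[drop(m3 %*% k) + 1]
res3
```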

Dylan_Gomes · Answer 2 · 2018-12-12T22:24:34.837

I used the function coalesce from the package dplyr once all the FALSE values were turned into NAs with an ifelse function. This isn't the cleanest way to do it, but it works. For completeness: If dplyr is not yet installed you will need to run install.packages("dplyr") first.

library(dplyr)

X <- matrix(c(F,T,F,T,F,T,F,T,F, F,F,T,F,T,F,T,F,T), nrow=9, ncol=2,
            dimnames=list(c(), c('1x2x24','2x2x24')))
X
#       1x2x24 2x2x24
#  [1,]  FALSE  FALSE
#  [2,]   TRUE  FALSE
#  [3,]  FALSE   TRUE
#  [4,]   TRUE  FALSE
#  [5,]  FALSE   TRUE
#  [6,]   TRUE  FALSE
#  [7,]  FALSE   TRUE
#  [8,]   TRUE  FALSE
#  [9,]  FALSE   TRUE

# ifelse() turns each TRUE into the column name and each FALSE into NA
y <- ifelse(X[, 1], colnames(X)[1], NA_character_)
y
# [1] NA       "1x2x24" NA       "1x2x24" NA       "1x2x24" NA       "1x2x24" NA

z <- ifelse(X[, 2], colnames(X)[2], NA_character_)
z
# [1] NA       NA       "2x2x24" NA       "2x2x24" NA       "2x2x24" NA       "2x2x24"

# coalesce() keeps the first non-NA value at each position
coalesce(y, z)
# [1] NA       "1x2x24" "2x2x24" "1x2x24" "2x2x24" "1x2x24" "2x2x24" "1x2x24" "2x2x24"
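
If installing dplyr is not an option, the same coalescing idea might be sketched in base R as below. Note that `coalesce2` is a hypothetical two-argument helper written for this example, not a dplyr or base function, and `lab` and `res` are illustrative names.

```r
X <- matrix(c(F,T,F,T,F,T,F,T,F, F,F,T,F,T,F,T,F,T), nrow = 9, ncol = 2,
            dimnames = list(NULL, c("1x2x24", "2x2x24")))

# Character matrix: the column name where X is TRUE, NA elsewhere
lab <- ifelse(X, colnames(X)[col(X)], NA_character_)

# Hypothetical helper: take b wherever a is NA, otherwise keep a
coalesce2 <- function(a, b) ifelse(is.na(a), b, a)

# Split the matrix into its columns and fold the helper over them
res <- Reduce(coalesce2, split(lab, col(lab)))
res
```

Because `Reduce` folds over however many columns there are, this sketch should not need changes for wider matrices.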

score 1 · Answer 3 · answered Dec 12 '18 at 22:10

This should work

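C1 <- c(FALSE,FALSE,FALSE,FALSE,
        TRUE,FALSE,FALSE,TRUE,
        TRUE,FALSE,FALSE,TRUE,
        TRUE,FALSE,FALSE,TRUE,
        TRUE,FALSE,FALSE,TRUE)
print(C1)
m <- matrix(C1, ncol = 2)
colnames(m) <- c("1x2x24", "2x2x24")
# For each row, paste together the names of the columns that are TRUE
# (the separator only matters if a row has more than one TRUE)
vector_result <- apply(m, 1,
  function(u) paste(names(which(u)), collapse = ","))
# Rows with no TRUE give "", so replace those with a real NA
idx <- which(vector_result == "")
vector_result[idx] <- NA
print(vector_result)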

dww · Answer 4 · 2018-12-13T14:01:02.387

We can use

apply(m,1,function(x) names(which(x))[1])
# [1,]     [2,]     [3,]     [4,]     [5,]     [6,]     [7,]     [8,]     [9,]  
#   NA "1x2x24" "2x2x24" "1x2x24" "2x2x24" "1x2x24" "2x2x24" "1x2x24" "2x2x24"
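
For larger matrices, a loop-free variant could look like the sketch below. It is not dww's method but a swapped-in alternative based on `which(..., arr.ind = TRUE)`; `hits` and `res` are illustrative names. If a row had several TRUE values, the highest-numbered column would win here, whereas the `apply` version above keeps the first.

```r
m <- matrix(c(F,T,F,T,F,T,F,T,F, F,F,T,F,T,F,T,F,T), nrow = 9, ncol = 2,
            dimnames = list(NULL, c("1x2x24", "2x2x24")))

# One (row, col) pair for every TRUE cell
hits <- which(m, arr.ind = TRUE)

# Start with NA everywhere, then fill in the rows that have a TRUE
res <- rep(NA_character_, nrow(m))
res[hits[, "row"]] <- colnames(m)[hits[, "col"]]
res
```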

edited Dec 13 '18 at 14:01

answered Dec 12 '18 at 22:10

@G.Grothendieck, if you could parameterize your answer so as to remove the hard-coded values (i.e. 1:2 + 1), then I think it would be comparable to this one. – Dimon Dec 13 '18 at 18:01
