Merge vectors with different length into a matrix one after the other

Question

Let's assume that we have the following three vectors.

L = c("1", "3")
K = c("2", "9", "2:9")
S = c("7")

Is there any way to combine them into a matrix that looks like the one below?

L    K    S
1    0    0
3    0    0
0    2    0
0    9    0
0    2:9  0
0    0    7

Thank you.

Lamia · Answer 1 · 2017-07-09T20:01:20.573

4

Here's another way to do it: first create a matrix of 0s, then use two index vectors, 1:l for the rows and k for the columns, to write your values into the right locations.

l = length(c(L,K,S))
k = rep(1:3,times=c(length(L),length(K),length(S)))
m = matrix(0,ncol=3,nrow=l)
m[cbind(1:l,k)] = c(L,K,S)
m
     [,1] [,2]  [,3]
[1,] "1"  "0"   "0" 
[2,] "3"  "0"   "0" 
[3,] "0"  "2"   "0" 
[4,] "0"  "9"   "0" 
[5,] "0"  "2:9" "0" 
[6,] "0"  "0"   "7"
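
For readers who have not seen matrix indexing with a two-column index matrix before, here is a minimal sketch of the mechanism on toy values (the numbers and the names idx and mc are illustrative, not from the question). Each row of the index matrix is one (row, column) coordinate, and one value is assigned per coordinate. The second part shows why the zeros in the output above are quoted: assigning character values into a numeric matrix coerces the whole matrix to character.

```r
# Toy example; values are illustrative only.
m <- matrix(0, nrow = 3, ncol = 2)

# Each row of idx is one (row, column) coordinate.
idx <- cbind(c(1, 2, 3), c(1, 2, 2))

# One value per coordinate: m[1, 1], m[2, 2], m[3, 2].
m[idx] <- c(10, 20, 30)
m
#      [,1] [,2]
# [1,]   10    0
# [2,]    0   20
# [3,]    0   30

# Assigning strings into a numeric matrix coerces every cell to character.
mc <- matrix(0, nrow = 2, ncol = 2)
mc[cbind(1:2, 1:2)] <- c("a", "b")
is.character(mc)
# [1] TRUE
```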

Edit: For a version that generalizes better to a larger number of input vectors, as per @DavidArenburg's comment, you could do:

l = list(L,K,S)
len = length(unlist(l))
k = rep(seq_along(l), lengths(l))
m = matrix(0, nrow=len, ncol=length(l))
m[cbind(1:len, k)] = unlist(l)
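
If you want to reuse this, one possible way to wrap it is sketched below. The function name stack_vectors and the fill argument are my own (hypothetical) choices, not part of the original answer. It assumes a non-empty list of vectors and, when the list is named, carries the names over as column names.

```r
# Hypothetical helper; the name and the `fill` argument are illustrative.
stack_vectors <- function(vecs, fill = "0") {
  n <- lengths(vecs)                       # length of each input vector
  total <- sum(n)                          # number of rows in the result
  m <- matrix(fill, nrow = total, ncol = length(vecs),
              dimnames = list(NULL, names(vecs)))
  col_idx <- rep(seq_along(vecs), times = n)   # target column of each value
  m[cbind(seq_len(total), col_idx)] <- unlist(vecs, use.names = FALSE)
  m
}

L <- c("1", "3")
K <- c("2", "9", "2:9")
S <- c("7")
res <- stack_vectors(list(L = L, K = K, S = S))
res
```

Starting from a matrix filled with "0" rather than 0 keeps everything character from the start, so the result does not depend on coercion during assignment.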

edited Jul 09 '17 at 20:01

answered Jul 09 '17 at 19:38

Lamia


@DavidArenburg I agree, your version generalizes better for a larger number of input vectors. I'll edit my answer to include your comments. – Lamia Jul 09 '17 at 19:57
@all. Thank you very much, but should I prefer one of them over the others in terms of efficiency and running time? – J. Doe Jul 09 '17 at 20:45

Sotos · Accepted Answer · 2017-07-11T07:27:49.160

Here is an idea. First we create a list with all the vectors, and then a matrix whose number of rows equals the total number of elements across the vectors and whose number of columns equals the number of vectors. Because the stacked values are recycled, every column initially holds all of them. We then use mapply to match the elements of the first vector (of our list) against the first column of the matrix, and likewise the second and third vectors against the second and third columns. Finally, we use the result (a logical matrix) to set all the remaining unmatched elements of the matrix to 0.

l1 <- list(L, K, S)
m1 <- matrix(unlist(l1), nrow = sum(lengths(l1)), ncol = length(l1))
m1[!mapply(`%in%`, as.data.frame(m1), l1)] <- 0

m1
#     [,1] [,2]  [,3]
#[1,] "1"  "0"   "0" 
#[2,] "3"  "0"   "0" 
#[3,] "0"  "2"   "0" 
#[4,] "0"  "9"   "0" 
#[5,] "0"  "2:9" "0" 
#[6,] "0"  "0"   "7"
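
In case the one-liner is hard to follow, here is a sketch that breaks the same steps apart so the intermediate objects can be inspected. The name keep is only illustrative; everything else follows the code above.

```r
L <- c("1", "3"); K <- c("2", "9", "2:9"); S <- c("7")
l1 <- list(L, K, S)

# Step 1: the six stacked values are recycled into every column.
m1 <- matrix(unlist(l1), nrow = sum(lengths(l1)), ncol = length(l1))
before <- m1[, 1]      # "1" "3" "2" "9" "2:9" "7", identical in each column

# Step 2: column i is compared with vector i only.
keep <- mapply(`%in%`, as.data.frame(m1), l1)
keep                   # logical 6 x 3 matrix, TRUE where the value belongs

# Step 3: everything that did not match becomes "0".
m1[!keep] <- 0
m1
```

Note that the matching is done by value, so a value that occurs in more than one vector would be kept in more than one place.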

To address your comment and extend this to also work when the same value appears in multiple vectors, we follow the same logic but apply it to the row indices of each column of the matrix, based on the cumulative position of each vector's elements. Since this gets a bit more complicated, we can put it all in a function that accepts a list of the vectors as input:

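create_mat <- function(vecs){
  # Stack all values; they are recycled into every column
  m1 <- matrix(unlist(vecs), nrow = sum(lengths(vecs)), ncol = length(vecs))
  # Row numbers, repeated in every column
  m2 <- matrix(seq_len(nrow(m1)), ncol = ncol(m1), nrow = nrow(m1))
  # Rows that belong to each vector: 1..n shifted by the lengths before it
  l2 <- lapply(lengths(vecs), seq_len)
  v2 <- c(0, head(cumsum(lengths(vecs)), -1))
  l2 <- Map(`+`, l2, v2)
  # Zero out every cell whose row does not belong to that column's vector
  m1[!mapply(`%in%`, as.data.frame(m2), l2)] <- 0
  return(m1)
}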

# Test with some values shared by multiple vectors
M = c("1", "3")
N = c("1", "9", "2:9")
P = c("3")

create_mat(list(M, N, P))

#     [,1] [,2]  [,3]
#[1,] "1"  "0"   "0" 
#[2,] "3"  "0"   "0" 
#[3,] "0"  "1"   "0" 
#[4,] "0"  "9"   "0" 
#[5,] "0"  "2:9" "0" 
#[6,] "0"  "0"   "3"
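
To make the index logic a bit more concrete, here is a small sketch of the row indices computed for the test vectors above, next to what plain value matching would have flagged. The names vecs, offsets, rows and by_value are illustrative only.

```r
M <- c("1", "3"); N <- c("1", "9", "2:9"); P <- c("3")
vecs <- list(M, N, P)

# Number of rows occupied by the vectors that come before each one.
offsets <- c(0, head(cumsum(lengths(vecs)), -1))
offsets
# [1] 0 2 5

# Rows owned by each vector: 1:2, 3:5 and 6.
rows <- Map(`+`, lapply(lengths(vecs), seq_len), offsets)
rows

# Matching by value instead would also claim rows 3 and 6 for the
# first column, because "1" and "3" occur again in N and P.
by_value <- unlist(vecs) %in% M
by_value
# [1]  TRUE  TRUE  TRUE FALSE FALSE  TRUE
```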

Sorry for this question; it might belong in a different thread, but I'm not sure. I did what you wrote and it's working perfectly. But now I want to add something more. Is there a way to take this output and create a new matrix with all possible combinations of the rows of each instance (L, K, S), taken two at a time: (L, K), (L, S), (K, S)? E.g. something like [this](https://pastebin.com/cLPPB04b). Actually, I tried this, but it adds a lot of replicates of the same thing: `g = as.list(data.frame(m1)); expand.grid(g)` – J. Doe, Jul 10 '17 at 18:26
I just realized that if the initial vectors are the same, your code doesn't seem to work as expected. Can you think of a fix for this? – J. Doe, Jul 10 '17 at 19:19
Thank you very much. I have run into another problem, though, which has to do with resources. I realized that if, for example, we have two list items inside the main list, where the first one has only one element `c("0")` and the other one has around 16000 elements `c(1:16000)`, executing this function drains the system's memory and R gets a fatal error. Do you think there is a way to make it more resource-friendly? P.S. I'm running it on a PC with 4GB RAM – J. Doe, Jul 19 '17 at 07:48

score 1 · Answer 3 · answered Jul 09 '17 at 19:26

library(dplyr)

L = c("1", "3")
K = c("2", "9", "2:9")
S = c("7")

Ltable <- tibble(L=L)
Ktable <- tibble(K=K)
Stable <- tibble(S=S)
JoinedMatrix <- Ltable %>% bind_rows(Ktable) %>% bind_rows(Stable) %>% as.matrix() 
JoinedMatrix[which(is.na(JoinedMatrix))] <- "0"

bind_rows from dplyr allows you to bind dataframes (or tibbles) together. Since the three tibbles have differently named columns, they are kept as different columns after joining, filling the missing fields with NA. After that we simply replace all NAs with 0 and we're done.

> JoinedMatrix
     L   K     S  
[1,] "1" "0"   "0"
[2,] "3" "0"   "0"
[3,] "0" "2"   "0"
[4,] "0" "9"   "0"
[5,] "0" "2:9" "0"
[6,] "0" "0"   "7"
