I asked the question "Pasting columns of matrix with a specific range in R", and this method was suggested:

  g <- rep(1:2, each = 3)
  t(apply(a, 1, tapply, g, paste, collapse = " "))
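
For reference, here is a minimal reproducible sketch of what that call does. The 2 x 6 integer matrix `a` is an assumption made up for illustration; it is not my real data.

```r
# Assumed toy input: 2 rows, 6 columns, filled column by column.
a <- matrix(1:12, nrow = 2)

# Columns 1-3 belong to group 1, columns 4-6 to group 2.
g <- rep(1:2, each = 3)

# For each row, paste the values within each column group together.
res <- t(apply(a, 1, tapply, g, paste, collapse = " "))

print(res)
```

Each row of `res` holds one pasted string per column group.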

Unfortunately, this method is too slow, and I am looking for a faster and more flexible way.

In fact, if there is no other way, I am going to write it as a function in C++ and call it from R.

user1436187
  • why don't you comment on the size of your real matrix and what would be an acceptable time? – flodel Nov 08 '13 at 01:09
  • The matrix is 5000 by 40000! – user1436187 Nov 08 '13 at 01:12
  • Do you have enough RAM? It's going to take 8-12 GB to work with an object of that size, more if the character elements are large. – IRTFM Nov 08 '13 at 01:15
  • 40000 is not divisible by 3, are you also using 3 in your real situation? – flodel Nov 08 '13 at 01:17
  • Yes, then I should ignore the last column. I have ample amount of memory, the memory is not an issue. – user1436187 Nov 08 '13 at 02:20
  • @user1436187: I saw in a comment of yours -at andrei 's answer- that you look for a "rolling paste". You can try something like: `library(zoo)` ; `a=matrix(1:30,5)` ; `t(rollapply(1:ncol(a), width = 3, function(x) paste(a[,x[1]], a[,x[2]], a[,x[3]])))`. – alexis_laz Nov 08 '13 at 13:58

1 Answer

I've been using this function to paste columns of a data frame:

# Paste the named columns of mydf together row by row, separated by sepChar.
pasteDFcol <- function(mydf, clmnnames = c("V1", "V2"), sepChar = " ") {
    do.call("paste", c(mydf[clmnnames], sep = sepChar))
}

You can pass your matrix using as.data.frame, and specify which columns you want to join, e.g.

a <- matrix(1:600000, ncol = 600)
a.df <- as.data.frame(a)  # convert once; columns are named V1..V600
a1.df <- data.frame(V1 = pasteDFcol(a.df, clmnnames = paste0("V", 1:300)),
                    V2 = pasteDFcol(a.df, clmnnames = paste0("V", 301:600)))
a2 <- as.matrix(a1.df)

It is about twice as fast as the method you were using.
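
If you want to check the speed on your own machine, a rough timing sketch could look like the following. It assumes the same 1000 x 600 integer matrix as above and uses base `system.time`; the timings depend on hardware and data, so treat them only as a guide.

```r
# Same helper as above, repeated so this block runs on its own.
pasteDFcol <- function(mydf, clmnnames = c("V1", "V2"), sepChar = " ") {
    do.call("paste", c(mydf[clmnnames], sep = sepChar))
}

a <- matrix(1:600000, ncol = 600)
g <- rep(1:2, each = 300)  # two groups of 300 columns each

# Method from the question: apply + tapply over every row.
t_apply <- system.time(
    r1 <- t(apply(a, 1, tapply, g, paste, collapse = " "))
)

# do.call("paste", ...) on data frame columns.
t_docall <- system.time({
    a.df <- as.data.frame(a)
    r2 <- cbind(pasteDFcol(a.df, clmnnames = paste0("V", 1:300)),
                pasteDFcol(a.df, clmnnames = paste0("V", 301:600)))
})

# Both methods should produce the same strings.
stopifnot(identical(unname(r1), unname(r2)))

print(t_apply)
print(t_docall)
```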

For pasting columns 1:4, 5:8, ... or any other window of columns, let's modify the function to take the starting column and the number of columns to paste as arguments, then use sapply over the starting columns.

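# Paste clmNum consecutive columns of mydf, starting at column clmStart.
# Assumes the default column names V1, V2, ... produced by as.data.frame().
pasteDFcol <- function(clmStart, clmNum = 4, mydf, sepChar = " ") {
    do.call("paste", c(mydf[paste0("V", clmStart:(clmStart + clmNum - 1))], sep = sepChar))
}
a <- matrix(1:400, ncol = 40)
a.df <- as.data.frame(a)  # convert once and reuse
pasteDFcol(clmStart = 1, clmNum = 4, mydf = a.df)
# One output column per block of 4 input columns.
a1 <- sapply(seq(1, 40, by = 4), pasteDFcol, clmNum = 4, mydf = a.df)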
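
When the number of columns is not a multiple of the block width (as with 40000 columns and a width of 3), the leftover columns have to be dropped. A minimal sketch of one way to do that directly on the matrix, without relying on the `V` column names, is below. `pasteBlocks` is a hypothetical helper name, not part of any package, and I have not timed it on a matrix of your size.

```r
# Hypothetical helper: paste consecutive blocks of `width` columns of a
# matrix, ignoring any leftover columns at the end.
pasteBlocks <- function(m, width = 3, sepChar = " ") {
    nBlocks <- ncol(m) %/% width                       # complete blocks only
    starts <- seq(1, by = width, length.out = nBlocks) # first column of each block
    out <- vapply(starts, function(s) {
        # Collect the block's columns as a list and paste them element-wise.
        cols <- lapply(s:(s + width - 1), function(j) m[, j])
        do.call(paste, c(cols, sep = sepChar))
    }, character(nrow(m)))
    # Keep a matrix shape even when m has a single row.
    matrix(out, nrow = nrow(m))
}

a <- matrix(1:400, ncol = 40)
a1 <- pasteBlocks(a, width = 3)  # uses columns 1..39, drops column 40
dim(a1)
```

Replacing `starts` with `seq_len(ncol(m) - width + 1)` would give overlapping (rolling) windows instead of consecutive blocks.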
ndr