I have a large dataset (2.8 million rows x 4 columns) in R that I'm trying to reshape from long to wide (a "transpose"). I was attempting to use reshape2::dcast to do this, but it runs out of memory.
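For concreteness, here is a toy version of what I'm doing. The two measure columns ("spend" and "visits") are placeholders; only memberID and merchantID are the real names. I've cast on merchantID + variable so each cell is unique, which I believe matches the intent:

library(reshape2)

# toy stand-in for the real data: 4 columns, two IDs plus two measures
df <- data.frame(
  memberID   = c(1, 1, 2, 2),
  merchantID = c("A", "B", "A", "C"),
  spend      = c(10, 20, 30, 40),
  visits     = c(1, 2, 3, 4)
)

long <- melt(df, id = c("memberID", "merchantID"))
# one row per member, one column per merchantID/measure combination
wide <- dcast(long, memberID ~ merchantID + variable)

This works on toy data but exhausts memory at 2.8 million rows.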
Question 1: Is there a better way to do the transpose?
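One alternative I've been considering (I'm not sure it's the right direction) is data.table, whose dcast is supposed to be more memory-efficient and can cast several value columns at once (this needs a reasonably recent data.table; "spend" and "visits" are the placeholder names from the toy example above):

library(data.table)

dt <- as.data.table(df)  # df as in the toy example above
# casts both measure columns in one pass, producing columns
# like spend_A, visits_A, spend_B, ...
wide <- dcast(dt, memberID ~ merchantID,
              value.var = c("spend", "visits"))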
Question 2: As a fallback, I am chopping the dataset into pieces, transposing each piece, and then reassembling them. However, I'm stuck on the reassembly step: rbind (which my code below uses) requires every piece to have identical columns, and I don't know the full set of merchantID columns in advance. Is there a clever way around this issue? (See the rbindlist sketch after the code.)
library(reshape2)

bigtranspose <- function(dataset){
  n <- nrow(dataset)
  finaldataset <- NULL
  i <- 1
  while (i <= n){
    # take 10 rows at a time and transpose them
    UB <- min(i + 9, n)   # i:(i+10) was 11 rows; i+9 gives 10
    small <- dataset[i:UB, ]
    smallmelt <- melt(small, id = c("memberID", "merchantID"))
    # na.rm is forwarded to the aggregation function, so name one
    # explicitly (sum here) instead of relying on the default;
    # "wide" rather than "t" avoids masking base::t()
    wide <- dcast(smallmelt, memberID ~ merchantID,
                  fun.aggregate = sum, na.rm = TRUE)
    # stack the results together -- this is where it breaks:
    # rbind needs every piece to have identical columns, i.e.
    # the same set of merchantIDs (Question 2)
    if (is.null(finaldataset))
      finaldataset <- wide
    else
      finaldataset <- rbind(finaldataset, wide)
    i <- i + 10
  }
  finaldataset   # the original version never returned the result
}
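For Question 2, I'm wondering whether data.table::rbindlist with fill = TRUE (or plyr::rbind.fill) would sidestep the column-matching problem, since it matches columns by name and pads the gaps with NA. A sketch of the reassembly step under that assumption:

library(data.table)

# two pieces with different merchantID columns, as the loop produces
piece1 <- data.frame(memberID = 1, A = 10, B = 20)
piece2 <- data.frame(memberID = 2, A = 30, C = 40)

# fill = TRUE pads missing columns with NA, so the full set of
# merchantIDs doesn't need to be known in advance
finaldataset <- rbindlist(list(piece1, piece2), fill = TRUE)

One wrinkle: if a memberID spans two chunks it will end up on two rows, so I'd still need to chunk on memberID boundaries or do a final aggregation per member.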