I need to repeatedly add a vector to a matrix. Both take on different lengths every time I do this. The complete matrix is then used for further analysis (plotting, t-test). Three months ago this code worked:

    mlen <- max(length(matrix), length(vector))
    length(matrix) <- length(vector) <- mlen
    matrix <- cbind(matrix, vector)

I don't use any specific packages for that. The data input is unchanged: a csv file. Now I have either of the following issues:

a) The unequal-length handling doesn't work properly anymore. For example, if the new vector has 970 'rows' but the longest column in the existing matrix has only 270 rows, then the remaining rows of the added vector just get cut off. The warning message is: `In function (..., deparse.level = 1) : number of rows of result is not a multiple of vector length (arg 2)`. This doesn't always happen.

b) The values of the added vector get placed in empty cells at the bottom of an existing column in the matrix.

Both seriously screw up my further analysis. I have tried to use do.call(cbind...) as suggested here, merge, or append. Nothing produces the output I need, which is a matrix with 1 column per vector without any data loss or mixing.

Thanks!

Update: The code lines above are part of code doing the following: data import (the files vary in size) - data cleaning (after which the data varies even more in size) - storing the data in a matrix or data frame - calculating the mean per column, plotting / t-testing the data

Throwing everything in a list and then creating a matrix is not useful for me unless the original data structure can be preserved.

Simone
  • `length(matrix)` is not what you believe it is. You want `nrow(matrix)`. – Roland Jul 23 '16 at 11:03
  • I think you want Tyler's answer, [here](http://stackoverflow.com/questions/7962267/cbind-a-df-with-an-empty-df-cbind-fill), which was "stolen" from an [R help page](http://r.789695.n4.nabble.com/How-to-join-matrices-of-different-row-length-from-a-list-td3177212.html). Not sure how to mark a question as duplicate. – shayaa Jul 23 '16 at 11:10
  • What probably happened in the past is that the new data was of a length that allowed for clean recycling (it was a multiple of the old data). Depending on your use, I'd suggest keeping the data in a list, as this is the most natural storage structure for your data. – lmo Jul 23 '16 at 11:42
  • @shayaa @lmo Storing it in a list is not useful for me as I need the mean of each column, which I then need for a t-test. So I might need a completely different solution. – Simone Jul 25 '16 at 06:10
  • @Roland Thanks for the suggestions. nrow doesn't work properly in the first line of the code and not at all in the second: `Error in nrow(tempResult) <- mlen : could not find function "nrow<-"`. I did a simple trial, i.e. nrow(vector), and received NULL as output. It works for the matrix though. – Simone Jul 25 '16 at 06:24
  • `mlen <- max(nrow(matrix), length(vector))` and then `dim(matrix) <- c(mlen, ncol(matrix))`. My point was that the length of a matrix is not what you think it is. – Roland Jul 25 '16 at 08:16
  • @Roland Thanks for this! I got your message about length the first time. I read up on nrow in the help and found out that it doesn't work for vectors. The second step produces this: `Error in dim(results) <- c(mlen, ncol(results)) : dims [product 804] do not match the length of object [159]`. Does anything come to mind? I'll try and see what I can work out. – Simone Jul 25 '16 at 09:51
  • Provide a reproducible example and I can have a look. – Roland Jul 25 '16 at 10:11
  • @shayaa Thanks for suggestion. It worked – Simone Jul 25 '16 at 14:07
  • @Roland Thanks for the help. Have a nice evening :-) – Simone Jul 25 '16 at 14:20
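
The points raised in the comments above can be illustrated with a small sketch. The toy objects `m` and `v` and the helper `pad_rows` are hypothetical, introduced only for illustration and not taken from the question's data. The sketch shows that `length()` of a matrix counts cells rather than rows, that lengthening a matrix with `length<-` strips its dimensions, and that `cbind()` truncates a vector longer than the matrix. Since `dim<-` only reshapes existing cells and cannot add rows, one possible way around it, assuming NA padding is acceptable, is to append NA rows explicitly:

```r
# Hypothetical toy data standing in for the question's matrix and vector.
m <- cbind(a = c(1, 2, 3), b = c(4, 5, NA))  # 3 rows, 2 columns
v <- c(10, 20, 30, 40, 50, 60, 70)           # 7 elements

length(m)  # 6: the number of cells, not the number of rows
nrow(m)    # 3

# Lengthening a matrix with `length<-` strips its dim attribute.
flat <- m
length(flat) <- 7
is.null(dim(flat))  # TRUE: no longer a matrix

# cbind() keeps the matrix's row count and truncates the longer vector,
# emitting the "not a multiple of vector length" warning (suppressed here).
truncated <- suppressWarnings(cbind(m, v))
nrow(truncated)  # 3

# Sketch of a fix: pad with NA rows based on nrow(), then bind.
pad_rows <- function(x, n) {
  x <- as.matrix(x)  # a plain vector becomes a one-column matrix
  rbind(x, matrix(NA, nrow = n - nrow(x), ncol = ncol(x)))
}
n <- max(nrow(m), length(v))
padded <- cbind(pad_rows(m, n), pad_rows(v, n))
dim(padded)  # 7 3
```

This is essentially the same idea as the `cbind.fill` function referenced by shayaa, reduced to two inputs.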

2 Answers

Implemented Tyler's solution here. For completeness, here is the code again:

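    cbind.fill <- function(...) {
      nm <- list(...)
      nm <- lapply(nm, as.matrix)   # treat plain vectors as one-column matrices
      n <- max(sapply(nm, nrow))    # row count of the longest input
      # pad each input with NA rows up to n, then bind the columns together
      do.call(cbind, lapply(nm, function(x)
        rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
    }
    matrix <- cbind.fill(matrix, vector)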

Using nrow resulted in the new data being written in NA cells of previous columns instead of a new column. For all those interested in the difference between nrow and length
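
As a rough usage sketch, the padded matrix can feed the per-column analysis mentioned in the question. The `batches` list below is made-up example data, not the asker's csv input, and `cbind.fill` is repeated from the answer so the block runs on its own. It assumes that NA padding is acceptable and that means should ignore the padding via `na.rm = TRUE`:

```r
# cbind.fill as in the answer, repeated so this block is self-contained.
cbind.fill <- function(...) {
  nm <- lapply(list(...), as.matrix)
  n <- max(sapply(nm, nrow))
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}

# Hypothetical cleaned vectors of different lengths, added one at a time.
batches <- list(c(1, 2, 3, 4, 5), c(2, 3), c(5, 6, 7, 8, 12, 13, 14, 15))
result <- as.matrix(batches[[1]])
for (v in batches[-1]) {
  result <- cbind.fill(result, v)  # each vector lands in its own new column
}

dim(result)                               # 8 3
col_means <- colMeans(result, na.rm = TRUE)  # NA padding is ignored
col_means                                 # 3.0 2.5 10.0

# t.test() drops missing values, so padded columns can be passed directly.
tt <- t.test(result[, 1], result[, 3])
```

Each vector stays in its own column, so no values are mixed between vectors.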

Simone

A potentially easier solution could be the following:

  1. Store all your vectors in a list instead of appending them one by one
  2. Make them the same length filling the missing items with NA
  3. cbind everything into a matrix

A mock-up example:

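    library(dplyr)  # provides the %>% pipe

    ll <- list(c(1,2,3,4,5), c(2,3), c(5,6,7,8,12,13,14,15))
    ll

    # indexing past the end of a vector yields NA, which pads the shorter vectors
    lapply(ll, function(x) x[1:max(sapply(ll, length))]) %>% do.call(cbind, .)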

The output is:

         [,1] [,2] [,3]
    [1,]    1    2    5
    [2,]    2    3    6
    [3,]    3   NA    7
    [4,]    4   NA    8
    [5,]    5   NA   12
    [6,]   NA   NA   13
    [7,]   NA   NA   14
    [8,]   NA   NA   15
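
The same three steps can also be sketched in base R without the pipe. The list element names `a`, `b` and `c` are an assumption added here so the columns stay identifiable. The sketch also shows that the per-vector means needed for a t-test can be taken either from the list or from the padded matrix:

```r
# Same example vectors as above, with hypothetical names added.
ll <- list(a = c(1, 2, 3, 4, 5), b = c(2, 3), c = c(5, 6, 7, 8, 12, 13, 14, 15))

n <- max(lengths(ll))  # length of the longest vector
# Indexing past the end yields NA; sapply() binds the equal-length
# results into a matrix with one named column per vector.
mat <- sapply(ll, function(x) x[seq_len(n)])

list_means <- sapply(ll, mean)              # means straight from the list
mat_means <- colMeans(mat, na.rm = TRUE)    # same values from the matrix
```

Both `list_means` and `mat_means` hold one mean per original vector, so the list-based storage does not prevent a per-column analysis.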
thepule
  • `bind_cols` requires data frames to work on. Since @Simone needs a matrix output, it seemed an extra step to convert all vectors into data frames. – thepule Jul 23 '16 at 11:59
  • Thanks for the suggestion. It is not very practical for me because a) it treats the list created from the matrix with several columns as one column in the newly created matrix, and b) it throws all values together, whereas I need the mean per vector for a t-test. – Simone Jul 25 '16 at 14:16