How to form a data.frame from four dynamic and irregular vectors which update?

Question

I have four vectors of different and changing lengths, say A, B, C, D. I would like to build a master vector made up of the first element of each of A-D, followed by the second element of each of A-D, and so on:

c(A[1,1], B[1,1], C[1,1], D[1,1], A[2,1], B[2,1], ...)

I had the idea of cbind-ing A, B, C, D and then transposing each row, but because A-D have different lengths this cannot be done.

Note: A-D are of different sizes, and they will all grow over time as data is added.

Possible duplicate of [How to cbind or rbind different lengths vectors without repeating the elements of the shorter vectors?](http://stackoverflow.com/questions/3699405/how-to-cbind-or-rbind-different-lengths-vectors-without-repeating-the-elements-o) — Sotos, Aug 18 '16 at 09:54
Questions: (1) You appear to be indexing your "vectors" as matrices, working with the first columns in all cases. Should we ignore that detail? (2) How do you want to handle short vectors? For example, say `A` only has one element. Do you want `NA` in the position where `A[2,1]` would occur, do you want to skip that element in the output, or something else? — bgoldst, Aug 18 '16 at 09:58
They are vectors (only one column). And yes NA would be preferable. — B.Doe, Aug 18 '16 at 10:05

akrun · Answer 1 · 2016-08-18T10:22:38.823

We place the vectors in a list, then make every element of the list the same length by padding NA at the end: we assign the length (length<-) of each element to the max of the lengths of 'lst'. We convert the result to a data.frame, use apply with MARGIN=1 to sort the elements of each row, and unlist the output.

d1 <- data.frame(lapply(lst, `length<-`, max(lengths(lst))))
unlist(apply(d1, 1, FUN = function(x) sort(x)))

data

lst <- list(A = c(1, 5, 7, 5), B = c(3, 4, 6, 2, 5), C  = c(5, 3, 2), 
             D = c(8, 1, 3, 5, 6))
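
Note that sort() reorders the values within each row and, by default, drops NA. If the A, B, C, D order from the question has to be kept, and NA placeholders retained, a variant that skips the sort step could look like the sketch below. It reuses the same 'lst'; the names `padded` and `master` are introduced here only for illustration.

```r
# Same data as above
lst <- list(A = c(1, 5, 7, 5), B = c(3, 4, 6, 2, 5), C = c(5, 3, 2),
            D = c(8, 1, 3, 5, 6))

# Pad every vector with NA up to the length of the longest one
n <- max(lengths(lst))
padded <- sapply(lst, `length<-`, n)   # n x 4 matrix, one column per vector

# Transpose and flatten so the values are read row by row:
# A[1], B[1], C[1], D[1], A[2], B[2], ...
master <- c(t(padded))
master
# master is: 1 3 5 8 5 4 3 1 7 6 2 3 5 2 NA 5 NA 5 NA 6
```

Because the padding is recomputed from `lengths(lst)` on every run, the same lines should keep working as the vectors grow.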

score 0 · Answer 2 · answered Aug 18 '16 at 10:22

You should first assemble the loose variables into a list to facilitate automated processing of them all without duplicate code. Then we can effectively iterate over the index range of the longest vector and index out the element that resides at that index in all input vectors. Out-of-bounds indexing of an atomic vector naturally returns NA, satisfying your requirement for the case of short vectors. Also, since you only care about the first column of each input matrix, it will help to only assemble those columns during the initial assembly step.

lst <- lapply(mget(LETTERS[1:4]),`[`,,1L);
lst;
## $A
## [1] 37 56
##
## $B
## [1] 94 65 61  5 19
##
## $C
## [1] 99 37 76 90
##
## $D
## [1]  1 37
##
c(sapply(seq_len(max(lengths(lst))),function(i) sapply(lst,`[`,i)));
##  [1] 37 94 99  1 56 65 37 37 NA 61 76 NA NA  5 90 NA NA 19 NA NA
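
The NA padding here comes for free from R's indexing rules: asking an atomic vector for a position past its end returns NA instead of raising an error. A minimal sketch of that behaviour, using made-up values (`x` and `vecs` are illustrative names, not objects from the question):

```r
x <- c(37, 56)
x[3]                      # NA: index 3 is past the end of x

# Hypothetical short inputs of unequal length
vecs <- list(A = c(37, 56), B = c(94, 65, 61))

# For each position i, take element i from every vector, then flatten
out <- c(sapply(seq_len(max(lengths(vecs))),
                function(i) sapply(vecs, `[`, i)))
out                       # 37 94 56 65 NA 61
```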

Data

set.seed(1L);
for (n in LETTERS[1:4]) assign(n,matrix(sample(0:99,sample(1:5,1L)*2L),ncol=2L));

score 0 · Accepted Answer · edited May 23 '17 at 10:28

Right, I found my solution, using the cbind.fill code quoted here: cbind a df with an empty df (cbind.fill?)

cbind.fill <- function(...){
    nm <- list(...)
    nm <- lapply(nm, as.matrix)
    n <- max(sapply(nm, nrow))
    do.call(cbind, lapply(nm, function (x)
        rbind(x, matrix(, n-nrow(x), ncol(x)))))
}

Then I bind A-D with it and convert the result to a master vector which takes the rows of A-D in turn, starting from the top:

X <- cbind.fill(A, B, C, D)
X <- matrix(t(X), ncol=1)
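
Put together, the whole thing might run like the sketch below. The input values are made up for illustration, and cbind.fill is repeated (with the NA fill written out explicitly) so the snippet runs on its own.

```r
# Bind inputs column-wise, padding shorter ones with NA rows
cbind.fill <- function(...) {
  nm <- lapply(list(...), as.matrix)
  n <- max(sapply(nm, nrow))
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(NA, n - nrow(x), ncol(x)))))
}

# Hypothetical inputs of different lengths
A <- c(1, 2); B <- c(3, 4, 5); C <- 6; D <- c(7, 8)

X <- cbind.fill(A, B, C, D)    # 3 x 4 matrix, short columns padded with NA
X <- matrix(t(X), ncol = 1)    # read row by row into a single column
as.vector(X)
# values: 1 3 6 7 2 4 NA 8 NA 5 NA NA
```

Rerunning the last three lines after A-D have grown rebuilds the master vector from scratch.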
