I have a list like the one below. Its elements have different lengths, and the values within each element are in no particular order:

> my.list <- lapply(list(c(2,1,3,4),c(2,3),c(4,2,3),c(1,3,4),c(1,4),c(2,4,1)),
  function(x)letters[x])
> my.list
[[1]]
[1] "b" "a" "c" "d"

[[2]]
[1] "b" "c"

[[3]]
[1] "d" "b" "c"

[[4]]
[1] "a" "c" "d"

[[5]]
[1] "a" "d"

[[6]]
[1] "b" "d" "a"

I want to put this into a data frame, with NA wherever a value is missing. Because the values within each element are in no particular order, I also want each row of the data frame sorted alphabetically (or numerically), so that every value lands in its own column. Ideally the result would look like this:

> df
    V1  V2  V3  V4
1   a   b   c   d  
2   NA  b   c   NA  
3   NA  b   c   d  
4   a   NA  c   d  
5   a   NA  NA  d  
6   a   b   NA  d  
Frank
ricks.k
  • I think you will/should want the strings as the headers and a 0/1 table instead, which brings us back to this question from earlier today: http://stackoverflow.com/q/29988256/1191259 – Frank May 01 '15 at 21:16
  • Also, it's best to make your example reproducible (as I've edited in). Here's a reference: http://stackoverflow.com/q/5963269/1191259 – Frank May 01 '15 at 21:21

2 Answers

You could use a lookup vector and match against it.

m <- sort(Reduce(union, my.list))
as.data.frame(do.call(rbind, lapply(my.list, function(a) a[match(m, a)])))
#     V1   V2   V3   V4
# 1    a    b    c    d
# 2 <NA>    b    c <NA>
# 3 <NA>    b    c    d
# 4    a <NA>    c    d
# 5    a <NA> <NA>    d
# 6    a    b <NA>    d
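
To see why the match step lines things up, here is a small sketch (my own breakdown, not part of the answer above) that applies it to a single list element. The names a, idx and aligned are only for illustration.

```r
my.list <- list(c("b", "a", "c", "d"), c("b", "c"), c("d", "b", "c"),
                c("a", "c", "d"), c("a", "d"), c("b", "d", "a"))

# Lookup vector: every distinct value found anywhere in the list, sorted.
m <- sort(Reduce(union, my.list))

# Take one element and ask where each lookup value sits inside it.
a   <- my.list[[3]]     # "d" "b" "c"
idx <- match(m, a)      # position of "a", "b", "c", "d" in a; NA if absent

# Indexing with NA yields NA, so absent values become gaps in the right slot.
aligned <- a[idx]
aligned
```

Because every element is indexed against the same sorted lookup vector, all the resulting vectors have the same length and column order, which is what lets rbind stack them into a rectangular result.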
Rich Scriven

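One option is to build a 0/1 indicator table with mtabulate from qdapTools, then replace each 1 with its column name and each 0 with NA: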

library(qdapTools)
d1 <- mtabulate(my.list)
d1
#  a b c d
#1 1 1 1 1
#2 0 1 1 0
#3 0 1 1 1
#4 1 0 1 1
#5 1 0 0 1
#6 1 1 0 1

d2 <- d1
# fill every cell with the name of its column
d2[] <- colnames(d1)[col(d1)]
# set the cells to NA wherever the indicator is 0
is.na(d2) <- d1==0
colnames(d2) <- paste0("V", 1:4)
d2
#    V1   V2   V3   V4
#1    a    b    c    d
#2 <NA>    b    c <NA>
#3 <NA>    b    c    d
#4    a <NA>    c    d
#5    a <NA> <NA>    d
#6    a    b <NA>    d
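
The replacement form is.na(x) <- value can look odd at first. As far as I understand it, for a data frame it amounts to x[value] <- NA. Below is a minimal sketch comparing the two. The 0/1 table is built in base R as a stand-in for the mtabulate output (an assumption on my part, so the example runs without qdapTools), and d_a and d_b are illustrative names only.

```r
my.list <- list(c("b", "a", "c", "d"), c("b", "c"), c("d", "b", "c"),
                c("a", "c", "d"), c("a", "d"), c("b", "d", "a"))

# Base R stand-in for the 0/1 table (assumed to mirror the mtabulate result).
m  <- sort(unique(unlist(my.list)))
d1 <- as.data.frame(t(vapply(my.list, function(a) as.integer(m %in% a),
                             integer(length(m)))))
names(d1) <- m

# Start from a frame in which every cell holds its own column name.
d2 <- d1
d2[] <- colnames(d1)[col(d1)]

# Two ways of blanking the cells where the indicator is 0.
d_a <- d2
is.na(d_a) <- d1 == 0     # replacement-function form

d_b <- d2
d_b[d1 == 0] <- NA        # plain logical-matrix assignment

identical(d_a, d_b)
```

Which form to use is mostly a matter of taste; the plain assignment is probably more familiar to most readers.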

Or, starting again from d2 <- d1, do the fill and the NA step in a single assignment:

d2[] <- names(d1)[(NA^!d1) * col(d1)]
colnames(d2) <- paste0('V', 1:4)
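
The NA^!d1 part relies on two facts about R arithmetic: NA^0 is 1, and NA^1 is NA. A rough sketch of the idea on a small matrix follows. The matrix x and the names mask, idx and out are made up for illustration and are not from the answer.

```r
# Toy 0/1 indicator matrix with named columns (hypothetical data).
x <- matrix(c(1, 0, 1,
              0, 1, 1), nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))

# !x is TRUE where x is 0. NA^TRUE is NA and NA^FALSE is 1,
# so mask is 1 where x is non-zero and NA where x is 0.
mask <- NA^(!x)

# Multiplying by the column index keeps the index or propagates NA.
idx <- mask * col(x)

# Indexing the column names with NA gives NA, so zeros become gaps.
out <- matrix(colnames(x)[idx], nrow = nrow(x))
out
```

The parentheses matter: without them, the ! would apply to the whole product rather than to the indicator table alone.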

data

my.list <- list(c("b", "a", "c", "d"), c("b", "c"), c("d", "b", "c"), 
c("a", "c", "d"), c("a", "d"), c("b", "d", "a"))
Frank
akrun
  • `is.na(d2) <- d1==0` is really strange to see, but looks useful (now that I understand what it does). – Frank May 01 '15 at 21:39