I have the following list:

$id1
$id1[[1]]
         A              B               
        "A"            "B"                
$id1[[2]]
         A             B 
        "A"           "A1" 
$id2
$id2[[1]]
         A              B               
        "A2"           "B2" 

In R-pastable form:

dat = structure(list(SampleTable = structure(list(id2 = list(structure(c("90", "7"), .Names = c("T", "G")), structure(c("90", "8"), .Names = c("T", "G"))), id1 = structure(c("1", "1"), .Names = c("T", "G"))), .Names = c("id2", "id1"))), .Names = "SampleTable") 

I want to convert this list into the following data frame:

id1   A    B
id1   A    A1 
id2   A2   B2 
joran
jan5

3 Answers

Your data structure (apparently a named list of unnamed lists of 1-row data.frames) is a bit complicated: the easiest approach may be to build the data.frame in a loop.

It can be done directly with do.call, lapply and rbind, but it is not very readable, even if you are familiar with those functions.

# Sample data 
d <- list(
  id1 = list(
    data.frame( x=1, y=1 ),
    data.frame( x=2, y=2 )
  ),
  id2 = list(
    data.frame( x=3, y=3 ),
    data.frame( x=4, y=4 )
  ),
  id3 = list(
    data.frame( x=5, y=5 ),
    data.frame( x=6, y=6 )
  )
)

# Convert
d <- data.frame(
  id=rep(names(d), unlist(lapply(d,length))),
  do.call( rbind, lapply(d, function(u) do.call(rbind, u)) )
)
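
The one-liner above may be easier to follow when split into steps. The following is a minimal sketch, assuming the same shape of input (a named list of lists of 1-row data.frames); the names `per_id`, `stacked`, `ids` and `res` are illustrative and not part of the original answer.

```r
# Hypothetical sample data with the same shape as above,
# but with a different number of rows per id
d <- list(
  id1 = list(data.frame(x = 1, y = 1), data.frame(x = 2, y = 2)),
  id2 = list(data.frame(x = 3, y = 3))
)

# Step 1: collapse each inner list into one data.frame per id
per_id <- lapply(d, function(u) do.call(rbind, u))

# Step 2: stack the per-id data.frames on top of each other
stacked <- do.call(rbind, per_id)

# Step 3: repeat each list name once per row it contributed
ids <- rep(names(d), sapply(per_id, nrow))

# Step 4: combine and drop the row names generated by rbind
res <- data.frame(id = ids, stacked, stringsAsFactors = FALSE)
rownames(res) <- NULL
res
```

Keeping the intermediate objects around makes it possible to inspect each stage with `str()` if the input turns out to be shaped differently than expected.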

Another solution, using a loop, for a ragged data structure that contains vectors (not data.frames), as explained in the comments:

d <- structure(list(SampleTable = structure(list(id2 = list(structure(c("90", "7"), .Names = c("T", "G")), structure(c("90", "8"), .Names = c("T", "G"))), id1 = structure(c("1", "1"), .Names = c("T", "G"))), .Names = c("id2", "id1"))), .Names = "SampleTable") 
result <- list()
for(i in seq_along(d$SampleTable)) {
  id <- names(d$SampleTable)[i]
  block <- d$SampleTable[[i]]
  if(is.atomic(block)) {
    block <- list(block)
  }
  for(row in block) {
    result <- c(result, list(data.frame(id, as.data.frame(t(row)))))
  }    
}
result <- do.call(rbind, result)
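
For comparison, the same ragged input could probably be handled without an explicit loop. This is only a sketch under the assumption that every leaf is a named character vector with the same names; `tab`, `blocks`, `rows` and `res` are hypothetical names introduced here.

```r
d <- structure(list(SampleTable = structure(list(
  id2 = list(structure(c("90", "7"), .Names = c("T", "G")),
             structure(c("90", "8"), .Names = c("T", "G"))),
  id1 = structure(c("1", "1"), .Names = c("T", "G"))),
  .Names = c("id2", "id1"))), .Names = "SampleTable")

tab <- d$SampleTable

# Wrap bare vectors in a list so that every block is a list of rows
blocks <- lapply(tab, function(b) if (is.atomic(b)) list(b) else b)

# Flatten one level to get a plain list of row vectors
rows <- unname(unlist(blocks, recursive = FALSE))

# Bind the rows into a character matrix and attach the ids
res <- data.frame(
  id = rep(names(blocks), sapply(blocks, length)),
  do.call(rbind, rows),
  stringsAsFactors = FALSE
)
res
```

Binding the vectors as a matrix first avoids creating one small data.frame per row, which may matter for larger lists.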
Vincent Zoonekynd
  • @Sjan if you think this answer solved your problem, feel free to check the grey mark below the answer score. – Roman Luštrik Jan 10 '12 at 10:07
  • The problem is still not solved. It occurs when the list contains a list element. – jan5 Jan 10 '12 at 10:32
  • @Sjan: can you provide a reproducible example, e.g., the result of dput() on your data structure? – Vincent Zoonekynd Jan 10 '12 at 10:35
  • structure(list(SampleTable = structure(list(id2 = list(structure(c("90", "7"), .Names = c("T", "G")), structure(c("90", "8"), .Names = c("T", "G"))), id1 = structure(c("1", "1"), .Names = c("T", "G"))), .Names = c("id2", "id1"))), .Names = "SampleTable") – jan5 Jan 10 '12 at 10:46
  • It would be better to add this to your question. – Paul Hiemstra Jan 10 '12 at 11:38

NOTE! I could not get melt and cast working on this kind of ragged data (I tried for over an hour...). I am going to leave this answer here to show that the reshape package could also be used for this kind of operation.

Using Vincent's example data, you can use melt and cast from the reshape package:

library(reshape)
res = cast(melt(d))[-1]
names(res) = c("id","x","y")
res
   id x y
1 id1 1 1
2 id2 3 3
3 id3 5 5
4 id1 2 2
5 id2 4 4
6 id3 6 6

The row order in the resulting data.frame is different, but the content is identical, and the code is a bit shorter. I use [-1] to delete the first column, which is also returned by melt. This additional variable is the column index of the individual data.frames in the list of lists. Have a look at the result of melt(d); that will hopefully make it clearer.

Paul Hiemstra
  • Could you mark it as the correct answer then? It is the grey tick mark on the left-hand side of my answer, just beneath the up and down arrows and the 0. This lets people know that your question is answered, and gets me some rep ;). – Paul Hiemstra Jan 10 '12 at 08:56
  • Or mark the other answer as the correct one if that was more helpful. – Paul Hiemstra Jan 10 '12 at 09:04
  • After running res = cast(melt(d))[-1] I got this message: Error: Casting formula contains variables not found in molten data: variable – jan5 Jan 11 '12 at 05:37
  • As I said in my note above, I could not get this working with your data, only with vincent's first example. – Paul Hiemstra Jan 11 '12 at 07:52

This is a bit messier than you let on. That dat object has an extra "layer" above it, so it is easier to work with dat[[1]]:

dfrm <- data.frame(dat[[1]], stringsAsFactors=FALSE)
names(dfrm) <- sub("\\..+$", "", names(dfrm))

> dfrm
  id2 id2 id1
T  90  90   1
G   7   8   1
> t(dfrm)
    T    G  
id2 "90" "7"
id2 "90" "8"
id1 "1"  "1"
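
If a data.frame with an explicit id column is wanted rather than a transposed character matrix, something along these lines might work. It is a sketch that relies on data.frame() prefixing the generated column names with the list names, as the output above suggests; `ids`, `m` and `res` are names introduced here for illustration.

```r
dat <- structure(list(SampleTable = structure(list(
  id2 = list(structure(c("90", "7"), .Names = c("T", "G")),
             structure(c("90", "8"), .Names = c("T", "G"))),
  id1 = structure(c("1", "1"), .Names = c("T", "G"))),
  .Names = c("id2", "id1"))), .Names = "SampleTable")

dfrm <- data.frame(dat[[1]], stringsAsFactors = FALSE)

# Recover the ids by cutting everything after the first dot
ids <- sub("\\..+$", "", names(dfrm))

# Transpose so that each original vector becomes one row,
# then drop the generated row names
m <- t(dfrm)
rownames(m) <- NULL

res <- data.frame(id = ids, m, stringsAsFactors = FALSE)
res
```

The columns remain character vectors here; they could be converted with as.numeric() if numeric values are needed.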
IRTFM