11

How can I convert a data.frame

df <- data.frame(id=c("af1", "af2"), start=c(100, 115), end=c(114,121))

to a list of lists

LoL <- list(list(id="af1", start=100, end=114), list(id="af2", start=115, end=121))

I've tried things like

not.LoL <- as.list(as.data.frame(t(df)))

and I'm really not sure what I end up with after this, but it isn't quite right. My requirement is that I can access, say, the first start by the command

> LoL[[1]]$start
[1] 100

The not.LoL that I currently have gives me the following error:

> not.LoL[[1]]$start
Error in not.LoL[[1]]$start : $ operator is invalid for atomic vectors

Explanations and/or solutions would be greatly appreciated.

Edit: I should have made it clear that "id" here is actually non-unique - there can be multiple elements under a single ID. So I could do with a solution that doesn't depend on unique IDs to split on.

MattLBeck
  • possible duplicate of [Reshape matrix into a list of lists](http://stackoverflow.com/questions/9195112/reshape-matrix-into-a-list-of-lists) – agstudy Feb 06 '13 at 13:29
  • 1
    @agstudy: not a duplicate: that one is about ragged arrays and `tapply`, while this one here appears to be rectangular and therefore can be solved using `lapply` as [shown below](http://stackoverflow.com/a/14730102/1468366). – MvG Feb 06 '13 at 13:53
  • 1
    @MvG No. See the first answer there: it proposes two solutions, one with `lapply`, which is clearly the same solution proposed here, and a second using dlply, like mine here. – agstudy Feb 06 '13 at 13:59
  • @MvG `lapply` without split? Can you detail this in an answer please. – agstudy Feb 06 '13 at 14:20
  • @MvG You are correct - I have no requirement that IDs should be unique. Perhaps I should not have called that column "id". The solutions supplied work for me now, but I want to avoid the requirement if possible. – MattLBeck Feb 06 '13 at 14:21
  • @agstudy, it seems I misread [my favorite answer so far](http://stackoverflow.com/a/14730102/1468366), missing the fact that it, too, grouped by `id`. I was reading what I had in mind, not what was actually there… – MvG Feb 06 '13 at 14:26

5 Answers

10
LMAo <- lapply(split(df,df$id), function(x) as.list(x)) # is one way

# more succinctly
# LMAo <- lapply(split(df,df$id), as.list)

An edited solution as per your comment:

lapply( split(df,seq_along(df[,1])), as.list)
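A minimal sketch of the difference between the two variants, assuming a hypothetical three-row data frame in which af1 appears twice (the extra row and the names by_id and by_row are illustrative, not from the question). stringsAsFactors = FALSE is set explicitly so the id column stays character on R versions before 4.0.

```r
# Hypothetical data: the id "af1" occurs on two rows
df <- data.frame(id = c("af1", "af1", "af2"),
                 start = c(100, 105, 115),
                 end = c(104, 114, 121),
                 stringsAsFactors = FALSE)

# Splitting on id gives one element per distinct id, not per row
by_id <- lapply(split(df, df$id), as.list)
length(by_id)      # 2
by_id$af1$start    # 100 105

# Splitting on the row index gives one element per row
by_row <- lapply(split(df, seq_len(nrow(df))), as.list)
length(by_row)     # 3
by_row[[2]]$start  # 105
```

seq_len(nrow(df)) is used here in place of seq_along(df[,1]); both produce the row indices 1, 2, 3 for this data.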
user1317221_G
  • This is great and works for my purposes at the moment. However, like @MvG said it is grouping by ID, and I actually don't require that IDs are unique. Is there a way around this? – MattLBeck Feb 06 '13 at 14:19
  • Can the downvoter explain what is wrong with what I have done, and be a bit more constructive? – user1317221_G Feb 06 '13 at 14:35
  • 1
    @user1317221_G why downvote this solution?? +1! – agstudy Feb 06 '13 at 14:36
8

You can use apply to turn your data frame into a list of lists like this:

LoL <- apply(df,1,as.list)

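However, this converts all your data to text: apply first coerces the data frame to a matrix, and because the id column is character the whole matrix becomes character, so each row reaches the function as a single atomic character vector.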
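A short sketch of that coercion, using the data frame from the question with stringsAsFactors = FALSE added as an assumption so the result is the same across R versions. The last lines suggest that the t(df) attempt in the question goes through the same matrix step, which would explain why its elements are atomic vectors; that reading is an inference, not something stated in the question.

```r
df <- data.frame(id = c("af1", "af2"), start = c(100, 115), end = c(114, 121),
                 stringsAsFactors = FALSE)

LoL <- apply(df, 1, as.list)

# The numeric column has become character
is.character(LoL[[1]]$start)   # TRUE

# Values have to be converted back by hand
start1 <- as.numeric(LoL[[1]]$start)

# t(df) also produces a character matrix, so each element of the
# resulting list is an atomic vector rather than a list
not.LoL <- as.list(as.data.frame(t(df)))
is.atomic(not.LoL[[1]])        # TRUE
```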

MvG
7

Using plyr, you can do this:

dlply(df,.(id),c)

To avoid grouping by id when it has duplicates (maybe you need to change the column name; id reads as unique to me):

dlply(df,1,c)
agstudy
  • Sorry, using `id` for that column is confusing. Thanks for the `plyr` solution! – MattLBeck Feb 06 '13 at 14:28
  • @agstudy after using the second solution in a problem when ID's were not unique, I found that the solution will actually group by the first column. Am I missing something here? – MattLBeck Mar 04 '13 at 12:42
  • @Mattrition yes you're right. `dlply(df,2,c)` (choose the second column). This solution is not working when the column is not unique. You can accept another solution. – agstudy Mar 04 '13 at 12:57
  • @agstudy The other solutions have their own problems. A simple way around this is to create a column containing the row names, and then split on that column. This is what I am using. – MattLBeck Mar 04 '13 at 13:35
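The row-name workaround from the last comment could be sketched as follows. This is a base R version using split rather than plyr, which is an assumption about how the idea would be applied; the twelve-row data frame and the column name row are hypothetical. Row names are character, so splitting on them directly sorts the groups as text ("1", "10", "11", ...); giving the factor levels explicitly keeps the original row order.

```r
# Hypothetical data with more than nine rows and non-unique ids
df <- data.frame(id = rep(c("af1", "af2"), each = 6),
                 start = seq(100, by = 10, length.out = 12),
                 stringsAsFactors = FALSE)

# Helper column holding the row names
df$row <- rownames(df)

# Naive split: groups are ordered as text, so the second group is row "10"
naive <- split(df, df$row)
naive[[2]]$row     # "10"

# Explicit levels keep the groups in the original row order
LoL <- lapply(split(df, factor(df$row, levels = df$row)), as.list)
LoL[[2]]$start     # 110
```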
1

In base R, it's quite a bit faster to use mapply instead of split or lapply - however, you have to invoke it via do.call so that each column is used independently.

df <- sleep

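# Returns one-row data frames, extracted row by row
f <- function(df) {
  lapply(seq_len(nrow(df)), function(row) {
    df[row, , drop = FALSE]
  })
}

# Returns a list of lists: mapply calls list() once per row,
# taking one value from each column
f2 <- function(df) {
  do.call("mapply", c(list, df, SIMPLIFY = FALSE, USE.NAMES=FALSE))
}

# Returns one-row data frames, split on the row index
f3 <- function(df) {
  split(df, seq(nrow(df)))
}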

microbenchmark::microbenchmark(f(df), f2(df), f3(df))
#> Unit: microseconds
#>    expr     min       lq     mean   median       uq       max neval
#>   f(df) 573.799 607.8375 759.1721 626.0095 752.9465  2861.961   100
#>  f2(df) 114.819 123.5190 155.5185 129.9210 141.4340  1375.573   100
#>  f3(df) 598.774 625.6025 813.6837 634.5855 684.3825 11230.678   100

Created on 2019-10-09 by the reprex package (v0.3.0)
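As a rough check that the mapply approach gives the structure asked for in the question, here is a sketch on a hypothetical three-row data frame with a repeated id. The call is spelled out with named arguments (FUN, SIMPLIFY, USE.NAMES) for readability; it assumes no column of the data frame shares a name with an mapply argument.

```r
df <- data.frame(id = c("af1", "af1", "af2"),
                 start = c(100, 105, 115),
                 end = c(104, 114, 121),
                 stringsAsFactors = FALSE)

# Each column is passed to mapply as a separate named argument,
# so list() is called once per row with one value from each column
LoL <- do.call(mapply,
               c(list(FUN = list), df,
                 list(SIMPLIFY = FALSE, USE.NAMES = FALSE)))

LoL[[2]]$start              # 105
is.numeric(LoL[[2]]$start)  # TRUE
```

Column types are kept because the data frame is never converted to a matrix, and no grouping column is involved, so repeated ids are not a problem.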

Neal Fultz
0

If, like me, you are mostly looking to create lists of lists to use in highcharter, that same package contains the function list_parse() (or list_parse2() if you want to remove names). Simply use it like so:

library(highcharter)

df <- data.frame(id=c("af1", "af2"), start=c(100, 115), end=c(114,121))

LoL <- list_parse(df)

After which you can do the indexing you wanted:

> LoL[[1]]$start
[1] 100
rvrvrv