Is there a neat way to convert a nested data.frame to a hierarchical list?

I do it below with a for loop, but ideally there is a neater solution that generalizes to an arbitrary number of nested columns.

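library(dplyr)

# One row per (V1, V2) combination, with a list-column x holding 10 random draws
nested_df <- expand.grid(V1 = c('a','b','c'),
                         V2 = c('z','y')) %>%
    group_by_all() %>%
    do(x = runif(10)) %>%
    ungroup()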

nested_ls <- list()
for(v1 in unique(nested_df$V1)){
    for(v2 in unique(nested_df$V2)){
        nested_ls[[v1]][[v2]] <- nested_df%>%
            filter(V1==v1 & V2==v2)%>%
            pull(x)%>%
            unlist
    }
}

str(nested_ls)
dule arnaux
  • Did you mean `expand.grid`? – Sotos Mar 20 '19 at 15:42
  • @Sotos there is `tidyr::expand_grid` :D – Ronak Shah Mar 20 '19 at 15:59
  • @RonakShah what the hell are they trying to do to us?? :) – Sotos Mar 20 '19 at 15:59
  • Yes, `expand.grid`. Corrected the question. A `tidyr::expand_grid` that didn't use factors would be handy, though. – dule arnaux Mar 20 '19 at 17:07
  • All I can think of is some version of nested loops, and I don't find those easy to generalize to more nesting variables. From what I've seen it looks like most questions are about going from a nested list to a data.frame, and not the other way around. :D You might consider adding info on what you need this structure for in case there is a whole different approach (i.e., this could be an X-Y problem :) ). – aosmith Mar 20 '19 at 18:11
  • In this case, when I'm working interactively, it's easier to access the nested data when it's in a hierarchical list, e.g. `nested_ls$a$y`. And I have scripts that expect this format. – dule arnaux Mar 20 '19 at 18:41

1 Answer

If you don't strictly need the names z and y and can work with the positions [[1]] and [[2]] instead, you can simply do:

split(nested_df$x, nested_df$V1)

If you need the names, then:

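# Split by V1, then by V2; each leaf is a length-1 list holding the vector
lapply(split(nested_df, nested_df$V1), function(i) split(i$x, i$V2))

# Or, as @Frank mentions in the comments, use setNames; each leaf is then the vector itself
lapply(split(nested_df, nested_df$V1), function(i) setNames(i$x, i$V2))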
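
For an arbitrary number of nesting columns, the same split-then-recurse idea can be written as a small recursive function. The following is a minimal base-R sketch, not part of the original answer: the helper name `nest_to_list` and its arguments `keys` and `value_col` are hypothetical, and it assumes the value column is a list-column as in the question.

```r
# Hypothetical helper (an illustration, not from the original answer):
# recursively split a data frame on the key columns in `keys`, returning
# the contents of `value_col` at the leaves.
nest_to_list <- function(df, keys, value_col) {
  if (length(keys) == 0) {
    # Leaf: flatten the remaining list-column entries into a plain vector
    return(unlist(df[[value_col]], use.names = FALSE))
  }
  # Split on the first key (dropping empty groups), then recurse on the rest
  groups <- split(df, df[[keys[1]]], drop = TRUE)
  lapply(groups, nest_to_list, keys = keys[-1], value_col = value_col)
}

# Example data with three nesting columns, built without dplyr
set.seed(1)
df <- expand.grid(V1 = c("a", "b", "c"), V2 = c("z", "y"), V3 = c("p", "q"))
df$x <- lapply(seq_len(nrow(df)), function(i) runif(10))  # list-column

res <- nest_to_list(df, keys = c("V1", "V2", "V3"), value_col = "x")
str(res$a$y$q)
```

With the question's data, `nest_to_list(nested_df, c("V1", "V2"), "x")` should produce the same shape as the for loop, so access such as `nested_ls$a$y` keeps working.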
Sotos