Column to nested list separated by /

Question

I have a column where each row element is serpearted by "/":

data.frame(column=c("a","a/air","a/aero/breath","b","b/boy","b/bag/band/brand"))

How can I convert it into nested lists after each "/". So the aim is to get:

list(a=list("air"=1,aero=list("breath"=1)),b=list("boy"=1,bag=list(band=list("brand"=1))))

I need this for the shinyTree package to make a tree out of the column.

I have added the "=1" at the end of the last elements in the hierarchy as it is required to show up in the shinyTree output. The list can then be put in the code below to get the tree:

library(shiny)
library(shinyTree)

tree <- list(a=list("air"=1,aero=list("breath"=1)),b=list("boy"=1,bag=list(band=list("brand"=1)))) 


typeof(tree)

ui <- fluidPage(
  fluidPage(
    sidebarLayout(
      sidebarPanel(
        actionButton('reset', 'Reset nodes')
      ),
      mainPanel(
        shinyTree("tree", ),
        hr(),
        "Selected nodes:",
        verbatimTextOutput("idSelected")#,
      )
    )
  )
)

server <- function(input, output, session) {
  
  treeSelection <- reactiveVal(list())
  
  output$tree = renderTree({
    tree
  })
  
  observeEvent(input$reset, {
    updateTree(session, "tree", data = tree)
    treeSelection(list())
  })
  
  observeEvent(input$tree, {
    treeSelection(get_selected(input$tree, format = "classid"))
  })
  
  output$idSelected <- renderPrint({
    treeSelection()
  })
  
}

shinyApp(ui, server)

Could you please add the input data and what the output should look like in a reproducible format which is easier to copy. Read about how to give a reproducible example here https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example — Jakub.Novotny, Feb 17 '21 at 13:02
I have edited the question. In short I want to transform the dataframe column into a list as shown above. I hope it is clearer now. @Jakub.Novotny — Sahib, Feb 17 '21 at 13:25

Jonas · Accepted Answer · 2021-02-17T14:51:06.027

Since the variables look like paths, I created the sample data as vector like

paths <- c(
  "a",
  "a/air",
  "a/aero/breath",
  "b",
  "b/boy",
  "b/bag/band/brand"
)

Then you can use the following function to get your nested list. I hope the choice of variablenames is explanatory enough.

pathsToNestedList <- function(x) {
  pathSplit <- strsplit(x,"/")
  pathStarts <- sapply(pathSplit,"[[",1)
  uniquePathStarts <- unique(pathStarts)
  
  pathEnds <- sapply(pathSplit, function(pathParts) {
    if(length(pathParts) <= 1) return("")
    paste0(pathParts[2:length(pathParts)],collapse="/")
  })
  
  splitLengths <- sapply(pathSplit,length)
  stillToParse <- unique(pathStarts[splitLengths > 1])
  
  endedIndices <- pathEnds == ""
  endedHere <- pathStarts[endedIndices]
  endedHere <- setdiff(endedHere,stillToParse)
  
  if(length(endedHere)) {
    pathEnds <- pathEnds[!endedIndices]
    pathStarts <- pathStarts[!endedIndices]
    uniquePathStarts <- unique(pathStarts)
    return(c(
      setNames(as.list(rep(1,length(endedHere))),endedHere),
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[pathStarts == ps])
      }),uniquePathStarts)
    ))
  } else {
    return(
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[!endedIndices & (pathStarts == ps)])
      }),uniquePathStarts))
  }
}

Note: I updated my answer according to your updated question.

Update: The function can be simplified to:

pathsToNestedList <- function(x) {
  nonNaIndices <- !is.na(x)
  nonEmptyIndices <- x != ""
  x <- x[nonNaIndices & nonEmptyIndices]
  if(!length(x)) return()
  
  pathSplit <- strsplit(x,"/")
  pathStarts <- sapply(pathSplit,"[[",1)
  
  pathEnds <- sapply(pathSplit, function(pathParts) {
    if(length(pathParts) <= 1) return("")
    paste0(pathParts[2:length(pathParts)],collapse="/")
  })
  
  splitLengths <- sapply(pathSplit,length)
  stillToParse <- unique(pathStarts[splitLengths > 1])
  
  endedIndices <- pathEnds == ""
  endedHere <- pathStarts[endedIndices]
  endedHere <- setdiff(endedHere,stillToParse)
  
  pathEnds <- pathEnds[!endedIndices]
  pathStarts <- pathStarts[!endedIndices]
  uniquePathStarts <- unique(pathStarts)
  
  #Concatenate the list of paths that ended with a list that is parsed again.
  #If one of those lists is empty, the concatenation behaves like
  #one would expect: It does nothing.
  return(
    c(setNames(as.list(rep(1,length(endedHere))),endedHere),
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[pathStarts == ps])
      }),uniquePathStarts)
    )
  )
}

Moreover I recognized that it crashes with NA and empty strings. Hence I added a removal part in the beginning of the function.

The code creates the list perfectly, however is there a way to make the last element of each list which doesn't equal to a list to "=1"? This is so the shinyTree can read it. Thanks @Jonas — Sahib, Feb 17 '21 at 14:10
I already updated the code, it produces exactly your desired output. Endingelements are not anymore stored as a character-element of a list, they are now the names of a list with element 1. — Jonas, Feb 17 '21 at 14:11
It is even possible to get rid of the `if`s. Now the final version should be much easier to understand. — Jonas, Feb 17 '21 at 14:28

Joris C. · Answer 2 · 2021-02-17T19:20:45.587

Another option is to use rrapply() in the rrapply-package, which has a dedicated option how = "unmelt" to unmelt a data.frame to a nested list:

library(rrapply)
library(data.table)

paths <- c("a","a/air","a/aero/breath","b","b/boy","b/bag/band/brand")

## create data.frame/data.table with node paths
paths_melt <- as.data.table(tstrsplit(paths[grepl("/", paths)], split = "/"))
paths_melt[, value := 1L]
paths_melt
#>    V1   V2     V3    V4 value
#> 1:  a  air   <NA>  <NA>     1
#> 2:  a aero breath  <NA>     1
#> 3:  b  boy   <NA>  <NA>     1
#> 4:  b  bag   band brand     1

## unmelt to nested list
rrapply(paths_melt, how = "unmelt")
#> $a
#> $a$air
#> [1] 1
#> 
#> $a$aero
#> $a$aero$breath
#> [1] 1
#> 
#> 
#> 
#> $b
#> $b$boy
#> [1] 1
#> 
#> $b$bag
#> $b$bag$band
#> $b$bag$band$brand
#> [1] 1

Just had a look at the package, seems like you are the author. Therefore a big thank you to you. Working with R, I already coded several workarounds "expanding" the base `rapply` and knowing your package now looks like a big time saver in future! — Jonas, Feb 18 '21 at 07:27

Column to nested list separated by /

2 Answers2