Consider a case like this:

xml_list <- list(
  a = "7",
  b = list("8"),
  c = list(
    c.a = "7",
    c.b = list("8"), 
    c.c = list("9", "10"),
    c.d = c("11", "12", "13")),
  d = c("a", "b", "c"))

What I'm looking for is a way to simplify this structure recursively, such that unlist is called on any list of length 1. The expected result for the example above would be:

list(
  a = "7",
  b = "8",
  c = list(
    c.a = "7",
    c.b = "8", 
    c.c = list("9", "10"),
    c.d = c("11", "12", "13")),
 d = c("a", "b", "c"))

I have dabbled with rapply, but that explicitly operates only on list members that are NOT lists themselves, so I wrote the following:

library(magrittr)
clean_up_list <- function(xml_list){
  xml_list %>%
    lapply(
      function(x){
        if(is.list(x)){
          if(length(x) == 1){
            x %<>%
              unlist()
          } else {
            x %<>%
              clean_up_list()
          }
        }
        return(x)
      })
}

This, however, I can't even test, because it fails with Error: C stack usage 7969588 is too close to the limit (at least on the lists that I ultimately want to process).
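
The remark about rapply can be checked with a small sketch. It uses base R only; the names `x`, `seen` and `res` are made up for illustration. The function handed to rapply records the class of everything it receives. It is only ever given the character leaves and never the length-1 list `b` itself, so it has no opportunity to unwrap it.

```r
x <- list(a = "7", b = list("8"), d = c("a", "b", "c"))

seen <- character(0)
res <- rapply(
  x,
  function(leaf) {
    # Record what rapply actually passes to the callback.
    seen <<- c(seen, class(leaf))
    leaf
  },
  how = "replace"
)

all(seen == "character")  # were all visited elements non-list leaves?
is.list(res$b)            # is `b` still wrapped in a list?
```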

Digging deeper (and after mulling over @Roland's response), I came up with a solution that uses purrr, iterates over the list depth in reverse, and NEARLY does what I want:

clean_up_list <- function(xml_list)
{
  list_depth <- xml_list %>%
    purrr::vec_depth()
  for(dl in rev(sequence(list_depth)))
  {
    xml_list %<>%
      purrr::modify_depth(
        .depth = dl,
        .ragged = TRUE,
        .f = function(x)
        {
          if(is.list(x) && length(x) == 1 && length(x[[1]]) == 1)
          {
            unlist(x, use.names = FALSE)
          } else {
            x
          }
        })
  }
  return(xml_list)
}

This appears to work as intended even for lists of the depth I'm dealing with, BUT elements that used to be vectors (like c.d and d in the example) are now converted to lists, which defeats the purpose. Any further insight?

balin

2 Answers

I don't understand this magrittr stuff, but it's easy to create a recursive function:

foo <- function(L) lapply(L, function(x) {
  if (is.list(x) && length(x) > 1) return(foo(x))
  if (is.list(x) && length(x) == 1) x[[1]] else x
  })
foo(test_list)

#$`a`
# [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"
#
#$b
#[1] "a"
#
#$c
#$c$`c.1`
#[1] "b"
#
#$c$c.2
# [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m" "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
#
#$c$c.3
#$c$c.3[[1]]
#[1] "c"
#
#$c$c.3[[2]]
#[1] "d"
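
As a sanity check against the question's own data, here is a sketch that applies the same `foo` to the `xml_list` from the question and compares the result with the expected structure given there. The variable name `expected` is introduced here for illustration only.

```r
foo <- function(L) lapply(L, function(x) {
  if (is.list(x) && length(x) > 1) return(foo(x))
  if (is.list(x) && length(x) == 1) x[[1]] else x
})

xml_list <- list(
  a = "7",
  b = list("8"),
  c = list(
    c.a = "7",
    c.b = list("8"),
    c.c = list("9", "10"),
    c.d = c("11", "12", "13")),
  d = c("a", "b", "c"))

expected <- list(
  a = "7",
  b = "8",
  c = list(
    c.a = "7",
    c.b = "8",
    c.c = list("9", "10"),
    c.d = c("11", "12", "13")),
  d = c("a", "b", "c"))

# Compare the simplified list with the structure asked for in the question.
identical(foo(xml_list), expected)
```

Note that `foo` unwraps only one level per element: for something like `list(list("8"))` it returns the inner `list("8")` rather than `"8"`, because `x[[1]]` is not visited again.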

If this throws an error regarding C stack usage, then your lists are very deeply nested. Recursion is not an option in that case, which makes this a challenging problem. I would then change how the list is created, if possible. Alternatively, you could try to increase the C stack size.
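
To judge whether nesting depth really is the culprit, a rough diagnostic sketch follows. `list_depth_iter` is a hypothetical helper (not part of base R or purrr) that measures nesting depth level by level with a loop, so it does not itself rely on recursion. It counts an atomic vector as depth 0, which may differ from how other tools count. `Cstack_info()` is a base R function that reports the current C stack size and usage.

```r
# Hypothetical helper: nesting depth of a list, computed without recursion.
# An atomic vector has depth 0; list("a") has depth 1.
list_depth_iter <- function(x) {
  depth <- 0L
  level <- list(x)
  repeat {
    is_l <- vapply(level, is.list, logical(1))
    if (!any(is_l)) break
    depth <- depth + 1L
    # Replace the current level with the children of all lists on it.
    level <- unlist(level[is_l], recursive = FALSE, use.names = FALSE)
  }
  depth
}

# Build an artificially deep list to try the helper on.
deep <- "leaf"
for (i in seq_len(50)) deep <- list(deep)

list_depth_iter(deep)

# Inspect the C stack available to this R session.
Cstack_info()
```

On Unix-alikes the stack limit is usually set in the shell before R starts (for example with `ulimit -s`); whether and how far it can be raised depends on the system, so treat that as something to verify locally.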

Roland
  • Thanks. This does in essence what my own function attempts and fails with the same C stack problem in a real (deeper) list. – balin Dec 07 '18 at 10:17
  • See the last part of my answer. I think you might have an xy problem. – Roland Dec 07 '18 at 11:44

With the help of a ticket against the GitHub repository of purrr I solved this: with the current development version of purrr (installable via remotes::install_github('tidyverse/purrr')), the purrr-dependent code in the question works as expected and no longer "listifies" vectors. That code should thus serve as the answer to the question and become fully functional with CRAN-borne packages after the new year 2018/19.

balin