
I am considering keeping the data (vectors, lists, etc.) and code (functions) for my problem in a tree structure (a list of lists of lists of...). I do not want to commit to a name for the root node, nor for the next level of nodes. The lists just below the root node are different versions of each other, and I want to be able to compare them in different ways, and build them in different ways, and give them different, arbitrary names. I am presently using the following to build the overall structure:

foo <- function(ref.txt, val.txt) eval(parse(text=paste0(ref.txt, ' <<- ', val.txt)))

A trivial example might be:

root = list()
foo('root$v1', '42')
foo('root$v2', '43')
root
# $v1
# [1] 42
#
# $v2
# [1] 43

A little less trivial, continuing from the previous example:

v3 <- c(42, 43)
foo('root$v3', 'v3')
root
# $v1
# [1] 42
#
# $v2
# [1] 43
#
# $v3
# [1] 42 43

Again, I can't hard code e.g., root$v3 <- v3, because I won't know the name of the root of the list or the names of the next-level nodes until run time.

I am asking for alternatives in part because of Joris Meys' comment on the Stack Overflow question "Why doesn't assign() values to a list element work in R?", which apparently quotes Lumley's post "Re: [R] RE: Using a number as a name to access a list." Both suggest avoiding parse. However, if I do not know the names until run time, and do not even know the depth of the path (see Lumley), how is avoiding parse possible?

Ana Nimbus
  • You just use `[[` instead of `$` and you don't need to parse anything. See `fortunes::fortune(312)` and `fortunes::fortune(343)`. I'd suggest this is essentially a duplicate of [Dynamically select columns using `$` and a vector of column names](https://stackoverflow.com/q/18222286/903061) - you just happen to be using a list instead of a data frame. – Gregor Thomas Nov 03 '17 at 15:59

1 Answer


How about an additional argument for your root list? No paste trickery, no eval trickery, and no need to use <<-, which you should usually avoid...

foo <- function(lst, ref, val) { lst[[ref]] <- val; return(lst) }

root <- list()
root <- foo(root, "v1", 42)
root <- foo(root, "v2", 43)
root

v3 <- c(42, 43)
root <- foo(root, "v3", v3)
root
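
If the name of the root itself is only known at run time, one possible approach is to keep every version inside a single container list and index it with `[[` as well. The following is a minimal sketch under that assumption; `store`, `root_name` and `node_name` are hypothetical names that do not appear in the question.

```r
# Hypothetical container holding all versions of the tree, keyed by name
store <- list()

# These strings stand in for values that would arrive at run time
root_name <- "root"
node_name <- "v3"

# Create an empty version, then fill it without parse(), eval() or `<<-`
store[[root_name]] <- list()
store[[root_name]][["v1"]] <- 42
store[[root_name]][["v2"]] <- 43

v3 <- c(42, 43)
store[[root_name]][[node_name]] <- v3

str(store[[root_name]])
```

Because each version is an ordinary element of `store`, two versions can be compared with `identical(store[[name1]], store[[name2]])`, and a whole version can be dropped by assigning `NULL` to it.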

Edit based on the comments: Here is a function that assigns values to arbitrary entries of nested lists. The ref argument should be a vector of indices, one for each level:

foo <- function(lst, ref, val) {

  lvl <- length(ref)
  stopifnot(lvl >= 1)  # an empty path would leave res undefined

  # Work from the deepest level back up to the root:
  # pass i extracts the list at depth lvl - i from lst,
  # stores val in it under ref[lvl - i + 1], and then
  # uses the modified list as val for the next pass

  for (i in seq_len(lvl)) {

    res <- lst
    for (j in seq_len(lvl - i)) res <- res[[ref[j]]]
    res[[ref[lvl - i + 1]]] <- val
    val <- res

  }

  return(res)

}

(root <- list(a = list(a = 1, b = list(a = 1, b = 2)),
              b = list(a = 1), c = 3))

## $a
## $a$a
## [1] 1
## 
## $a$b
## $a$b$a
## [1] 1
## 
## $a$b$b
## [1] 2
## 
## 
## 
## $b
## $b$a
## [1] 1
## 
## 
## $c
## [1] 3

foo(lst = root, ref = c("a", "b", "c"), val = 3)

## $a
## $a$a
## [1] 1
## 
## $a$b
## $a$b$a
## [1] 1
## 
## $a$b$b
## [1] 2
## 
## $a$b$c
## [1] 3
## 
## 
## 
## $b
## $b$a
## [1] 1
## 
## 
## $c
## [1] 3
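
As a side note, base R's `[[` can also take a vector index and apply it recursively to lists (this is described in `?Extract`), so a path of unknown depth can be passed as a single character vector. The sketch below assumes that every level except the last already exists as a list. The names `ref`, `updated` and `expected` are only for illustration.

```r
root <- list(a = list(a = 1, b = list(a = 1, b = 2)),
             b = list(a = 1), c = 3)

# The path, as it might be assembled at run time
ref <- c("a", "b", "c")

# Recursive indexing: one `[[` call with the whole path
updated <- root
updated[[ref]] <- 3

# The same assignment written out with one `[[` per level
expected <- root
expected[["a"]][["b"]][["c"]] <- 3

identical(updated, expected)
```

The copy `updated` is modified while `root` stays as it was, which matches the behaviour of the function above.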

And finally, here is a benchmark that compares my function to parse + eval. With three levels of nesting, my function is roughly 2.5 times faster (median of about 51 versus 130 microseconds), but that may change with a different list structure:

bar <- function(lst, ref, val) {

  eval(parse(text = paste(paste(c("lst", ref), collapse = "$"), "<- val")))
  return(lst)

}

library(microbenchmark)
microbenchmark(foo(lst = root, ref = c("a", "b", "c"), val = 3),
               bar(lst = root, ref = c("a", "b", "c"), val = 3))

## Unit: microseconds
##                                              expr     min       lq
##  foo(lst = root, ref = c("a", "b", "c"), val = 3)  47.089  48.6700
##  bar(lst = root, ref = c("a", "b", "c"), val = 3) 127.401 128.9505
##       mean  median       uq     max neval
##   55.98703  50.795  53.0640 191.575   100
##  134.71502 130.325 132.1755 291.400   100
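
The loop-based `foo` starts again from the root on every pass, so the number of `[[` lookups grows quadratically with the depth of the path. A recursive variant that walks the path only once might look like the sketch below. `set_in` is a hypothetical name, and the choice to create missing intermediate levels as empty lists is an assumption of this sketch, not something the original function does. It has not been benchmarked here.

```r
# Hypothetical single-pass alternative to foo()
set_in <- function(lst, ref, val) {
  # End of the path: the value itself is what gets stored
  if (length(ref) == 0L) return(val)

  # Assumption: a missing intermediate level becomes an empty list
  if (is.null(lst)) lst <- list()

  key <- ref[[1L]]
  # Rebuild only the branch along the path, one level per call
  lst[[key]] <- set_in(lst[[key]], ref[-1L], val)
  lst
}

root <- list(a = list(a = 1, b = list(a = 1, b = 2)),
             b = list(a = 1), c = 3)

str(set_in(root, c("a", "b", "c"), 3))
str(set_in(list(), c("x", "y"), 1))
```

Passing `val = NULL` removes the element at the end of the path, as it does with `[[<-` on a list. Paths that run through an atomic vector rather than a list are not handled.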
bbrot
  • The spirit of this is exactly right--no need for `<<-`, `eval`, etc. I do question whether a function is needed at all - your function is essentially a duplicate of `[[<-`. Is `foo(root, "v1", 42)` better than `root[["v1"]] <- 42`? I suppose that's up to OP. – Gregor Thomas Nov 03 '17 at 16:03
  • You are right, I just wanted to keep the spirit of the original code assuming that it is a minimal example and the function actually does more than putting values into a list. – bbrot Nov 03 '17 at 16:18
  • Re: @Gregor . Suppose `root <- list(A=list(B=list(C=list())))`. I would use `root$A$B$C <- bar` to replace the contents three levels below `root`. I could also use `root[['A']][['B']][['C']] <- bar`. Except that I don't know the depth to which I will need to assign until run time, so I don't even know how many `[[` to use, let alone which characters to place between the `[[` and `]]`. Is there another notation, like `[['A','B','C']]` that I don't know about? – Ana Nimbus Nov 03 '17 at 17:43
  • Re: @Gregor . Where may I find documentation of `[[<-`? I have already looked in the language definition (no mention) and in the help (not helpful). Stack Overflow search `"[[<-" [r]` produces no results. I will start a new question for that one. – Ana Nimbus Nov 03 '17 at 17:52
  • ``?`[[<-` `` in R will do. – bbrot Nov 03 '17 at 17:57
  • Okay, but that doesn't come across in your question. Nor does the `foo` example in your question address that. Maybe you should add an example with nesting to your question? Since you say you wouldn't know the depth or the names, (and presumably there would be multiple branches?) I'm curious what you *do* know at run-time. What, exactly would the input be? A list of lists *is not a tree*, and will not work out-of-the-box as such. Searching SO for "double brackets [r]" is a good start, or in R `?"[["` or `help("[[")` also works. – Gregor Thomas Nov 03 '17 at 17:57
  • No, @bbrot, that gives the same result as `help('[[<-')`, which is not helpful. – Ana Nimbus Nov 03 '17 at 18:03
  • Basically I'm asking, in your example of `root$A$B$C <- bar`, what inputs do you have so that you know which 3rd level thing to replace? How do you know it's not `root$X$Y$C <- bar` that you want? Or, if you don't know the depth, maybe it's `root$h$i$j$k$C <- bar`? I assume you have branching, otherwise you wouldn't need nesting at all... Or similarly, if there is only one `C`, why bother nesting? – Gregor Thomas Nov 03 '17 at 18:10
  • That is exactly the point, @Gregor. I do not know the inputs at the time of writing the source that needs to handle the manipulation. Yes, there is branching. Also, re your earlier comment, " A list of lists _is not a tree_" Knuth defines tree as follows: "...a finite set _T_ of one or more nodes such that... there is one... node called the _root_ of the tree... remaining nodes... are partitioned into [zero or more] disjoint sets... and each of these sets is in turn a tree" (TAOCP 1997, Sect. 2.3). FAPP, in my case, I can make a particular tree out of a list of lists. – Ana Nimbus Nov 03 '17 at 18:22
  • Re: @Gregor "why bother nesting." One reason is `rm(entire.tree.structure)`. Another is `pair <- list(entire.tree.structure, entire.tree.structure)`; then modify one of the elements of `pair`; then compare its two elements. – Ana Nimbus Nov 03 '17 at 18:26
  • By "why bother nesting" I don't mean "why put things in a list" - [I'm a huge advocate of using lists](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207), I mean why **nest**, why have `root$A$B$C` instead of `root$A`, `root$B`, `root$C`. Removing an entire branch at once would be a good reason to nest, but removing the entire list at once is not - you can do that whether there is nesting or whether there is just a single list. – Gregor Thomas Nov 03 '17 at 18:35
  • "I do not know the inputs at the time of writing the source that needs to handle the manipulation" - I am asking how you expect to be able to interact with this function. The function cannot read your mind, so how do you want to tell it what to do? Let's make a simple but nontrivial nested example: `root <- list(A1 = list(B = list(C = list())), A2 = list(B1 = list(), B2 = list(C = list())))`, what might you want to do to this list and how might you want to specify it *at the time of execution*? Adding top-level elements by name is trivial, that was the example in your Q and bbrot's A... what else? – Gregor Thomas Nov 03 '17 at 18:50
  • Re: @Gregor "double brackets [r]" Good suggestion, but Stack Overflow search `title:"double brackets" [r]` produced no useful results. Going through the titles of the results (146 of them) of Stack Overflow search `"double brackets" [r]` produced no useful results. But, perhaps I am misunderstanding the syntax. To clarify, would `alist[['anelementname']] <- 1` be an example implementation of the `[[<-` operator? – Ana Nimbus Nov 03 '17 at 18:50
  • Exactly. You can use a string, or a variable storing a string with double brackets. `root = list(); x = "hi"; y = 1:5`: the following are equivalent: `root$hi = y; root[["hi"]] = y; root[[x]] = y`. – Gregor Thomas Nov 03 '17 at 18:53
  • Back to the trees vs lists - you could definitely implement a tree structure with nested lists, but *you would have to implement it*. Lists are ordered, trees are not, and common tree methods like depth, rebalancing, finding parents, pruning leaves, are not provided with R's list methods. That's what I mean by "trees are not lists". – Gregor Thomas Nov 03 '17 at 18:56
  • And similarly you can use `[[` with nesting to access list elements. `root = list(A = list(B = list(C = list())))`, `root$A$B$C = 1` is equivalent to `root[['A']][['B']][['C']] = 1`, or `avar = 'A'; bvar = 'B'; cvar = 'C'; root[[avar]][[bvar]][[cvar]] = 1`. `[[` also works with index numbers. And `names(root[[avar]][[bvar]])` will return `'C'`. – Gregor Thomas Nov 03 '17 at 19:06
  • Re: edit of @bbrot : That code seems like it will do what I am asking. Some comments would help. Also, your code looks like it's _n_-squared slow, where _n_ is the depth to the element of the modified copy. Is that what it takes to avoid `parse`? – Ana Nimbus Nov 04 '17 at 00:09
  • @AnaNimbus, I don't know. The loops don't do much, they mainly assign variables, so I imagine my code may even be faster than `parse` + `eval` if the nesting is not too deep, but I didn't investigate. There might be other/better ways to implement my function, it was just the first thing that came to my mind. Maybe have a look at the rlist package, https://cran.r-project.org/package=rlist – bbrot Nov 04 '17 at 00:28