anova.cca cannot find object within user defined function

Question

I want to execute the same rda analysis sequence (fitting a model, testing significance of the model, the axis, and the term, plotting the data) on subsets of the same datasets. Therefore I wrote a function. The problem now is that the call to the anova.cca function does not work well within a function when I want to test the axis. It cannot find the Y.sub dataset

Error in eval(expr, envir, enclos) : object 'RV.sub' not found

Minimal working example:

library(vegan)
data(dune)
data(dune.env)

rda.subsetfunc <- function(RV, Y){
  #RV.sub <- subset(RV, !Y$Use%in%c("BF"))
  #Y.sub  <- subset(Y, !Y$Use%in%c("BF"))
  RV.sub <- RV; Y.sub <- Y
  rda.mod <- rda(RV.sub ~ Manure, Y.sub)
  axis.test <- anova(rda.mod,  by = "axis")

  return(list(rda.mod, axis.test))
}

rda.subsetfunc(RV = dune, Y = dune.env)

I found some other related questions like here but that seems be a lot more complicated than what I am doing. I tried to implement the do.call approach as is mentioned here but I couldn't get it to work. If it is really not possible to do this without really digging deep into functions I will find a way of programming around it. But to me, it feels like I'm trying to do something that makes total sense. So it is probably more likely that I am doing something wrong than that I am doing something impossible.

I made a silly mistake when typing in the original question. I originally put `rda.subsetfunc(RV = dune, Y = Y.sub)` which doesn't work for obvious reasons. Thanks to @chi-pak for making me realize — Nightingale, Jun 20 '17 at 14:16
That is not the problem because RV.sub is the X matrix and not a column of Y. This is the way the rda-function works. My function works fine if I omit the anova-function . That is the one that is causing the trouble. I really appreciate your help by the way! — Nightingale, Jun 20 '17 at 14:38
From the help-page of rda. `data(dune) data(dune.env) dune.Manure <- rda(dune ~ Manure, dune.env)` So RV.sub not being in Y.sub is correct. — Nightingale, Jun 20 '17 at 14:41
Yes, you're right, but I still think it's part of the problem. Look at `dune.Manure`. It shows this: `rda(formula = dune ~ Manure, data = dune.env)`. For some reason, this `dune.Manure` looks in the `parent` environment for variables...If you declare `RV.sub <- dune` in the `parent`, the function does not return an error. — CPak, Jun 20 '17 at 14:58
The reason I didn't want to declare `RV.sub <- dune` in the parent is that my code is a bit more elaborate than just taking a subset. But I'll just make two consecutive scripts then. One to create the subsets, and one that takes the subsets and does the rda. It really is a pity though. And again, I really appreciate you taking the time to help me. — Nightingale, Jun 20 '17 at 15:13
This a scoping issue in `anova.cca(..., by="axis")` which should find items from several different environments when it is updating the formula. Embedding the function really may fail. However, re-designed function in https://github.com/vegandevs/vegan seemed to work with this example. — Jari Oksanen, Jun 20 '17 at 19:08
Thanks @jari for the confirmation that this really is not currently possible. I will refrain from using the developer code at this stage. I intend to publish my code and data with my paper and feel that using this updated function would make things overly complicated. However, it is definitely solid advice for others who might stumble upon the same issue. If you post it as an answer I'll accept. — Nightingale, Jun 21 '17 at 10:30

score 2 · Accepted Answer · answered Jun 22 '17 at 05:43

This a scoping issue in anova.cca(..., by="axis") which should find items from several different environments when it is updating the formula (I won't go into technical details). You really cannot embed the function to analyse the significances of axes. This is a known issue. We have solved that in the development version of vegan. The re-designed function in https://github.com/vegandevs/vegan seemed to work with this example. All ordination and significance functions are radically changed there, and they are not yet completely finished. We plan to release them in vegan 2.5-0 in the last quarter of 2017, but they are not finished yet.

The problem is that anova.cca(..., by = "axis") must find items that it builds within the function, and in addition it can find items that were available when the original model was built, but it cannot find items that you generate in functions that embed the function. You must circumvent this by making your embedding function to write its objects somewhere that these can be found. The easiest (but dirty) solution is to write them into parent environment with <<-. The following version of your function adds this <<- and seems to work in vegan 2.4-3

rda.subsetfunc <- function(RV, Y){
  RV.sub <<- RV; Y.sub <- Y
  rda.mod <- rda(RV.sub ~ Manure, Y.sub)
  axis.test <- anova(rda.mod,  by = "axis")

  list(rda.mod, axis.test)
}

rda.subsetfunc(RV = dune, Y = dune.env)

Oh wow, that is a neat (yet slightly dangerous) trick! Thanks — Nightingale, Jun 22 '17 at 14:33

anova.cca cannot find object within user defined function

1 Answers1