I am building an approximating function recursively (AdaBoost). I would like to construct the resulting learned function along the way, that is, keep the function that produces the approximation rather than apply the approximation directly to my test data.

Unfortunately, it seems that R resolves the value a variable name refers to long after the name is used.

#defined in plyr as well
id <- function(x) {x}

#my first classifier 
modelprevious <- function(inputx, k) { k(0)}

#one step of my superb model
modelf <- function(x) 2*x #for instance

#I update my classifier
modelCurrent <- function(inputx, k) 
                 { modelprevious(inputx, function(res) {k(res + modelf(inputx))})}

#it works
modelCurrent(2,id) #4

#Problem
modelf <- function(x) 3*x
modelCurrent(2,id) #6 WTF !! 

The same function with the same arguments returns something different, which is quite annoying!

So how can I capture the value bound to modelf, so that the resulting function depends only on its arguments and on the bindings in effect when it was defined, and not on some global state?


Given that problem, I don't see how one can build a function recursively in R if one cannot reuse variable names, short of ugly hacks with quote/parse.

nicolas
  • I'm baffled as to why you're confused. `modelCurrent` (watch the typo there, btw) depends upon `modelf`. You changed the definition of `modelf`... – joran Aug 11 '13 at 03:42
  • I would like the value of modelf to be captured, and not be subject to spooky action at a distance. – nicolas Aug 11 '13 at 03:43
  • There is absolutely no spooky action at a distance happening here. You _explicitly_ altered the definition of `modelCurrent`. – joran Aug 11 '13 at 03:46
  • I have to use a name for my previous accumulated function. now if I want to update my previous classifier with the new classifier and recurse, I have to reuse that name, say "previousClassifier". How can I do recursion if I can not touch the names used previously ? – nicolas Aug 11 '13 at 03:48
  • when you reuse a variable name, say x, to store a number, do you expect all your *previous* variables that used x to be computed to change as well ? – nicolas Aug 11 '13 at 03:50
  • If those variables hold functions, then _yes_, I absolutely do. Or actually, this will happen due to lexical scoping even if you replaced it with a simple scalar. I frankly don't understand (a) what you're trying to do and (b) why this behavior is in any way strange. – joran Aug 11 '13 at 03:51
  • Maybe this will help: when you define the function `modelCurrent`, nothing is actually evaluated. Each symbol in the function definition is only evaluated when it's actually needed. So the first time `modelf(inputx)` is needed, it finds one definition, the second time, you've redefined it. – joran Aug 11 '13 at 03:57
  • A function or a number should be on equal footing as bindings are concerned. otherwise just doing the same operation, in a function would change the behaviour of your program. what a nightmare ! – nicolas Aug 11 '13 at 03:59
  • Yes, _and you explicitly changed the binding of `modelf`_! – joran Aug 11 '13 at 04:00
  • look. x <- 0, a <- x + 2, x <- 2, do you expect a to be == 2. that is why there are closures, etc.. – nicolas Aug 11 '13 at 04:04
  • That's completely different. As I said, everything inside a _function_ is evaluated _only_ when it's called. This behavior will _only_ arise inside functions. – joran Aug 11 '13 at 04:05
  • well allow me to be baffled as to why that is sensible that number and functions follow different rules. – nicolas Aug 11 '13 at 04:06
  • Look yourself: `x <- 1; f <- function() {print(x)}; f(); x <- 2; f()` – joran Aug 11 '13 at 04:06
  • This is called lazy evaluation (in conjunction with lexical scoping) and it is by no means a strange language feature. Maybe [this](http://cran.us.r-project.org/doc/manuals/r-release/R-intro.html#Scope) will help some. – joran Aug 11 '13 at 04:08
  • There are ways to freeze the definition of `modelf` that is used inside your `modelCurrent` function. One of them is to use a package; the namespace mechanism ensures that modelf from that package will have priority over user-overwrites for internal use. Similarly, you can place `modelf` in a specific environment, and make sure you look there first inside `modelCurrent`. That's the way it works; I can just as easily define `print <- invisible` and screw up my R session for most visible purposes, but that's a powerful feature. – baptiste Aug 11 '13 at 12:33
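
The environment approach from the last comment could look roughly like the sketch below. The name `frozen_env` is made up for illustration, and only `modelf` is frozen here; `modelprevious` is still looked up globally.

```r
id <- function(x) x
modelprevious <- function(inputx, k) k(0)
modelf <- function(x) 2 * x

# Copy the current definition of modelf into a private environment
# (frozen_env is a hypothetical name, not part of the original question).
frozen_env <- new.env()
assign("modelf", modelf, envir = frozen_env)

modelCurrent <- function(inputx, k) {
  # Look the function up in the private environment, not the global one.
  f <- get("modelf", envir = frozen_env)
  modelprevious(inputx, function(res) k(res + f(inputx)))
}

modelCurrent(2, id)            # 4
modelf <- function(x) 3 * x    # rebinding the global name...
modelCurrent(2, id)            # ...has no effect here: still 4
```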

1 Answer

You need a factory:

modelCurrent = function(mf){
  return(function(inputx,k){
    modelprevious(
      inputx,
      function(res){
        k(res+mf(inputx))
      } # function(res)
      ) # modelprevious
  } # inner function
         ) # return
} # top function

Now you use the factory to create models with the modelf function that you want it to use:

> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
> modelf <- function(x) 3*x
> m1(2,id) # no change.
[1] 4

You can always make them on an ad-hoc basis:

> modelCurrent(modelf)(2,id)
[1] 6

and there you can see the factory created a function using the current definition of modelf, so it multiplied by three.

There's one last ginormous WTF!?! that will hit you. Watch carefully:

> modelf <- function(x) 2*x
> m1 = modelCurrent(modelf)
> m1(2,id)
[1] 4
>
> m1 = modelCurrent(modelf) # create a function using the 2* modelf
> modelf <- function(x) 3*x # change modelf...
> m1(2,id) # WTF?!
[1] 6

This is because R evaluates function arguments lazily. When the factory is called, mf isn't evaluated, since nothing in the factory body itself uses it. It is only looked up the first time the inner function is called, and by then modelf has been redefined.

The trick is to force evaluation of the mf in the outer function, typically using force:

modelCurrent = function(mf){
  force(mf)
  return(function(inputx,k){
    modelprevious(
      inputx,
      function(res){
        k(res+mf(inputx))
      } # function(res)
      ) # modelprevious
  } # inner function
         ) # return
} # top function
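
As a quick check, here is a condensed, self-contained version of the forced factory together with the sequence that previously gave 6. The definitions of `id` and `modelprevious` are taken from the question.

```r
id <- function(x) x
modelprevious <- function(inputx, k) k(0)

modelCurrent <- function(mf) {
  force(mf)  # evaluate the argument now, while modelf is still 2*x
  function(inputx, k) {
    modelprevious(inputx, function(res) k(res + mf(inputx)))
  }
}

modelf <- function(x) 2 * x
m1 <- modelCurrent(modelf)     # mf is captured here
modelf <- function(x) 3 * x    # change modelf before m1 is ever called
m1(2, id)                      # 4, not 6
```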

This has led me to premature baldness, because if you forget this and think there's some odd bug going on, and then try sticking print(mf) in the factory to see what's happening, you'll be evaluating mf and thus getting the behaviour you wanted. By inspecting the data, you changed it! A Heisenbug!

Spacedman
  • Thank you for laying this down. I was heading there but unsure. The force, as it is implemented, is definitely wrong. Laziness as a feature is interesting, but that is a faulty implementation of it. – nicolas Aug 11 '13 at 18:30
  • Regarding the recursion, one last piece is missing here: the resulting function will be tied to the current definition of modelprevious, which will be troublesome as we recurse and change its binding. Therefore, an extra factory or a local environment is necessary to explicitly capture the current definition of modelprevious. – nicolas Aug 11 '13 at 18:33
  • One way to do this is apparently to wrap the returned function in a local environment: `return(local({ cmodelprevious <- modelprevious; function(inputx, k) { cmodelprevious(inputx, function(res) { k(res + mf(inputx)) }) } }))` – nicolas Aug 11 '13 at 18:34
  • so that the returned function will not look up the definition in the environment prevailing when you call it, but when it was defined. I think that is the big takeaway of all this : you have to define closure yourself and name binding changes the meaning of your program (no 'referential transparency') – nicolas Aug 11 '13 at 18:37
  • From what I see, we only use the function to create an environment. I tend to think that it might be better then to use `local`, which clearly expresses the intent, over a factory. – nicolas Aug 11 '13 at 18:39
  • Whatever. I'm curious as to what programming world you come at R from - clearly not LISP. I also suggest you may find this discussion easier on the R-dev mailing list where lots of R people will be interested to hear from you. – Spacedman Aug 11 '13 at 19:04
  • Thank you for your advice, but I just want to merely write out what will be a pitfall for others. – nicolas Aug 11 '13 at 19:55
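
For the recursion raised in these comments, one possible sketch is a factory that takes the previous model as an argument too and forces both. The helper name `add_step` and the list `steps` are invented for illustration; this is not code from the answer.

```r
id <- function(x) x

# Hypothetical helper: builds a new model from the previous model `prev`
# and one step function `mf`, capturing both at creation time.
add_step <- function(prev, mf) {
  force(prev)  # without this, `prev` would be looked up after the name is rebound
  force(mf)
  function(inputx, k) {
    prev(inputx, function(res) k(res + mf(inputx)))
  }
}

model <- function(inputx, k) k(0)   # initial classifier
steps <- list(function(x) 2 * x, function(x) 3 * x)

for (step in steps) {
  model <- add_step(model, step)    # reusing the name `model` is safe here
}

model(2, id)   # 0 + 2*2 + 3*2 = 10
```

Because each call to `add_step` forces its arguments before returning, the closure holds the previous model's value rather than the name `model`, so rebinding the name on the next iteration does not affect it.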