5

The 'sparcl' package uses the 'kmeans' function from the standard 'stats' package. I want to make it use my own implementation of kmeans++ instead.

One way to do this would be to edit the code in the sparcl package itself. I'd prefer to avoid this, both because it would be messy and because I'm not sure how I would then install that edited code in R.

Unfortunately, the superassignment operator "<<-" doesn't work:

> kmeans <<- function(x) print("hi!")
Error: cannot change value of locked binding for 'kmeans'

Neither does "assign":

assign("kmeans",function(x) {print("HI THERE!"); return(FALSE)},pos="package:sparcl")
Error in assign("is.null", function(x) { : 
  cannot add bindings to a locked environment

So is editing the package code the only way?

Thanks!

martin
  • Could you use `trace` for this? – Ari B. Friedman Oct 05 '12 at 14:52
  • What is the reason you want to replace the old one instead of just writing a new version and using that one instead? – Dason Oct 05 '12 at 14:53
  • Wouldn't it be easier to create your own version of the (exported) function that calls `kmeans` and alter those instances so that they call your own custom function? (Maybe that's what @Dason was saying as well...?) – joran Oct 05 '12 at 15:01
  • I vote for @joran's suggestion. And if you do want to package up the function, do so by creating your own mini-package (that perhaps imports from stats) rather than by editing the stats package's source code! – Josh O'Brien Oct 05 '12 at 15:56
  • @joran's suggestion isn't quite perfect, since `kmeans` is called multiple times in multiple functions in multiple files in the sparcl package, so I'd have to copy and edit all those as well... But that's how I was planning to do it initially. – martin Oct 05 '12 at 16:03

2 Answers

7

If you do want to edit a function's body (but not its arguments) during an interactive session, you can use trace(), like this:

trace("kmeans", edit=TRUE)

Then, in the editor that pops up, edit the body, so that it looks like this (for example):

function (x, centers, iter.max = 10, nstart = 1, algorithm = c("Hartigan-Wong", 
"Lloyd", "Forgy", "MacQueen")) 
{
    plot(rnorm(99), col = "red")
}

Save the edited function definition and then exit the editor.

Back at the R command line, you can view the edited function and try it out:

body(kmeans)  # To view the tracing code
kmeans()      # To use the edited function

Finally, to revert to the unedited function, just do untrace("kmeans"). (I generally prefer using trace() to assignInNamespace() and friends because untrace() makes it so easy to undo changes.)
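
If you'd rather not go through the editor, trace() can also insert an expression non-interactively. The following is only a sketch of that variant, not a replacement of the function body: it counts calls to kmeans() in the stats namespace and then undoes the change. The option name demo.kmeans.calls and the toy data are made up for the illustration.

```r
# Hypothetical option name, used only as a call counter for this demo.
options(demo.kmeans.calls = 0)

# Insert a tracer expression at the top of kmeans() in the stats namespace.
trace("kmeans",
      tracer = quote(options(demo.kmeans.calls =
                               getOption("demo.kmeans.calls") + 1)),
      where  = asNamespace("stats"),
      print  = FALSE)

# Toy data: two well-separated groups in one dimension.
x   <- matrix(c(1, 2, 3, 11, 12, 13), ncol = 1)
fit <- stats::kmeans(x, centers = 2)  # runs the tracer, then the original body

calls_seen <- getOption("demo.kmeans.calls")

# Restore the original function.
untrace("kmeans", where = asNamespace("stats"))
```

Because the tracer lives in the namespace copy, it is triggered whichever function ends up calling stats' kmeans(), and untrace() puts things back afterwards.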

Josh O'Brien
  • Thanks, I didn't know about trace(). Though I don't think it completely solves my problem because of the large number of calls the sparcl package makes to kmeans in different functions... – martin Oct 05 '12 at 16:07
  • @martas Maybe just grab the source code for the package and, with a good text editor, alter all instances of `kmeans` calls; that won't take too much time. Then rebuild the package locally. I think folks are just a bit leery of a solution that involves overwriting the stats package code itself, which you might regret later. – joran Oct 05 '12 at 16:12
  • `trace()` edits the function in `namespace:stats`, so it will edit the version of the `kmeans()` seen by any function that calls it. Isn't that what you want, or am I confused? (Not mutually exclusive possibilities, I know...) – Josh O'Brien Oct 05 '12 at 16:13
  • Mother of god, it (your edit) works! Thank you. I suppose this works because kmeans isn't part of a package the sparcl code explicitly imported, right? I'm still very ignorant about the way R handles namespaces... – martin Oct 05 '12 at 17:16
  • @martas -- Oops. You may want to move the 'Accept' over to my other answer. (I decided it was so much different than the previous answer that I should split it up. I would have overwritten the previous answer re: trace, but it looks like it may be useful on its own.) – Josh O'Brien Oct 05 '12 at 17:17
2

On further thought (and after re-reading your question), here's a simple solution that should work for you.

All you need to do is to assign your edited version of kmeans() to the symbol kmeans in the global environment. In other words, at the command line do this:

kmeans <- function(...) plot(rnorm(99), col="red") # but using your own edits

## Then run an example from ?KMeansSparseCluster to see that it works.
library(sparcl)
x <- matrix(rnorm(50*300),ncol=300)
x[1:25,1:50] <- x[1:25,1:50]+1
x <- scale(x, TRUE, TRUE)
KMeansSparseCluster.permute(x, K=2, wbounds=seq(3, 9, length.out=15), nperms=5)

This works because KMeansSparseCluster() (like any other function in package:sparcl) looks for kmeans first in namespace:sparcl, then in imports:sparcl, then in namespace:base, and then in .GlobalEnv, where it finds your redefined kmeans before it ever reaches the one in package:stats. To have a look yourself, try this:

parent.env(asNamespace("sparcl"))
parent.env(parent.env(asNamespace("sparcl")))
parent.env(parent.env(parent.env(asNamespace("sparcl"))))
## etc., also wrapping any of the environments above in calls to ls() 
## to see what's in 'em

Nicely, functions from the stats package that use kmeans() won't be disrupted by your version, because they will find kmeans in their own namespace, before the symbol-search ever gets to the global environment.
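
If you want to check the lookup order without installing sparcl, you can mimic the chain with plain environments. This is only an illustration under stated assumptions: fake_imports, fake_ns and pkg_fun are made-up stand-ins for imports:sparcl, namespace:sparcl and a sparcl function, and the replacement kmeans is a dummy.

```r
# Stand-ins (hypothetical names) for imports:sparcl and namespace:sparcl.
# The real chain is namespace -> imports -> base namespace -> global env.
fake_imports <- new.env(parent = .BaseNamespaceEnv)
fake_ns      <- new.env(parent = fake_imports)

# Stand-in for a package function that calls kmeans() without importing it.
pkg_fun <- function(x) kmeans(x, centers = 2)
environment(pkg_fun) <- fake_ns

# Toy data: two well-separated groups in one dimension.
x <- matrix(c(1, 2, 3, 11, 12, 13), ncol = 1)

# Dummy replacement, placed in the global environment.
# (This overwrites, and later removes, any kmeans you already have there.)
assign("kmeans", function(x, centers, ...) "my kmeans was called",
       envir = globalenv())
with_override <- pkg_fun(x)      # finds the global kmeans first

rm("kmeans", envir = globalenv())
without_override <- pkg_fun(x)   # falls through to kmeans in package:stats
```

With the dummy in place, with_override holds its return value; once it is removed, the same call reaches the stats version on the search path and returns an ordinary "kmeans" object.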

Josh O'Brien
  • @JoshOBrien: Isn't it possible that the package specifically imports `kmeans` from the stats package, so it would be in `imports:sparcl`? That would be the "Right" way to do it, yes? – Aaron left Stack Overflow Oct 05 '12 at 18:35
  • @Aaron -- I agree: it would be the right way to do it, but it's not what sparcl's author(s) did. I checked `imports:sparcl`, and it's empty. Your larger point is a good one, though. This solution wouldn't work if the package authors had put `import("stats")` in their NAMESPACE file. – Josh O'Brien Oct 05 '12 at 18:41
  • Thanks for the update, Josh. Nice thinking to check what was in `imports:sparcl`. – Aaron left Stack Overflow Oct 05 '12 at 19:34
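
The caveat raised in these comments can be sketched the same way. Assuming a package had imported stats, its imports environment would hold kmeans itself, and a global redefinition would never be reached. The names fake_imports, fake_ns and pkg_fun are hypothetical stand-ins, not anything from sparcl.

```r
# Hypothetical stand-in for an imports environment that already holds kmeans,
# as it would if the package's NAMESPACE file imported stats.
fake_imports <- new.env(parent = .BaseNamespaceEnv)
assign("kmeans", stats::kmeans, envir = fake_imports)
fake_ns <- new.env(parent = fake_imports)

# Stand-in for a package function that calls kmeans().
pkg_fun <- function(x) kmeans(x, centers = 2)
environment(pkg_fun) <- fake_ns

# Dummy global override (overwrites, then removes, any global kmeans).
assign("kmeans", function(...) "global override", envir = globalenv())
res <- pkg_fun(matrix(c(1, 2, 3, 11, 12, 13), ncol = 1))
rm("kmeans", envir = globalenv())

inherits(res, "kmeans")  # is the result from the imported kmeans?
```

Here the lookup stops at the imports stand-in, so the global override is ignored and res comes from stats' kmeans().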