
In my package, I define the %+% operator as a shortcut for string concatenation. Since it may already be defined by previously loaded packages, I want to run my custom code only when both arguments are suitable (e.g. character), and otherwise call the implementation from the previously loaded package. Here is my solution:

# helper function: walk up from inpEnv until an environment named lookFor is found
getEnvByName <- function(inpEnv = .GlobalEnv, lookFor) {
  e <- inpEnv
  # stop at the empty environment, which has no parent
  while (environmentName(e) != 'R_EmptyEnv' && environmentName(e) != lookFor) e <- parent.env(e)
  if (environmentName(e) != lookFor) return(NULL)
  return(e)
}

"%+%" <- function(arg1, arg2) {
  if (is.character(arg1) && is.character(arg2)) {
    paste0(arg1, arg2)
  } else {
    # continue the lookup in the environments attached before this package
    e <- parent.env(getEnvByName(.GlobalEnv, 'package:mypackagename'))
    if (exists('%+%', envir = e)) get('%+%', envir = e)(arg1, arg2)
  }
}

My questions are:
1) Is this a good way to handle such situations?
2) Why is it not common practice to do similar things in other packages? For example, in the ggplot2 package, the %+% operator is defined as follows:

"%+%" <- function (e1, e2) 
{
    e2name <- deparse(substitute(e2))
    if (is.theme(e1)) add_theme(e1, e2, e2name)
    else if (is.ggplot(e1)) add_ggplot(e1, e2, e2name)
}

As you can see, their code breaks any previously defined %+% for all arguments, whereas they could override it only for theme or ggplot arguments and leave all other cases alone. I could suggest that the authors implement this kind of check, but I assume there is some reason they don't do it...

UPD. Just a small modification of my code: instead of defining everything in one function, I split it up with UseMethod(). I'm wondering if it makes any difference:

`%+%` <- function(...) UseMethod("%+%")
`%+%.character` <- paste0
`%+%.default` <- function (arg1, arg2){
  e <- parent.env(getEnvByName(.GlobalEnv,'package:mypackagename'));
  get('%+%',envir = e)(arg1,arg2);
}
Vasily A
  • i think the standard approach is to define a generic function, and your specific method. Since you want dispatch to happen for more than one argument, that would probably require S4. – baptiste Jan 22 '17 at 02:00
  • not sure I understand it correctly :/ could you explain a bit more? Thanks!!! – Vasily A Jan 22 '17 at 02:08
  • to be honest I'm not qualified to answer, that was just a pointer in the direction of S4 classes. – baptiste Jan 22 '17 at 02:17
  • ok, thanks for your input! I will check that. – Vasily A Jan 22 '17 at 02:18
  • Why not just make it `%s+%` or import that from `stringi`? Also, this may be a dup http://stackoverflow.com/questions/4730551/making-a-string-concatenation-operator-in-r – hrbrmstr Jan 22 '17 at 03:36
  • well, for me it's not really a dup - I have read that question. The solution of redefining `+` does work, but I wanted to avoid modifying an operator that is so widely used. `%s+%` etc. works as well, but for me it's rather a workaround, while my question is more generally about the case when we really want to keep the same name... – Vasily A Jan 22 '17 at 04:13
  • I'm guessing the answer lies somewhere around `setGenericImplicit`, `setOldClass`, and some other tricks to make a S3-S4 bridge, but there seem to be few people using S4 on these forums. You may have better luck on the R-help list. – baptiste Jan 22 '17 at 20:06
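
The S4 direction mentioned in the comments above can be sketched roughly as follows. This is only an illustration, assuming a fresh session in which no other %+% is attached; the argument names `e1` and `e2` are arbitrary choices, not taken from any package.

```r
library(methods)

# S4 generic that dispatches on both arguments
# (assumption: no other %+% is currently defined in this session)
setGeneric("%+%", function(e1, e2) standardGeneric("%+%"))

# Method selected only when BOTH arguments are character
setMethod("%+%", signature(e1 = "character", e2 = "character"),
          function(e1, e2) paste0(e1, e2))

both_chr <- "a" %+% "b"

# No method matches (character, numeric), so the call signals an error
mixed <- tryCatch("a" %+% 1, error = function(e) "no applicable method")
```

Unlike S3, the method is chosen from the classes of both operands, so a character on the left with a number on the right does not reach the concatenation code.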

1 Answer


First of all, I don't think it is good practice to reimplement functions that already exist in a widely used package (I am referring to the previously mentioned %s+% from stringi).

As for your question, I think the best way is this:

'%+%' <- function(arg1, arg2){
  if (is.character(arg1) && is.character(arg2)) {
    paste0(arg1, arg2)
  } else {
    old.func <- get('%+%',
                    envir = parent.env(.GlobalEnv),
                    inherits = TRUE)
    old.func(arg1, arg2)
  } 
}
  1. With the option inherits = TRUE (which is the default, by the way), get performs the same search through enclosing environments as the loop in your code;
  2. The version with UseMethod will work differently, because in that case %+% dispatches only on the class of the first argument, so only the first argument is checked for being "character", not both;
  3. As for ggplot2's %+%, I think it was intended to return NULL when the arguments are of an unsuitable type. It might possibly be a flaw in the code.
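
Point 2 can be illustrated with a small sketch. The operator name `%cat%` is hypothetical and chosen only so the example does not collide with any %+% that happens to be loaded.

```r
# Hypothetical S3 generic; UseMethod() dispatches on the first argument (e1) only
`%cat%` <- function(e1, e2) UseMethod("%cat%")

# Chosen whenever e1 is character, whatever e2 is
`%cat%.character` <- function(e1, e2) paste0(e1, e2)

# Chosen whenever e1 has no more specific method
`%cat%.default` <- function(e1, e2) stop("no method for these arguments")

chr_chr <- "a" %cat% "b"   # character method
chr_num <- "a" %cat% 1     # still the character method: e2 is not inspected

# Numeric first argument falls through to the default method
num_chr <- tryCatch(1 %cat% "a", error = function(e) "default method reached")
```

So with S3 the character method would also handle a call such as `"a" %cat% 1`, whereas the single-function version with two `is.character()` checks would pass that call on to the other implementation.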
echasnovski
  • 1. You're right that I don't need to loop through all environments (it's actually piece of a code I used to locate the specific environment where the function was defined first), I will probably remove this part from my question. However, your code is not totally correct either - it finds the function itself and goes to infinite recursion. Instead of `parent.env(.GlobalEnv)`, it should be `parent.env(parent.env(.GlobalEnv))`. – Vasily A Jan 22 '17 at 21:11
  • 2. Checking the `character` type is kinda obvious, I meant if there's any other difference. Imagine if my first variant I only check the first argument - what will be the difference then. 3. Not sure if returning `NULL` is intended by purpose, but anyway - that's what breaks the functioning of previously loaded functions, and it's so easy to avoid, so my main question is - what is the reason of not doing that. – Vasily A Jan 22 '17 at 21:13
  • I checked on two machines (on Windows 7 and Ubuntu 16.04) and my code with only one `parent.env` returns function `%+%` from `ggplot2` (if this package loaded, of course). Strange... – echasnovski Jan 23 '17 at 14:34
  • maybe it's because you define it not from the package – Vasily A Jan 23 '17 at 16:51
  • Yes, I think that is the reason. But then the method with two `parent.env` also not very good because this package can be not the last loaded. There is a need for using logic that omits current package in consecutive search up through environments. – echasnovski Jan 23 '17 at 18:17
  • ouch, you're right, thanks for pointing out! I have added the code to find the environment of my package (not sure if it's a good solution but I couldn't find anything like `getEnvByName()` functionality). – Vasily A Jan 23 '17 at 22:33
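
The "search that omits the current package" idea from the comments above could be sketched by walking the search path directly instead of chaining `parent.env()` calls. The helper name `find_other_op` and its `skip` argument are hypothetical, and 'package:mypackagename' is the placeholder name used in the question.

```r
# Hypothetical helper: return the first function called `name` found on the
# search path, ignoring every entry listed in `skip`; NULL if none is found
find_other_op <- function(name, skip = character()) {
  for (pos in search()) {
    if (pos %in% skip) next
    env <- as.environment(pos)
    # inherits = FALSE: look only in this one attached environment
    if (exists(name, envir = env, inherits = FALSE, mode = "function")) {
      return(get(name, envir = env, inherits = FALSE, mode = "function"))
    }
  }
  NULL
}

`%+%` <- function(arg1, arg2) {
  if (is.character(arg1) && is.character(arg2)) return(paste0(arg1, arg2))
  # Skip the global environment and the (placeholder) package itself
  other <- find_other_op("%+%", skip = c(".GlobalEnv", "package:mypackagename"))
  if (is.null(other)) stop("no other %+% implementation found")
  other(arg1, arg2)
}
```

Because the lookup does not depend on where the package sits on the search path, it should not matter whether the package was the last one attached. Note that this skips the package in both directions, so it also finds a %+% from a package attached later, which may or may not be what is wanted.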