47

I was wondering how one might go about writing a string concatenation operator in R, something like || in SAS, + in Java/C# or & in Visual Basic.

The easiest way would be to create a special operator using %, like

`%+%` <- function(a, b) paste(a, b, sep="")

but this leads to lots of ugly %'s in the code.
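For reference, a minimal sketch of how the operator above behaves. The variable names `greeting`, `labels` and `chained` are only illustrative and not part of the question.

```r
# Same definition as above: a user-defined infix operator wrapping paste().
`%+%` <- function(a, b) paste(a, b, sep = "")

greeting <- "Hello, " %+% "world"   # plain two-string concatenation
labels   <- c("x", "y") %+% 1:2     # vectorised; numbers are coerced to character
chained  <- "a" %+% "b" %+% "c"     # %op% operators associate left to right

print(greeting)
print(labels)
print(chained)
```

Because it is built on `paste`, the operator inherits its vectorisation and recycling rules.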

I noticed that + is defined in the Ops group, and you can write S4 methods for that group, so perhaps something like that would be the way to go. However, I have no experience with S4 language features at all. How would I modify the above function to use S4?

Hong Ooi

5 Answers

47

As others have mentioned, you cannot override the sealed S4 method "+". Defining a new class just to get an addition function for strings is not ideal either, since it forces you to convert your strings to that class, which leads to more ugly code. You do not need a new class, though: you can simply overwrite the "+" function:

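"+" = function(x, y) {
    # Unary plus (e.g. +5) has no second argument; defer to the primitive.
    if (missing(y)) return(.Primitive("+")(x))
    if (is.character(x) || is.character(y)) {
        # If either operand is a string, concatenate (the other is coerced).
        return(paste(x, y, sep = ""))
    } else {
        # Otherwise fall back to ordinary addition.
        .Primitive("+")(x, y)
    }
}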

Then the following should all work as expected:

1 + 4
1:10 + 4 
"Help" + "Me"

This solution feels a bit like a hack, since you are no longer using formal methods, but it's the only way to get the exact behavior you wanted.
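If you want to try the override without masking `+` for the whole session, one option is to bind it inside `local()`. This is only a sketch: the helper name `concat_plus` and the result list `res` are made up for illustration, and the `missing(y)` guard is added here so that unary plus keeps working.

```r
# Hypothetical helper with the same logic as the override above.
concat_plus <- function(x, y) {
  if (missing(y)) return(.Primitive("+")(x))   # unary plus, e.g. +2
  if (is.character(x) || is.character(y)) {
    paste(x, y, sep = "")                      # concatenate if either side is a string
  } else {
    .Primitive("+")(x, y)                      # ordinary addition otherwise
  }
}

res <- local({
  `+` <- concat_plus   # `+` is masked only inside this block
  list(
    num   = 1:3 + 4,
    str   = "Help" + "Me",
    mixed = "n = " + 5,   # the number is coerced to character
    unary = +2
  )
})

print(res)
```

Outside the `local()` block, `+` is still the primitive, so other code is unaffected.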

Joshua Ulrich
statsmaths
  • 3
    I'm not that familiar with S3/S4 - what exactly is hacky about this? Seems to work pretty well. – eddi Sep 27 '13 at 19:55
  • 1
    This is an old post but I have a feeling a lot of people still look at it. I would make one suggestion to make the use of this function more constrained. Change the logical operator in the if statement to &. I can't think of a good reason why you'd want to concat two objects if you're only sure one of them is a string. The resulting error message would not be as intuitive as compared to the error message you'd get from the primitive +. – Josh Bradley Jun 16 '15 at 21:01
  • 1
    @JoshBradley: I reverted your edit because there was no error if different object types were concatenated. The non-character argument is promoted to character, which is consistent with many other R functions. R is not a strongly typed language and `.Primitive("+")` does allow mixed types (e.g. `TRUE+1L`, `1L+1.0`). I suggest you add your own answer, rather than changing the accepted answer. – Joshua Ulrich Jun 17 '15 at 00:42
  • 1
    This answer breaks several other packages that define their own methods for the plus operator, most prominently ggplot2. I'd advise against using it. – OganM Oct 06 '15 at 00:32
  • 6
    It does not break ggplot2 for me. ggplot2 defines its "+" as an S3 method for class "gg". See `?ggplot2::`%+%``. – CoderGuy123 Jan 21 '16 at 12:21
  • 1
    Could this hack slow down other code that uses the "+" operator? – Maksim Gayduk May 31 '16 at 10:33
  • I retract my previous statement about ggplot2, but it does slow down execution by about 6 percent. Try `microbenchmark({1:10000 + 1:10000})` with the primitive `+` and with this `+`. So I was reluctant to add it to my default pipeline. – OganM Dec 14 '16 at 23:23
  • Great! You can even write `"foo"+"1"+0` with 0 as numeric. – Dan Chaltiel Aug 28 '17 at 10:31
29

I'd try this (a relatively cleaner S3 solution):

`+` <- function (e1, e2) UseMethod("+")
`+.default` <- function (e1, e2) .Primitive("+")(e1, e2)
`+.character` <- function(e1, e2) 
    if(length(e1) == length(e2)) {
           paste(e1, e2, sep = '')
    } else stop('String Vectors of Different Lengths')

The code above changes + into a generic, sets +.default to the original +, and then adds a new method, +.character, to +.
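A minimal sketch of how this dispatches, wrapped in `local()` so the global `+` is left alone. The result list `res` and its element names are only illustrative. Note that `UseMethod` dispatches on the first argument, so a number on the left goes to `+.default`.

```r
res <- local({
  # Same three definitions as above, bound only inside this block.
  `+` <- function(e1, e2) UseMethod("+")
  `+.default` <- function(e1, e2) .Primitive("+")(e1, e2)
  `+.character` <- function(e1, e2) {
    if (length(e1) == length(e2)) {
      paste(e1, e2, sep = "")
    } else stop("String Vectors of Different Lengths")
  }

  list(
    strings  = "foo" + "bar",   # dispatches to +.character
    numbers  = 1:3 + 1,         # dispatches to +.default
    # Different lengths: +.character raises its own error.
    mismatch = tryCatch(c("a", "b") + "c",
                        error = function(e) conditionMessage(e)),
    # Number first: +.default is used, so the primitive's error appears.
    num_first = tryCatch(1 + "a",
                         error = function(e) conditionMessage(e))
  )
})

print(res)
```

Unlike the `paste`-based override, this version does not recycle: both sides must have the same length.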

Pierre L
Lytze
25

You can also use S3 classes for this:

String <- function(x) {
  class(x) <- c("String", class(x))
  x
}

"+.String" <- function(x,...) {
  x <- paste(x, paste(..., sep="", collapse=""), sep="", collapse="")
  String(x)
}


print.String <- function(x, ...) cat(x)

x <- "The quick brown "
y <- "fox jumped over "
z <- "the lazy dog"

String(x) + y + z
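
A short sketch of two properties of this approach, assuming the same `String` constructor and `+.String` method as above (the names `left`, `right` and `chained` are illustrative). Arithmetic operators consider the class of both operands, so the `String` can sit on either side, and the result is again a `String`, which is why chaining works.

```r
# Constructor and method as in the answer above.
String <- function(x) {
  class(x) <- c("String", class(x))
  x
}
"+.String" <- function(x, ...) {
  x <- paste(x, paste(..., sep = "", collapse = ""), sep = "", collapse = "")
  String(x)
}

left    <- String("fox ") + "jumped"    # String on the left
right   <- "fox " + String("jumped")    # String on the right also dispatches
chained <- left + " over"               # the result is still a String

print(unclass(left))
print(unclass(right))
print(class(chained))
```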
jverzani
13

If R complied thoroughly with S4, the following would have been enough:

setMethod("+",
          signature(e1 = "character", e2 = "character"),
          function (e1, e2) {
              paste(e1, e2, sep = "")
      })

But this gives an error saying that the method is sealed :((. Hopefully this will change in future versions of R.
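
To check this on your own installation without aborting a script, the call can be wrapped in `tryCatch`. This is only a sketch: the variable name `sealed_msg` is illustrative, and the exact wording of the message may differ between R versions and locales.

```r
library(methods)  # normally attached already; harmless to repeat

# Attempt the definition and capture the error message instead of stopping.
sealed_msg <- tryCatch({
  setMethod("+",
            signature(e1 = "character", e2 = "character"),
            function(e1, e2) paste(e1, e2, sep = ""))
  NA_character_   # reached only if R accepted the definition
}, error = function(e) conditionMessage(e))

print(sealed_msg)
```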

The best you can do is to define a new class "string" which behaves exactly like the "character" class:

setClass("string", contains="character")
string <- function(obj) new("string", as.character(obj))

and define the most general method which R allows:

setMethod("+", signature(e1 = "character", e2 = "ANY"),
          function (e1, e2) string(paste(e1, as.character(e2), sep = "")))

Now try:

tt <- string(44444)

tt
#An object of class "string"
#[1] "44444"
tt + 3434
#[1] "444443434"
"sfds" + tt
#[1] "sfds44444"
tt +  tt
#[1] "4444444444"
343 + tt
#Error in 343 + tt : non-numeric argument to binary operator
"sdfs" + tt + "dfsd"
#An object of class "string"
#[1] "sdfs44444dfsd"
VitoshKa
9

You have given yourself the correct answer -- everything in R is a function, and you cannot define new operators. So %+% is as good as it gets.

Dirk Eddelbuettel
  • 3
    But you can redefine the behavior of existing operators. Not in this case though, because the "+" method is sealed for signature c("character", "character"). – VitoshKa Jan 19 '11 at 09:26