
Object-oriented programming is very much possible in R. However, unlike in Python, for example, there are many ways to achieve object orientation.

My question is:

What major differences distinguish these ways of OO programming in R?

Ideally, the answers here will serve as a reference for R programmers trying to decide which OO programming method best suits their needs.

As such, I am asking for detail, presented in an objective manner, based on experience, and backed with facts and references. Bonus points for clarifying how these methods map to standard OO practices.

Paul Hiemstra
  • Info on Reference Classes: http://stackoverflow.com/questions/5137199/what-is-the-significance-of-the-new-reference-classes – Ari B. Friedman Mar 01 '12 at 18:29
  • Thanks, could you repost the link as answer? It would be nice if you could include a small summary of what Reference classes are, and why they are preferable in relation to S3/S4 classes. – Paul Hiemstra Mar 01 '12 at 18:34
  • A little bird whispered into my ear that a book on this will be forthcoming by John Chambers. But don't tell anyone I said that ... ;-) – Dirk Eddelbuettel Mar 01 '12 at 18:41
  • Could that same little birdy paste an answer below with some more info on Reference classes ;) – Paul Hiemstra Mar 01 '12 at 18:43

3 Answers


S3 classes

  • Not really objects, more of a naming convention
  • Based around the generic.class naming syntax: e.g. print dispatches to print.lm, print.anova, etc., and falls back to print.default if no matching method is found
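
The naming convention described above can be sketched in a few lines of base R. The `money` class and its constructor are hypothetical names made up for this illustration; only `structure()`, `format()`, `print()` and `unclass()` are real base functions.

```r
# A hypothetical "money" class: an S3 object is just a value with a class attribute.
money <- function(amount, currency = "USD") {
  structure(list(amount = amount, currency = currency), class = "money")
}

# A method is an ordinary function named <generic>.<class>.
format.money <- function(x, ...) {
  paste(formatC(x$amount, format = "f", digits = 2), x$currency)
}
print.money <- function(x, ...) {
  cat("<money>", format(x), "\n")
  invisible(x)
}

m <- money(12.5)
print(m)            # dispatches to print.money
print(unclass(m))   # class attribute removed, so print.default handles the list
```

Nothing ties `print.money` to the object except its name, which is why S3 is often described as a convention rather than a class system.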

S4 classes

Reference classes

proto

  • ggplot2 was originally written in proto, but will eventually be rewritten using S3.
  • Neat concept (prototypes, not classes), but seems tricky in practice
  • Next version of ggplot2 seems to be moving away from it
  • Description of the concept and implementation

R6 classes

  • By-reference
  • Does not depend on S4 classes
  • "Creating an R6 class is similar to the reference class, except that there’s no need to separate the fields and methods, and you can’t specify the types of the fields."
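
As a rough sketch of the contrast drawn in the quote above, here is the same counter written first as a Reference Class and then as an R6 class. The class names `CounterRC` and `CounterR6` are made up for the example, and the R6 half assumes the R6 package is installed (it is skipped otherwise).

```r
library(methods)

# Reference class: fields (with declared types) and methods are listed separately.
CounterRC <- setRefClass("CounterRC",
  fields = list(count = "numeric"),
  methods = list(
    add = function(n = 1) {
      count <<- count + n
      invisible(.self)
    }
  )
)

a <- CounterRC$new(count = 0)
b <- a          # no copy is made: b refers to the same object
b$add(5)
a$count         # the change made through b is visible through a

# R6 version (assumes the R6 package is available): one public list, untyped fields.
if (requireNamespace("R6", quietly = TRUE)) {
  CounterR6 <- R6::R6Class("CounterR6",
    public = list(
      count = 0,
      add = function(n = 1) {
        self$count <- self$count + n
        invisible(self)
      }
    )
  )
  r <- CounterR6$new()
  r$add(2)$add(3)   # methods return the object, so calls can be chained
  r$count
}
```

Both objects are modified in place, which is what "by-reference" means here; ordinary R values, including S3 and S4 objects, are copied on modification.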
Ari B. Friedman
  • Feel free to edit if you have other differences to add in. I'm not going to cry if it becomes CW :-) – Ari B. Friedman Mar 01 '12 at 19:45
  • don't forget `library("fortunes"); fortune("strait")` – Ben Bolker Mar 01 '12 at 21:01
  • A discussion on S4 classes here: http://stackoverflow.com/questions/3602154/when-does-it-pay-off-to-use-s4-methods-in-r-programming. The general feeling seems to be that they are more trouble than they are worth. – Paul Hiemstra Mar 01 '12 at 22:48
  • Interestingly, the new R6 classes implicitly acknowledge that Reference Classes were R5 by avoiding using that number. Let the controversy begin (anew). – Ari B. Friedman Jul 25 '14 at 13:53
  • The name R5 was originally used as a joke by people other than the developers of Reference Classes. The name R6 is an acknowledgement of "R5", but it isn't meant to imply that the name R5 had any official endorsement. – wch Jul 30 '14 at 00:34
  • @wch I'm aware. Just stirring the pot ;-) – Ari B. Friedman Jul 30 '14 at 01:45

Edit on 3/8/12: The answer below responds to a piece of the originally posted question which has since been removed. I've copied it below, to provide context for my answer:

How do the different OO methods map to the more standard OO methods used in e.g. Java or Python?


My contribution relates to your second question, about how R's OO methods map to more standard OO methods. As I've thought about this in the past, I've returned again and again to two passages, one by Friedrich Leisch, and the other by John Chambers. Both do a good job of articulating why OO-like programming in R has a different flavor than in many other languages.

First, Friedrich Leisch, from "Creating R Packages: A Tutorial" (warning: PDF):

S is rare because it is both interactive and has a system for object-orientation. Designing classes clearly is programming, yet to make S useful as an interactive data analysis environment, it makes sense that it is a functional language. In "real" object-oriented programming (OOP) languages like C++ or Java class and method definitions are tightly bound together, methods are part of classes (and hence objects). We want incremental and interactive additions like user-defined methods for pre-defined classes. These additions can be made at any point in time, even on the fly at the command line prompt while we analyze a data set. S tries to make a compromise between object orientation and interactive use, and although compromises are never optimal with respect to all goals they try to reach, they often work surprisingly well in practice.
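
The "incremental and interactive additions" Leisch describes might look like the following at the prompt. This is only a sketch: `describe` is a hypothetical generic invented for the example, while `data.frame` stands in for a pre-defined class.

```r
# A new generic, defined interactively; `describe` is a made-up name.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) paste("an object of class", class(x)[1])

# Later, a method for a pre-defined class is added without touching that class.
describe.data.frame <- function(x, ...) {
  paste("a data frame with", nrow(x), "rows and", ncol(x), "columns")
}

describe(1:3)                                    # falls back to describe.default
describe(data.frame(a = 1:2, b = c("x", "y")))   # uses the new method
```

In a language where methods live inside the class definition, the second step would require editing or subclassing the data frame class itself.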

The other passage comes from John Chambers' superb book "Software for Data Analysis". (Link to quoted passage):

The OOP programming model differs from the S language in all but the first point, even though S and some other functional languages support classes and methods. Method definitions in an OOP system are local to the class; there is no requirement that the same name for a method means the same thing for an unrelated class. In contrast, method definitions in R do not reside in a class definition; conceptually, they are associated with the generic function. Class definitions enter in determining method selection, directly or through inheritance. Programmers used to the OOP model are sometimes frustrated or confused that their programming does not transfer to R directly, but it cannot. The functional use of methods is more complicated but also more attuned to having meaningful functions, and can't be reduced to the OOP version.
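
To make Chambers' point concrete, here is a minimal S4 sketch in which the methods are attached to a generic function rather than written inside the class definitions. The classes `Circle` and `Square` and the generic `area` are hypothetical names chosen for illustration.

```r
library(methods)

# Class definitions hold data only; no methods are written here.
setClass("Circle", representation(r = "numeric"))
setClass("Square", representation(side = "numeric"))

# The generic owns the methods; the argument's class selects which one runs.
setGeneric("area", function(shape) standardGeneric("area"))
setMethod("area", "Circle", function(shape) pi * shape@r^2)
setMethod("area", "Square", function(shape) shape@side^2)

shapes <- list(new("Circle", r = 1), new("Square", side = 2))
vapply(shapes, area, numeric(1))
```

Because `area` is a single function with one meaning across classes, it can be passed to `vapply` like any other function, which is the "functional use of methods" the passage refers to.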

Josh O'Brien

S3 and S4 seem to be the official (i.e. built-in) approaches to OO programming. I have begun using a combination of S3 with functions embedded in a constructor function. My goal was to have an object$method() type syntax with semi-private fields. I say semi-private because there is no way of really hiding them (as far as I know). Here is a simple example that doesn't actually do anything:

#' Constructor
EmailClass <- function(name, email) {
    nc = list(
        name = name,
        email = email,
        get = function(x) nc[[x]],
        set = function(x, value) nc[[x]] <<- value,
        props = list(),
        history = list(),
        getHistory = function() return(nc$history),
        getNumMessagesSent = function() return(length(nc$history))
    )
    #Add a few more methods
    nc$sendMail = function(to) {
        cat(paste("Sending mail to", to, 'from', nc$email))
        h <- nc$history
        h[[(length(h)+1)]] <- list(to=to, timestamp=Sys.time())
        assign('history', h, envir=nc)
    }
    nc$addProp = function(name, value) {
        p <- nc$props
        p[[name]] <- value
        assign('props', p, envir=nc)
    }
    nc <- list2env(nc)
    class(nc) <- "EmailClass"
    return(nc)
}

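#' S3 print method for objects of class EmailClass.
print.EmailClass <- function(x, ...) {
    # inherits() also works when an object carries more than one class
    if (!inherits(x, "EmailClass")) stop("x must be an EmailClass object")
    cat(paste(x$get("name"), "'s email address is ", x$get("email"), "\n", sep=''))
    invisible(x)
}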

And some test code:

    test <- EmailClass(name="Jason", "jason@bryer.org")
    test$addProp('hello', 'world')
    test$props
    test
    class(test)
    str(test)
    test$get("name")
    test$get("email")
    test$set("name", "Heather")
    test$get("name")
    test
    test$sendMail("jbryer@excelsior.edu")
    test$getHistory()
    test$sendMail("test@domain.edu")
    test$getNumMessagesSent()

    test2 <- EmailClass("Nobody", "dontemailme@nowhere.com")
    test2
    test2$props
    test2$getHistory()
    test2$sendMail('nobody@exclesior.edu')

I wrote a blog post about this approach (http://bryer.org/2012/object-oriented-programming-in-r). I would welcome comments, criticisms, and suggestions, as I am not convinced that this is the best approach. However, it has worked well for the problem I was trying to solve. Specifically, for the makeR package (http://jbryer.github.com/makeR) I did not want users to change data fields directly, because I needed to ensure that an XML file representing my object's state would stay in sync. This works as long as users adhere to the rules I outline in the documentation.
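
One possible way to push fields further out of reach, offered only as a sketch and not as the approach used in makeR, is to keep the state in the constructor's environment and hand back nothing but accessor functions. `Contact`, `on_change` and `changes` are hypothetical names; the callback stands in for whatever keeps an external file in sync.

```r
# Hypothetical constructor: state lives in the function's environment,
# and only the accessor functions are returned to the caller.
Contact <- function(name, email, on_change = function(field, value) NULL) {
  fields <- list(name = name, email = email)   # not an element of the object
  obj <- list(
    get = function(field) fields[[field]],
    set = function(field, value) {
      stopifnot(field %in% names(fields))
      fields[[field]] <<- value
      on_change(field, value)   # e.g. rewrite a file that mirrors the state
      invisible(NULL)
    }
  )
  class(obj) <- "Contact"
  obj
}

changes <- character(0)
p <- Contact("Jason", "jason@example.org",
             on_change = function(field, value) changes <<- c(changes, field))
p$set("name", "Heather")
p$get("name")
p$fields    # NULL: the data cannot be reached with `$`
```

This is still not true privacy, since `environment(p$get)$fields` exposes the data to anyone who looks for it, but every change made through the documented interface goes through `set` and therefore through the sync hook.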

jbryer