
I program regularly in R in a professional context, and I write packages for clients and co-workers as well. Some of the programmers here have a Java background and insist on doing everything the object-oriented way, using S4 methods. My experience, on the other hand, is that S4 implementations often perform worse and cause a lot more headaches when trying to get the code to do what you want.

I definitely agree that in some cases you have to be able to construct complex objects or extend existing objects in a controlled manner. But most of the time, what an S4 implementation does can easily be done with classic lists as well, without the hassle of defining standardGeneric, methods, constructors, initializers and the like.
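To make the comparison concrete, here is a minimal sketch of the same small object written both ways. The names `new_account`, `Account` and `deposit` are hypothetical and chosen only for illustration.

```r
library(methods)  # explicit, in case the script runs without methods attached

# Plain-list (S3-style) version: a constructor function and a class attribute.
new_account <- function(owner, balance = 0) {
  structure(list(owner = owner, balance = balance), class = "account")
}

# S4 version: a formal class definition with typed slots.
setClass("Account", representation(owner = "character", balance = "numeric"))

# A generic plus a method must exist before anything can dispatch on the class.
setGeneric("deposit", function(acc, amount) standardGeneric("deposit"))
setMethod("deposit", "Account", function(acc, amount) {
  acc@balance <- acc@balance + amount
  acc
})

a3 <- new_account("Ann", 10)
a3$balance <- a3$balance + 5          # list elements are accessed with $

a4 <- new("Account", owner = "Ann", balance = 10)
a4 <- deposit(a4, 5)                  # slots are accessed with @
```

The list version needs one function; the S4 version needs a class definition, a generic and a method before the first object can be used.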

When do you consider writing S4 implementations for R?

EDIT : For clarity, I do appreciate the answers and the discussion about OO in general in R. OOP can be done in numerous ways in R, but my question is really aimed at the added value of using S4 methods specifically.

Andrie
Joris Meys
  • But S3 is a legitimate object orientation! It is even more modern and flexible sort than S4. – mbq Aug 30 '10 at 20:23
  • @mbq: I don't agree. S3 is far less formal, and can be seen as merely a set of naming conventions. The "class" is in fact no more than an attribute. S3 still allows a lot of flexibility that is close to impossible with S4 methods. On the other hand, S3 lacks multiple inheritance and formal validation. You can use S3 in an object-oriented manner, but it is not the same as OOP sensu stricto in my eyes. – Joris Meys Aug 30 '10 at 21:14
  • 2
    @Joris Multiple inheritance? Can be done by merging objects and their classes with `c`. Formal validation? No one said OOP must be done with strict typing; Smalltalk is a spectacular example. In general I think OOP is just a manner, and so there is no "canonical" OOP (nevertheless people usually pick their favorite language and say that it defines it). – mbq Aug 30 '10 at 22:12
  • 1
    @mbq: OK, then it comes down to what counts as OOP and what does not. You can easily program the object-oriented way by using lists only and setting all attributes manually. _My colleagues_, coming from a Java background, call something OOP if it forces you to do it the object-oriented way. S3 doesn't, S4 does for them, and I feel the same. Your mileage may vary, but I think you do agree that S3 and S4 are two different beasts. I wanted some ideas on the use of S4, not a semantic discussion about what exactly OOP is in R. – Joris Meys Aug 30 '10 at 22:26
  • 2
    I think part of the problem is that neither S3 nor S4 provides an OO structure that's really along the lines of what someone from a Java/C++ type of world is going to be used to. It's all going to seem foreign to someone versed in that style of OO vs. someone with exposure to Lisp, Dylan, etc. – geoffjentry Aug 31 '10 at 18:44
  • 1
    @geoffjentry Good point! And this does not make them "less OO". – mbq Sep 01 '10 at 12:53
  • Just use closures, seems so simple. – Chris Sep 25 '16 at 03:33
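The point debated in these comments, that an S3 class is only an attribute and that a class vector gives a (linear) inheritance chain, can be sketched as follows. The class names `dog` and `animal` and the generic `speak` are hypothetical.

```r
# An S3 "class" is only an attribute; nothing checks the contents.
x <- list(name = "rex")
class(x) <- c("dog", "animal")        # a class vector gives an inheritance chain

speak <- function(obj, ...) UseMethod("speak")
speak.animal <- function(obj, ...) "some noise"
speak.dog <- function(obj, ...) {
  rest <- NextMethod()                # falls through to speak.animal
  paste("woof, then", rest)
}

speak(x)

# Nothing stops an unrelated object from being labelled with the same class.
y <- structure(1:3, class = "dog")
inherits(y, "dog")
```

Note that dispatch walks the class vector in order, so this is a single chain of fallbacks rather than multiple inheritance in the Java or C++ sense.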

7 Answers

29

My experience is in line with yours, so I use S3 exclusively.

To clarify: S4 has some slick features (e.g. dispatch on multiple arguments and slot type-checking), but I have not encountered a situation where the features outweighed the costs. Examples of the costs include: any slot change requires a full object copy and (potentially worse) the on-going changes to S4 Methods.
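As a rough illustration of the two features named above, the sketch below dispatches on the classes of both arguments and shows a slot type check rejecting a bad value. The classes `Metres` and `Feet` and the generic `add_lengths` are hypothetical.

```r
library(methods)

setClass("Metres", representation(value = "numeric"))
setClass("Feet",   representation(value = "numeric"))

# Dispatch on the classes of both arguments.
setGeneric("add_lengths", function(a, b) standardGeneric("add_lengths"))
setMethod("add_lengths", signature("Metres", "Metres"),
          function(a, b) new("Metres", value = a@value + b@value))
setMethod("add_lengths", signature("Metres", "Feet"),
          function(a, b) new("Metres", value = a@value + b@value * 0.3048))

total <- add_lengths(new("Metres", value = 1), new("Feet", value = 10))

# Slot type-checking: a character value for a numeric slot is rejected.
bad <- tryCatch(new("Metres", value = "one"), error = function(e) "rejected")
```

With S3, the same behaviour would need manual checks on the class of the second argument inside each method.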

In short, I like the idea behind S4 but I would wait for it to mature before using it in my own code.

Joshua Ulrich
28

I'm assuming this doesn't directly apply to you, but if you're developing packages for Bioconductor there's an incentive to use S4, as they actively encourage its use and have for the better part of a decade now, so all of the core packages make heavy use of S4.

I find all of the extra overhead to be a pain - the setGeneric, setMethod, dealing with NAMESPACE, etc. That being said, I find that the structure that it imposes, potential for extensibility and other such things can be worth it. As with everything, there are tradeoffs involved. I think it can be a lot cleaner - I dislike how S3 methods are simply disguised by naming convention (foo.class). All that being said, I tend to avoid using S4 heavily in my own code unless I'm being told to do so.
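For readers unfamiliar with the naming convention mentioned above, this is a minimal sketch of how S3 finds a method purely by its name. The generic `describe` and the class `invoice` are hypothetical.

```r
# S3 dispatch: the generic looks for a function named <generic>.<class>.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "something"
describe.invoice <- function(x, ...) "an invoice"

inv <- structure(list(total = 100), class = "invoice")
describe(inv)     # picks describe.invoice because of the class attribute
describe(42)      # no describe.numeric exists, so describe.default is used

# The dot is only a convention, which makes names ambiguous to a reader:
# stats::t.test() is not a t() method for a class called "test", and
# data.frame() is not a method of a generic called data().
```

With S4, the link between generic, class and method is declared explicitly through setGeneric and setMethod rather than inferred from the function name.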

geoffjentry
9

Great question! and I hope it generates some thoughtful discussion...

I've never used it, nor do I intend to for the following reasons:

  1. Performance
  2. I don't have the patience to completely understand S4 and its relationship to S3.
  3. Syntactic sugar: I'd rather have object.method() than method(object).

I like sugar, what can I say!

Jeff
  • 4
    I too don't use S4 because I like `object.method()`. And Google's R style says: "Use S3 objects and methods unless there is a strong reason to use S4 objects or methods. A primary justification for an S4 object would be to use objects directly in C++ code. A primary justification for an S4 generic/method would be to dispatch on two arguments" – Vince Aug 30 '10 at 19:52
  • 13
    FWIW, why is Google the authority on R style? Shouldn't R-core be a higher authority on the matter? (not that R-core seems unified in their viewpoint on the matter, but ...). The fandom of the Google R style guide annoys me a bit for just this reason. – geoffjentry Aug 31 '10 at 17:57
  • 4
    @geoffjentry there's a part of me that feels exactly the same way... however, I'm happy people are thinking about style just a little bit. And if having the GOOG logo on the PDF makes some economist (or statistician, etc.) read it, then I'm all for it. I'm tired of trying to read code that is so hard to parse because of the formatting and style. – JD Long Aug 31 '10 at 19:03
  • 3
    Of course, R-core could easily solve this issue if they felt like it so I suppose they're either comfortable with or simply don't care about the Google guidelines. Edit: Judging from this thread (http://tinyurl.com/3ydaa89) where both Dalgaard & Murdoch answer I'm guessing it's more the apathy angle. – geoffjentry Aug 31 '10 at 19:36
9

I learnt S4 in order to extend the Spatial (sp) classes for animal track data. It was the best choice (most consistent, general and closely matching many GIS definitions) among the available options to avoid writing everything required from scratch. I don't find S4 as onerous as many people say, but I'm now used to exploring the underlying structure of objects like this. The performance is good too: I think S4 can be done well, though there are performance traps when it is done poorly.
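As a sketch of what extending an existing S4 class looks like, the example below adds a time slot to a base class of coordinates. `Points2D`, `Track` and `n_points` are hypothetical stand-ins and are not the actual sp classes or functions.

```r
library(methods)

# Hypothetical base class standing in for an existing spatial class.
setClass("Points2D", representation(x = "numeric", y = "numeric"))

# Hypothetical extension: the same coordinates plus one time value per point.
setClass("Track", contains = "Points2D",
         representation = representation(time = "numeric"))

setGeneric("n_points", function(obj) standardGeneric("n_points"))
setMethod("n_points", "Points2D", function(obj) length(obj@x))

trk <- new("Track", x = c(0, 1, 2), y = c(0, 1, 4), time = c(0, 60, 120))

n_points(trk)   # the method written for Points2D is inherited by Track
```

Methods already written for the parent class keep working on the extended class, which is the main reason to extend rather than start from scratch.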

If spatial data is of interest to you, spatstat is a good example of how to do a lot of similar things to sp in S3, though (as with seemingly everything spatial . . .) there are hardly ever clean analogies between data structures in different software packages.

mdsumner
6

S4 classes play a strong part in spatial statistics (sensu package sp), where converting from one type of data to the other seems seamless. The pitfall of this is debugging, which has been, in my experience, tedious at best. So far, I have managed with S3 but may consider using S4 in the future.

With time, as things get played around with a lot, I believe they will play a strong role in at least the core features of various fields of R (be that spatial analysis, econometrics, environmetrics...).

Roman Luštrik
  • 2
    In fact, an ever-growing part of R is recoded in S4 classes, but I experience more and more problems when using those packages. Documentation is directed at immediate use, but is lacking when it comes to programming with them. As you know, I also experienced trouble with the evaluation of function arguments using S4-coded methods of a number of packages. So I tend to stay away from them, and I was hoping somebody could show me a good use. – Joris Meys Aug 31 '10 at 08:11
5

Once upon a time, Roxygen2 didn't like S4 methods. As of 2017 (at least), they work together.

I've had the misfortune of creating some functions that needed methods to work with both S3 and S4 classes. It has been incredibly painful to keep this code working over the years, as R-core has repeatedly changed the details of how these systems interact, how namespaces work and how R CMD check works.
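One way to write methods that cover both kinds of class is to register the S3 class with `setOldClass` so that an S4 generic can dispatch on it. This is only a sketch of that approach, not necessarily what was done in the code described above; `myfit`, `S4Fit` and `n_coef` are hypothetical names.

```r
library(methods)

# An S3 object: a list with a class attribute.
fit <- structure(list(coef = c(a = 1, b = 2)), class = "myfit")

# Register the S3 class so that S4 signatures can refer to it.
setOldClass("myfit")

setClass("S4Fit", representation(coef = "numeric"))

# One S4 generic with a method for each kind of object.
setGeneric("n_coef", function(obj) standardGeneric("n_coef"))
setMethod("n_coef", "myfit", function(obj) length(obj$coef))
setMethod("n_coef", "S4Fit", function(obj) length(obj@coef))

n_coef(fit)
n_coef(new("S4Fit", coef = c(1, 2, 3)))
```

In a package, the generic, the methods and the registered class also have to be exported correctly in NAMESPACE, which is where much of the maintenance burden tends to sit.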

If you don't like Google's style guide, then consider the comments of these well-known R package developers from this thread on R-help:

Frank Harrell: "If you love computer science more than you value your own time, use S4."

Terry Therneau wrote: For 90 percent of what I do I strongly prefer the loose (S3) rather than the rigid (S4) classes....My summary of S4 vs S3

S4 has a large increment in: 1. nuisance to write 2. difficulty to debug 3. ability to write very obscure code 4. design

S4 Gains: 5. ability to direct automatic conversions 6. validate the contents of a class object
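The two gains listed, directed conversions and validation of an object's contents, correspond to `setAs` and a validity function in S4. A minimal sketch, with the hypothetical classes `Celsius` and `Kelvin`:

```r
library(methods)

# The validity function returns TRUE or a string describing the problem.
setClass("Celsius", representation(deg = "numeric"),
         validity = function(object) {
           if (any(object@deg < -273.15)) "temperature below absolute zero" else TRUE
         })
setClass("Kelvin", representation(deg = "numeric"))

# Declare how to convert; as() then uses this definition.
setAs("Celsius", "Kelvin", function(from) new("Kelvin", deg = from@deg + 273.15))

k <- as(new("Celsius", deg = 0), "Kelvin")

# Creating an object that fails validation raises an error.
invalid <- tryCatch(new("Celsius", deg = -300), error = function(e) "invalid")
```

Validity is checked when the object is created with new(); after modifying slots directly, validObject() has to be called explicitly to re-check.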

Kevin Wright
5

Don't forget there's also R.oo (on CRAN), which provides a third way of doing OO in R. In my mind this provides an OO system that might be more familiar to programmers migrating from other systems. In particular, instead of having generic functions (so that print(foo) has to dispatch on the class of foo), the methods are tied to the object, so you'd do foo$print(), just as in Python or C++ you'd do foo.print().
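The `foo$method()` calling style can be illustrated without any package by using closures in base R. This sketch does not use the R.oo API; it only shows the idea of methods that live inside the object. `make_counter` is a hypothetical name.

```r
# A closure-based object: the methods live in the object and share its state.
# This is plain base R, not the R.oo API.
make_counter <- function(start = 0) {
  count <- start
  list(
    increment = function() {
      count <<- count + 1     # modifies the variable captured by the closure
      invisible(count)
    },
    value = function() count
  )
}

cnt <- make_counter()
cnt$increment()
cnt$increment()
cnt$value()
```

Unlike ordinary R objects, the state here is mutable: calling `cnt$increment()` changes `cnt` without reassignment.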

Spacedman
  • I've seen this before, but I always wondered what the extra added value was. Apart from semantics, I couldn't find any difference with S3 programming. But honestly, I haven't been digging deeply into it. – Joris Meys Aug 31 '10 at 16:20