10

I'm looking for a very simple example of subclassing a Scala collection. I'm not so much interested in full explanations of how and why it all works; plenty of those are available here and elsewhere on the Internet. I'd like to know the simple way to do it.

The class below might be as simple an example as possible. The idea is, make a subclass of Set[Int] which has one additional method:

class SlightlyCustomizedSet extends Set[Int] {
  def findOdd: Option[Int] = find(_ % 2 == 1)
}

Obviously this is wrong. One problem is that there's no constructor to put things into the Set. A CanBuildFrom object must be built, preferably by calling some already-existing library code that knows how to build it. I've seen examples that implement several additional methods in the companion object, but they're showing how it all works or how to do something more complicated. I'd like to see how to leverage what's already in the libraries to knock this out in a couple lines of code. What's the smallest, simplest way to implement this?

Community
  • 1
  • 1
Ben Kovitz
  • 4,920
  • 1
  • 22
  • 50
  • Is [this](https://stackoverflow.com/questions/4416885/extend-scala-set-with-concrete-type) give you enough cues? – om-nom-nom Jun 19 '14 at 10:05
  • 1
    Alternatively: http://stackoverflow.com/questions/4839566/extending-scala-collections – Siphor Jun 19 '14 at 10:55
  • @om-nom-nom Siphor I've spent several hours working from these hints and more-complex examples and still haven't gotten a version that compiles. A complete, simple example would help enormously. – Ben Kovitz Jun 19 '14 at 16:41
  • @Siphor om-nom-nom Hmm, going one more link led [here](http://stackoverflow.com/questions/16049978) which looks like a simple solution using `SetProxy`. But [`SetProxy`](http://www.scala-lang.org/api/2.11.1/#scala.collection.immutable.SetProxy) has been deprecated! [New question…](http://stackoverflow.com/questions/24312563) – Ben Kovitz Jun 19 '14 at 17:15

1 Answers1

21

If you just want to add a single method to a class, then subclassing may not be the way to go. Scala's collections library is somewhat complicated, and leaf classes aren't always amenable to subclassing (one might start by subclassing HashSet, but this would start you on a journey down a deep rabbit hole).

Perhaps a simpler way to achieve your goal would be something like:

implicit class SetPimper(val s: Set[Int]) extends AnyVal {
  def findOdd: Option[Int] = s.find(_ % 2 == 1)
}

This doesn't actually subclass Set, but creates an implicit conversion that allows you to do things like:

Set(1,2,3).findOdd // Some(1)

Down the Rabbit Hole

If you've come from a Java background, it might be surprising that it's so difficult to extend standard collections - after all the Java standard library's peppered with j.u.ArrayList subclasses, for pretty much anything that can contain other things. However, Scala has one key difference: its first-choice collections are all immutable.

This means that they don't have add methods that modify them in-place. Instead, they have + methods that construct a new instance, with all the original items, plus the new item. If they'd implemented this naïvely, it'd be very inefficient, so they use various class-specific tricks to allow the new instances to share data with the original one. The + method may even return an object of a different type to the original - some of the collections classes use a different representation for small or empty collections.

However, this also means that if you want to subclass one of the immutable collections, then you need to understand the guts of the class you're subclassing, to ensure that your instances of your subclass are constructed in the same way as the base class.

By the way, none of this applies to you if you want to subclass the mutable collections. They're seen as second class citizens in the scala world, but they do have add methods, and rarely need to construct new instances. The following code:

class ListOfUsers(users: Int*) extends scala.collection.mutable.HashSet[Int] {
  this ++= users

  def findOdd: Option[Int] = find(_ % 2 == 1)
}

Will probably do more-or-less what you expect in most cases (map and friends might not do quite what you expect, because of the the CanBuildFrom stuff that I'll get to in a minute, but bear with me).

The Nuclear Option

If inheritance fails us, we always have a nuclear option to fall back on: composition. We can create our own Set subclass that delegates its responsibilities to a delegate, as such:

import scala.collection.SetLike
import scala.collection.mutable.Builder
import scala.collection.generic.CanBuildFrom

class UserSet(delegate: Set[Int]) extends Set[Int] with SetLike[Int, UserSet] {
    override def contains(key: Int) = delegate.contains(key)
    override def iterator = delegate.iterator
    override def +(elem: Int) = new UserSet(delegate + elem)
    override def -(elem: Int) = new UserSet(delegate - elem)
    override def empty = new UserSet(Set.empty)
    override def newBuilder = UserSet.newBuilder
    override def foreach[U](f: Int => U) = delegate.foreach(f) // Optional
    override def size = delegate.size // Optional
}

object UserSet {
    def apply(users: Int*) = (newBuilder ++= users).result()
    def newBuilder = new Builder[Int, UserSet] {
        private var delegateBuilder = Set.newBuilder[Int]
        override def +=(elem: Int) = {
            delegateBuilder += elem
            this
        }
        override def clear() = delegateBuilder.clear()
        override def result() = new UserSet(delegateBuilder.result())
    }

    implicit object UserSetCanBuildFrom extends CanBuildFrom[UserSet, Int, UserSet] {
        override def apply() = newBuilder
        override def apply(from: UserSet) = newBuilder
    }
}

This is arguably both too complicated and too simple at the same time. It's far more lines of code than we meant to write, and yet, it's still pretty naïve.

It'll work without the companion class, but without CanBuildFrom, map will return a plain Set, which may not be what you expect. We've also overridden the optional methods that the documentation for Set recommends we implement.

If we were being thorough, we'd have created a CanBuildFrom, and implemented empty for our mutable class, as this ensures that the handful of methods that create new instances will work as we expect.

But that sounds like a lot of work...

If that sounds like too much work, consider something like the following:

case class UserSet(users: Set[Int])

Sure, you have to type a few more letters to get at the set of users, but I think it separates concerns better than subclassing.

James_pic
  • 3,240
  • 19
  • 24
  • I don't want to add `findOdd` to any `Set[Int]`. I'm looking for a simple example of genuinely subclassing `Set[Int]`. That would enable me to override methods, pattern-match based on class, make sibling subclasses that define `findOdd` differently, and everything else that you can do with a genuine subclass. (BTW, please don't delete your answer. "Pimping" is such an important technique in Scala, it's easy to think it answers my question. Others have previously posted this answer and deleted it.) – Ben Kovitz Jun 19 '14 at 16:04
  • What's your use case? `Set` is a trait, rather than a class, so you wouldn't inherit a working implementation by subclassing it, although you could mix it into leaf classes like `HashSet` or `TreeSet`. But then you end up down the rabbit hole I mentioned, since these classes aren't designed to be subclassed. For example, `HashSet(1)` doesn't actually return a `HashSet`, but a `HashSet.HashSet1` - a private subclass of it, produced by private factory methods in `HashSet`. – James_pic Jun 19 '14 at 16:06
  • Sorry, I was slightly mistaken. `HashSet.HashSet1` isn't a private class, and most of the factory methods aren't private either. However, the wider problem remains, that you'd need to override these subclasses, and the various factory methods that produce them. – James_pic Jun 19 '14 at 16:17
  • I've got lots of uses for subclassing collections, but they're not simple. So far, I've always coded up a less-than-ideal workaround like pimping or holding the `Set` in a public `val` and adding `isXxxSet` methods. I'd like to learn how to actually subclass a collection. Given a simple but complete example—that sidesteps the rabbit-holes—I'm sure I could extend it appropriately the next time this comes up. Or it would might suggest fruitful ideas as only an example can. – Ben Kovitz Jun 19 '14 at 16:26
  • Thanks for the explanation of `HashSet.HashSet1`, though! This definitely sheds some light on why the compiler warns against subclassing `HashSet` and why my previous attempts to extend a `Set` implementation hit a brick wall (or fell down a rabbit-hole) very fast. – Ben Kovitz Jun 19 '14 at 16:32
  • I guess part of the reason I wanted to know your use case, is that part of me suspects your trying to solve the wrong problem. In OO design, you'd usually have each class doing as few things as possible. Having a collection that's also something else seems like a code smell, which may be why the scale devs didn't spend time making leaf classes subclassable – James_pic Jun 19 '14 at 17:11
  • I may well be trying to solve the wrong problem. But subclassing is such a basic OO technique, I can't believe Scala wouldn't support it. I've had lots of success subclassing standard collections in Python using the one-liner approach above. Scala provides even more uses for subclassing, like pattern-matching based on class. I've already had success in Scala doing things like subclassing without even adding or overriding anything. Scala is so friendly to code reuse, I figure there has to be a nice way to reuse the code in collection classes. – Ben Kovitz Jun 19 '14 at 17:28
  • you can't see this, but prior to your answer [there was yet another](http://take.ms/0azZu). It is deleted now, as it's not what op wants – om-nom-nom Jun 19 '14 at 18:32
  • I've added some detail on precisely what's down the rabbit hole (spoiler: it's not rabbits), and some approaches that could get you what you want. – James_pic Jun 19 '14 at 19:00
  • Thanks for the greatly expanded answer—or rather, menu of answers complete with trade-offs. Directly inheriting from a mutable `HashSet` is a new one for me. After some experimentation, I posted a small correction; please double-check that I got it right. – Ben Kovitz Jun 19 '14 at 22:05