As a beginner in Scala and the functional style, I'm a little confused about whether I should put the functions/methods for my case class inside the class (and then get things like method chaining and IDE hinting), or whether it is more functional to define the functions outside the case class. Let's consider both approaches on a very simple implementation of a ring buffer:

1/ methods inside case class

case class RingBuffer[T](index: Int, data: Seq[T]) {
  def shiftLeft: RingBuffer[T] = RingBuffer((index + 1) % data.size, data)
  def shiftRight: RingBuffer[T] = RingBuffer((index + data.size - 1) % data.size, data)
  def update(value: T) = RingBuffer(index, data.updated(index, value))
  def head: T = data(index)
  def length: Int = data.length
}

With this approach you can chain method calls, and the IDE can suggest the available methods:

val buffer = RingBuffer(0, Seq(1,2,3,4,5))  // 1,2,3,4,5
buffer.head   // 1
val buffer2 = buffer.shiftLeft.shiftLeft  // 3,4,5,1,2
buffer2.head // 3

2/ functions outside case class

case class RingBuffer[T](index: Int, data: Seq[T])

def shiftLeft[T](rb: RingBuffer[T]): RingBuffer[T] = RingBuffer((rb.index + 1) % rb.data.size, rb.data)
def shiftRight[T](rb: RingBuffer[T]): RingBuffer[T] = RingBuffer((rb.index + rb.data.size - 1) % rb.data.size, rb.data)
def update[T](value: T)(rb: RingBuffer[T]) = RingBuffer(rb.index, rb.data.updated(rb.index, value))
def head[T](rb: RingBuffer[T]): T = rb.data(rb.index)
def length[T](rb: RingBuffer[T]): Int = rb.data.length

This approach seems more functional to me, but I'm not sure how practical it is. For example, the IDE can't suggest the applicable functions the way it suggests methods when chaining calls in the previous example.

val buffer = RingBuffer(0, Seq(1,2,3,4,5))  // 1,2,3,4,5
head(buffer)  // 1
val buffer2 = shiftLeft(shiftLeft(buffer))  // 3,4,5,1,2
head(buffer2) // 3

With this approach, a pipe operator can make the third line above more readable:

implicit class Piped[A](private val a: A) extends AnyVal {
  def |>[B](f: A => B) = f( a )
}

val buffer2 = buffer |> shiftLeft |> shiftLeft
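
If Scala 2.13 or later is available (an assumption; the question does not state a version), the standard library's `scala.util.chaining` provides a `pipe` method that plays the same role as the hand-written `|>`. A minimal sketch, with the wrapper object `PipeSketch` being a hypothetical name used only to keep the example self-contained:

```scala
import scala.util.chaining._

object PipeSketch {
  final case class RingBuffer[T](index: Int, data: Seq[T])

  // Same standalone functions as in approach 2.
  def shiftLeft[T](rb: RingBuffer[T]): RingBuffer[T] =
    RingBuffer((rb.index + 1) % rb.data.size, rb.data)

  def head[T](rb: RingBuffer[T]): T = rb.data(rb.index)

  val buffer = RingBuffer(0, Seq(1, 2, 3, 4, 5))

  // pipe applies the function to the receiver, like |> above.
  val buffer2 = buffer.pipe(shiftLeft).pipe(shiftLeft)
}
```

Here `PipeSketch.head(PipeSketch.buffer2)` gives the same element as `head(buffer2)` in the example above.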

Can you please summarize your own view of the advantages and disadvantages of each approach, and what the common rule is (if any) for when to use which?

Thanks a lot.

xwinus
  • This is too primarily opinion-based IMO. You can also define a third approach where the methods are implicitly defined (perhaps in the companion of such a class). Once they're available in scope you'll get IDE completion as if they were defined on the class itself. For case classes, I usually like to define the methods in their *companion object*, keeping the class itself as clean as possible. – Yuval Itzchakov Jun 25 '16 at 08:57
  • @YuvalItzchakov can you please provide some very simple example of your way using the companion object? In case of my ring buffer implementation, you would still be forced to pass case class instance as an argument to such method/function defined in the companion object (e.g. `def head[T](rb: RingBuffer[T])`), right? – xwinus Jun 25 '16 at 09:08
  • Depends on how you create these methods. If you use an `implicit class` for example, you can use these methods like extension methods. See [this](http://scastie.org/20579) example. – Yuval Itzchakov Jun 25 '16 at 10:09
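
The companion-object approach described in these comments can be sketched roughly as follows. This is an illustration under assumptions, not the code behind the linked Scastie example; the wrapper name `RingBufferOps` is hypothetical.

```scala
// Data only: the case class carries no behaviour.
final case class RingBuffer[T](index: Int, data: Seq[T])

object RingBuffer {
  // Hypothetical wrapper name. Because it lives in the companion object,
  // it is part of the implicit scope of RingBuffer and needs no import.
  implicit class RingBufferOps[T](private val rb: RingBuffer[T]) {
    def shiftLeft: RingBuffer[T] =
      RingBuffer((rb.index + 1) % rb.data.size, rb.data)

    def shiftRight: RingBuffer[T] =
      RingBuffer((rb.index + rb.data.size - 1) % rb.data.size, rb.data)

    def head: T = rb.data(rb.index)
  }
}
```

With this in place, `RingBuffer(0, Seq(1, 2, 3, 4, 5)).shiftLeft.shiftLeft.head` reads like the method-chaining version from approach 1, while the case class itself stays free of methods.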

2 Answers

In this particular example, the first approach has many more benefits than the second one. I would go with putting all the methods inside the case class.

Here is an example of an ADT where decoupling the logic from the data has some benefits:

sealed trait T
case class X(i: Int) extends T
case class Y(y: Boolean) extends T

Now you can keep adding logic without needing to change your data.

def foo(t: T) = t match {
   case X(a) => 1
   case Y(b) => 2 
}

In addition, all the logic of foo() is concentrated in a single block, which makes it easy to see how it operates on X and Y (compared to X and Y having their own version of foo).

In most programs, logic changes much more often than the data, so this approach lets you add new logic without modifying existing code (fewer bugs, less chance of breaking something that already works).
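
As a rough illustration of that point, the sketch below adds a second function over the same ADT without touching `X`, `Y`, or `foo`. The names `AdtSketch` and `describe` are hypothetical and only serve the example.

```scala
object AdtSketch {
  sealed trait T
  final case class X(i: Int) extends T
  final case class Y(y: Boolean) extends T

  // Existing logic from the answer, left untouched.
  def foo(t: T): Int = t match {
    case X(_) => 1
    case Y(_) => 2
  }

  // New logic added later; the data definitions and foo are not edited.
  def describe(t: T): String = t match {
    case X(i) => s"X holding $i"
    case Y(y) => s"Y holding $y"
  }
}
```

Because `T` is sealed, the compiler can warn when a match forgets one of the cases, which helps when a new case class is eventually added to the ADT.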

Adding code into the companion object

Scala gives you a lot of flexibility in how you add logic to a class, through implicit conversions and the concept of type classes. Here are some basic ideas borrowed from Scalaz. In this example, the data (the case class) remains just data and all the logic is added in the companion object.

// A generic behavior (combining things together)
trait Monoid[A] {
  def zero: A
  def append(a: A, b: A): A
}

// Cool implicit operators of the generic behavior
trait MonoidOps[A] {
  def self: A
  implicit def M: Monoid[A]
  final def ap(other: A): A = M.append(self, other)
  final def |+|(other: A): A = ap(other)
}

object MonoidOps {
  implicit def toMonoidOps[A](v: A)(implicit ev: Monoid[A]): MonoidOps[A] = new MonoidOps[A] {
    def self: A = v
    implicit def M: Monoid[A] = ev
  }
}


// A class we want to add the generic behavior 
case class Bar(i: Int)

object Bar {
  // An explicit type on the implicit val keeps implicit resolution predictable
  implicit val barMonoid: Monoid[Bar] = new Monoid[Bar] {
    def zero: Bar = Bar(0)
    def append(a: Bar, b: Bar): Bar = Bar(a.i + b.i)
  }
}

You can then use these implicit operators:

import MonoidOps._
Bar(2) |+| Bar(4)  // or Bar(2).ap(Bar(4))
res: Bar = Bar(6)

Or use Bar in generic functions built around, say, the Monoid type class.

def merge[A](l: List[A])(implicit m: Monoid[A]) = l.foldLeft(m.zero)(m.append)

merge(List(Bar(2), Bar(4), Bar(2)))
res: Bar = Bar(10)
zmerr
marios

There are arguments both against the "functions outside a class" approach, e.g. https://www.martinfowler.com/bliki/AnemicDomainModel.html, and for it, e.g. "Functional and Reactive Domain Modeling" by D. Ghosh (ch. 3). (See also https://underscore.io/books/essential-scala/ ch. 4.) In my experience, keeping functions outside the class is preferable, with a few exceptions. Some of its advantages are:

  • It is easier to focus on data only or on behaviour only than to mix them in one class, and easier to evolve them separately
  • Functions in a separate module tend to be more general
  • Cleaner interface segregation (ISP): when a client only needs the data, it shouldn't be exposed to behaviour
  • Better compositionality. For example,

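    case class Interval(lower: Double, upper: Double)

    trait IntervalService {
      // true if interval a fully contains interval b
      def contains(a: Interval, b: Interval): Boolean =
        a.lower <= b.lower && b.upper <= a.upper
    }
    object IntervalService extends IntervalService

    trait MathService { /* methods */ }
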

    is composed simply as object MathHelper extends IntervalService with MathService. It is not as simple with behaviour-rich classes.

So normally I keep case class for data; the companion object for factory and validation methods; and service modules for other behaviour. I may put a couple methods facilitating data access inside a case class: def row(i:Int) for a class with a table. (In fact, the OP's example seems similar to this.)
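
One way to picture this layout is sketched below. It is an illustration under assumptions rather than the author's actual code: the factory name `validated` and the chosen meaning of `contains` are hypothetical.

```scala
// Data only.
final case class Interval(lower: Double, upper: Double)

// Companion object: factory and validation methods.
object Interval {
  // Hypothetical validating factory: rejects intervals with reversed bounds.
  def validated(lower: Double, upper: Double): Option[Interval] =
    if (lower <= upper) Some(Interval(lower, upper)) else None
}

// Service module: all other behaviour.
trait IntervalService {
  // True if `outer` fully contains `inner` (one possible reading of "contains").
  def contains(outer: Interval, inner: Interval): Boolean =
    outer.lower <= inner.lower && inner.upper <= outer.upper
}
object IntervalService extends IntervalService
```

A client that only needs the data depends on `Interval` alone; a client that needs the behaviour additionally imports `IntervalService`.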

There are downsides: extra classes/traits are necessary; clients may need both a class instance and a service object; and function calls can be ambiguous to read, e.g. in

import IntervalService._
contains(a, b)
a.contains(b)

the second is more clear w.r.t. which interval contains which.

Sometimes combining data and methods in a class seems more natural (esp. with mediators/controllers in the UI layer). Then I'd define class Controller(a: A, b: B) with methods and private fields, to distinguish it from a data-only case class.

schrödingcöder