19

Suppose I want to write a method with the following signature:

def parse(input: List[(String, String)]):
  ValidationNel[Throwable, List[(Int, Int)]]

For each pair of strings in the input, it needs to verify that both members can be parsed as integers and that the first is smaller than the second. It then needs to return the integers, accumulating any errors that turn up.

First I'll define an error type:

import scalaz._, Scalaz._

case class InvalidSizes(x: Int, y: Int) extends Exception(
  s"Error: $x is not smaller than $y!"
)

Now I can implement my method as follows:

def checkParses(p: (String, String)):
  ValidationNel[NumberFormatException, (Int, Int)] =
  p.bitraverse[
    ({ type L[x] = ValidationNel[NumberFormatException, x] })#L, Int, Int
  ](
    _.parseInt.toValidationNel,
    _.parseInt.toValidationNel
  )

def checkValues(p: (Int, Int)): Validation[InvalidSizes, (Int, Int)] =
  if (p._1 >= p._2) InvalidSizes(p._1, p._2).failure else p.success

def parse(input: List[(String, String)]):
  ValidationNel[Throwable, List[(Int, Int)]] = input.traverseU(p =>
    checkParses(p).fold(_.failure, checkValues _ andThen (_.toValidationNel))
  )

Or, alternatively:

def checkParses(p: (String, String)):
  NonEmptyList[NumberFormatException] \/ (Int, Int) =
  p.bitraverse[
    ({ type L[x] = ValidationNel[NumberFormatException, x] })#L, Int, Int
  ](
    _.parseInt.toValidationNel,
    _.parseInt.toValidationNel
  ).disjunction

def checkValues(p: (Int, Int)): InvalidSizes \/ (Int, Int) =
  (p._1 >= p._2) either InvalidSizes(p._1, p._2) or p

def parse(input: List[(String, String)]):
  ValidationNel[Throwable, List[(Int, Int)]] = input.traverseU(p =>
    checkParses(p).flatMap(s => checkValues(s).leftMap(_.wrapNel)).validation
  )

Now, for whatever reason, the first operation (validating that both members of each pair parse as integers) feels to me like a validation problem, while the second (checking the values) feels like a disjunction problem. It also feels like I need to compose the two monadically, which suggests that I should be using \/, since ValidationNel[Throwable, _] doesn't have a monad instance.

In my first implementation, I use ValidationNel throughout and then fold at the end as a kind of fake flatMap. In the second, I bounce back and forth between ValidationNel and \/ as appropriate depending on whether I need error accumulation or monadic binding. They produce the same results.
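
To make the two needs concrete without any library, here is a minimal, dependency-free sketch that models errors as a plain List[String] inside scala.util.Either. It is not Scalaz code: the names AccumulateThenBind, Result, and both are made up for illustration. both plays the role of the accumulating (applicative) step, and flatMap plays the role of the monadic step.

```scala
import scala.util.Try

// Illustrative stand-in only: plain Either with a List of messages,
// not Scalaz's ValidationNel or \/.
object AccumulateThenBind {
  type Result[A] = Either[List[String], A]

  // Independent check: each string is parsed on its own.
  def parseInt(s: String): Result[Int] =
    Try(s.toInt).toOption.toRight(List(s"not an integer: $s"))

  // Applicative-style combination: keep the errors from both sides.
  def both[A, B](a: Result[A], b: Result[B]): Result[(A, B)] =
    (a, b) match {
      case (Right(x), Right(y)) => Right((x, y))
      case (Left(e1), Left(e2)) => Left(e1 ++ e2)
      case (Left(e), _)         => Left(e)
      case (_, Left(e))         => Left(e)
    }

  // Dependent check: only meaningful once both integers exist.
  def checkValues(p: (Int, Int)): Result[(Int, Int)] =
    if (p._1 >= p._2) Left(List(s"${p._1} is not smaller than ${p._2}"))
    else Right(p)

  // Accumulate across the two parses, then bind into the value check.
  def parsePair(p: (String, String)): Result[(Int, Int)] =
    both(parseInt(p._1), parseInt(p._2)).flatMap(checkValues)

  // Accumulate again across the whole list.
  def parse(input: List[(String, String)]): Result[List[(Int, Int)]] =
    input.map(parsePair).foldRight[Result[List[(Int, Int)]]](Right(Nil)) {
      (r, acc) => both(r, acc).map { case (h, t) => h :: t }
    }
}
```

With this sketch, a pair such as ("a", "b") contributes two errors, while a pair such as ("3", "1") contributes one, and only after both members have parsed.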

I've used both approaches in real code, and haven't yet developed a preference for one over the other. Am I missing something? Should I prefer one over the other?

Travis Brown
  • Wouldn't flatMap on ValidationNel have the same result as fold in the first example? I know Validation isn't a Monad, but it still has flatMap. Edit: Just noticed flatMap is deprecated in 7.1 with instructions to use `\/` instead. Looks like there will be a third option! My answer would be that I just use `\/` because I do not want to accumulate failures. I would use `ValidationNel` in the rare case that I do – drstevens Nov 19 '13 at 12:38
  • `flatMap` on `Validation` is deprecated. – drexin Nov 19 '13 at 12:41
  • @drstevens: It would, but `flatMap` on `Validation` is deprecated in 7.1.0 (presumably so that it can't be used in `for`-comprehensions). – Travis Brown Nov 19 '13 at 12:42
  • 1
    @drstevens: I very often do want to accumulate failures! When you're e.g. processing a directory containing thousands of large JSON files you don't want to identify errors one by one. – Travis Brown Nov 19 '13 at 12:45
  • 1
    Of course I would accumulate errors when traversing through `input: List[(String, String)]`. You are not accumulating errors through `checkParses` and `checkValues` though. Further, `checkValues` will never return `Nel(InvalidSizes, InvalidSizes, ...)`, yet the type signature indicates that it may. I would personally never do this, just like I would never use `List[a]` when `Option[a]` would be more accurate. I have found that doing this makes it more challenging to reason about the range of possible errors, e.g. `parseFormatB` iff `parseFormatA` fails due to `InvalidSizes`. – drstevens Nov 20 '13 at 12:01
  • 1
    @drstevens: Fair point about the return type of `checkValues`—it's just a toy example, but I've changed it here. The key point of the question is that I do want to accumulate in `checkParses`—i.e. if neither member of the pair parses as an integer, I want to see both errors—and while traversing the list, but don't (can't) in `checkValues`, which comes in between. – Travis Brown Nov 20 '13 at 12:18
  • What does "disjunction" mean in the scope of this question? – Matt Fenwick Nov 20 '13 at 13:58
  • @MattFenwick: It's just a nicer name for Scalaz's `\/`. – Travis Brown Nov 20 '13 at 14:05
  • @TravisBrown See discussion on scalaz mailing list https://groups.google.com/forum/#!topic/scalaz/R-YYRoqzSXk. This should answer the bigger-picture question. – drstevens Jan 05 '14 at 01:50

2 Answers

10

This is probably not the answer you're looking for, but I just noticed that Validation has the following methods:

/** Run a disjunction function and back to validation again. Alias for `@\/` */
def disjunctioned[EE, AA](k: (E \/ A) => (EE \/ AA)): Validation[EE, AA] =
  k(disjunction).validation

/** Run a disjunction function and back to validation again. Alias for `disjunctioned` */
def @\/[EE, AA](k: (E \/ A) => (EE \/ AA)): Validation[EE, AA] =
  disjunctioned(k)

When I saw them, I couldn't really see their usefulness until I remembered this question. They allow you to do a proper bind by converting to disjunction.

def checkParses(p: (String, String)):
  ValidationNel[NumberFormatException, (Int, Int)] =
  p.bitraverse[
    ({ type L[x] = ValidationNel[NumberFormatException, x] })#L, Int, Int
  ](
    _.parseInt.toValidationNel,
    _.parseInt.toValidationNel
  )

def checkValues(p: (Int, Int)): InvalidSizes \/ (Int, Int) =
  (p._1 >= p._2) either InvalidSizes(p._1, p._2) or p

def parse(input: List[(String, String)]):
  ValidationNel[Throwable, List[(Int, Int)]] = input.traverseU(p =>
    checkParses(p).@\/(_.flatMap(checkValues(_).leftMap(_.wrapNel)))
  )
drstevens
  • 2
    +1, and thanks—I'd also never paid attention to `@\/`, and I like this approach better than either of mine. I'm going to hold out a little longer for a bigger-picture answer, though. – Travis Brown Nov 22 '13 at 13:58
2

The following is a pretty close translation of the second version of my code for Cats:

import scala.util.Try

case class InvalidSizes(x: Int, y: Int) extends Exception(
  s"Error: $x is not smaller than $y!"
)

def parseInt(input: String): Either[Throwable, Int] = Try(input.toInt).toEither

def checkValues(p: (Int, Int)): Either[InvalidSizes, (Int, Int)] =
  if (p._1 >= p._2) Left(InvalidSizes(p._1, p._2)) else Right(p)

import cats.data.{EitherNel, ValidatedNel}
import cats.instances.either._
import cats.instances.list._
import cats.syntax.apply._
import cats.syntax.either._
import cats.syntax.traverse._

def checkParses(p: (String, String)): EitherNel[Throwable, (Int, Int)] =
  (parseInt(p._1).toValidatedNel, parseInt(p._2).toValidatedNel).tupled.toEither

def parse(input: List[(String, String)]): ValidatedNel[Throwable, List[(Int, Int)]] =
  input.traverse(fields =>
    checkParses(fields).flatMap(s => checkValues(s).toEitherNel).toValidated
  )

To update the question, this code is "bouncing back and forth between ValidatedNel and Either as appropriate depending on whether I need error accumulation or monadic binding".

In the almost six years since I asked this question, Cats has introduced a Parallel type class (improved in Cats 2.0.0) that solves exactly the problem I was running into:

import cats.data.EitherNel
import cats.instances.either._
import cats.instances.list._
import cats.instances.parallel._
import cats.syntax.either._
import cats.syntax.parallel._

def checkParses(p: (String, String)): EitherNel[Throwable, (Int, Int)] =
  (parseInt(p._1).toEitherNel, parseInt(p._2).toEitherNel).parTupled

def parse(input: List[(String, String)]): EitherNel[Throwable, List[(Int, Int)]] =
  input.parTraverse(fields =>
    checkParses(fields).flatMap(checkValues(_).toEitherNel)
  )

We can switch to the par versions of our applicative operators, like traverse or tupled, when we want to accumulate errors, but otherwise we're working in Either, which gives us monadic binding, and we no longer have to refer to Validated at all.
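
The difference between the plain and par operators can be illustrated with a small standard-library sketch. This is not how Cats implements Parallel; traverseFailFast, traverseAccumulating, and positive are hypothetical names, and errors are modelled as a plain List[String] rather than a NonEmptyList.

```scala
// Illustrative stand-in only: contrasts the behaviour of a sequential
// traverse with an accumulating one, using plain Either.
object SequentialVsParallel {
  type Result[A] = Either[List[String], A]

  // Behaves like `traverse` on Either: the result keeps only the first Left.
  def traverseFailFast[A, B](xs: List[A])(f: A => Result[B]): Result[List[B]] =
    xs.foldRight[Result[List[B]]](Right(Nil)) { (x, acc) =>
      for { h <- f(x); t <- acc } yield h :: t
    }

  // Behaves like `parTraverse` on EitherNel: errors from every Left are
  // concatenated in list order.
  def traverseAccumulating[A, B](xs: List[A])(f: A => Result[B]): Result[List[B]] =
    xs.foldRight[Result[List[B]]](Right(Nil)) { (x, acc) =>
      (f(x), acc) match {
        case (Right(h), Right(t)) => Right(h :: t)
        case (Left(e1), Left(e2)) => Left(e1 ++ e2)
        case (Left(e), _)         => Left(e)
        case (_, Left(e))         => Left(e)
      }
    }

  // Toy check used only to produce some failures.
  def positive(n: Int): Result[Int] =
    if (n > 0) Right(n) else Left(List(s"$n is not positive"))
}
```

For the input List(1, -2, -3), the fail-fast version reports only the error for -2, while the accumulating version reports the errors for both -2 and -3. In both cases the value inside is still an Either, so flatMap remains available for dependent steps.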

Travis Brown