7

I've had this situation occur a number of times in the library I'm writing, and I'm not particularly satisfied with the solutions I've come up with so far.

Let's say that I have an expensive function f that takes an item of type T and returns a value of type Option[U]. Now, suppose I have a collection of type T and I want to retrieve the first non-None value returned by f when performed across the elements of T without evaluating f for all elements of T if the value has already been found.

The only way I've come up with to do this is to wrap F into an Extractor Object, and use it with scala's collectFirst method.

For example:

object FMatch { def unapply(t : T) = f(t) }

collection.collectFirst{ case FMatch(result) => result }

This seems a little inelegant, and I'm not certain whether f is evaluated only once or twice for each result (I haven't tested this to find out yet). It seems like it would be useful to have a version of collectFirst that takes a parameter of type T => Option[U] instead of a PartialFunction1[T].

Is there a more elegant way to do this that I'm missing?

alain.janinm
  • 19,951
  • 10
  • 65
  • 112
Nimrand
  • 1,748
  • 2
  • 16
  • 29

4 Answers4

16

Use a view over the collection, to make it lazy and defer invocation of that function until the last possible moment (e.g. it won't be called at all for elements beyond the first match):

xs.view map {f(_)} collectFirst {case Some(x) => x}

or

xs.view map {f(_)} find {_.isDefined}

or in the point-free style, as per Alexey's response:

xs.view map {f} find {_.isDefined}

That should hopefully give you a couple of alternative ways to think about the problem more generally :)

Kevin Wright
  • 49,540
  • 9
  • 105
  • 155
10

Use this:

collection.toIterator.map(f).find(_.isDefined)
Somnath Muluk
  • 55,015
  • 38
  • 216
  • 226
Alexey Romanov
  • 167,066
  • 35
  • 309
  • 487
  • I think this is probably about as good as it's going to get, except that this actually returns a value of type Option[Option[U]]. Instead, you'd want to do collection.toIterator.map(f).collectFirst{ case Some(x) => x }. I considered using views as in the other answer, but I was concerned about the overhead, as I read they can be expensive, but I hadn't thought of using an iterator, which should be very light-weight. – Nimrand Sep 05 '11 at 17:57
  • 3
    An `Option[Option[X]]` is easily handled by simply invoking `flatten` on the result, or using `flatMap`, or a for-comprehension; the best choice depends on your circumstances, I'd probably use `flatten` for something like this. Regarding views, they *should* be every bit as lightweight as an iterator for any given collection (if not lighter). Paul Phillips located some issues with views in the past, but the fixes for these were made long ago, they'll be in 2.9.1... – Kevin Wright Sep 05 '11 at 19:03
  • 1
    Good to know that they have views are more optimal now. I won't be so hesitant to use them, then. I hadn't thought about using flatten on the result, but that makes sense. – Nimrand Sep 05 '11 at 19:35
  • @Kevin Wright: I initially wrote this with a view, but decided it could hardly be more light-weight than an iterator and could easily be less so :) Apparently I was wrong. – Alexey Romanov Sep 05 '11 at 19:50
  • @Alexey I'm hardly an oracle on this stuff, but my personal opinion is that that you should use an optimizer if performance is critical; theory be damned, it's hard to argue with cold hard numbers. If performance isn't so critical, then `view` represents the intent better than an iterator (at least, in my mind it does), it's also a tiny bit more concise. – Kevin Wright Sep 05 '11 at 22:48
2
@annotation.tailrec 
def first[A, B](as: Traversable[A], f: A => Option[B]): Option[B] = 
  if (as.isEmpty) None 
  else f(as.head) match {
    case s @ Some(_) => s
    case _ => first(as.tail, f)
  }
Jed Wesley-Smith
  • 4,686
  • 18
  • 20
1

Give a partial functions to collectFirst

collectionT
  .collectFirst { t => 
    f(t) match { 
      case Some(u) => u 
    }
Alexey Ki
  • 41
  • 1
  • 5