4

I need a maxBy that returns all max values in case of equality.

Here is the signature and a first implementation :

def maxBy[A, B](as: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] = 
  as.groupBy(f).toList.maxBy(_._1)._2

Example :

maxBy(Seq(("a", "a1"),("a", "a2"),("b", "b1"),("b", "b2")))(_._1)
res6: Seq[(String, String)] = List(("b", "b1"), ("b", "b2"))

Updated with @thearchetypepaul comment

  def maxBy[A, B](l: Seq[A])(f: A => B)(implicit cmp: Ordering[B]) : Seq[A] = {
    l.foldLeft(Seq.empty[A])((b, a) =>
      b.headOption match {
        case None => Seq(a)
        case Some(v) => cmp.compare(f(a), f(v)) match {
          case -1 => b
          case 0 => b.+:(a)
          case 1 => Seq(a)
        }
      }
    )
  }

Is there a better way ?

Yann Moisan
  • 8,161
  • 8
  • 47
  • 91
  • 2
    Since you have a working answer already, and you haven't identified any shortcomings in your solution that make it unsuitable for you, this is a better fit for CodeReview. – Rex Kerr Jan 14 '16 at 21:24
  • Yes. foldLeft. Add the element to the acc if equal, replace acc with Seq(element) if element greater otherwise leave acc unchanged. Linear,one traverse, doesn't build groups of all the non-maximal elements. Not in front of a computer with Scala so not tested. – The Archetypal Paul Jan 14 '16 at 21:55
  • With my head still spinning from reading Functional Programming in Scala, is there any value in representing the result as a Monad - say a trait Contenders with implementations Winner(winner: A) and Tie(winners: Seq[A])? – richj Jan 14 '16 at 22:50
  • I'd try to cache the result of `f(v)`, so to not repeatedly call `f` on the same value many times. – chi Jan 15 '16 at 13:33
  • And `b` can be `None` only the first time through. So check for `l.isEmpty` once outside the `foldLeft` then `l.tail.foldLeft(l.head){...}` and the outer match isn't needed. Then make the accumulator a pair of the list so far, and `f(v)` so you don't have to recalculate `f(v)` each time. – The Archetypal Paul Jan 15 '16 at 18:39

2 Answers2

5

(1) Ordering#compare promises to denote the three possible results by a negative, positive, or zero number, not -1, 1, or 0 necessarily.

(2) Option#fold is generally (though not universally) considered to be more idiomatic than pattern matching.

(3) You are calling f possibly multiple times per element. TraversableOnce#maxBy used to do this before it was fixed in 2.11.

(4) You only accept Seq. The Scala library works hard to use CanBuildFrom to generalize the algorithms; you might want to as well.

(5) You can use the syntactic sugar B : Ordering if you would like.

(6) You prepend to the Seq. This is faster than appending, since prepending is O(1) for List and appending is O(n). But you wind up with the results in reverse order. foldRight will correct this. (Or you can call reverse on the final result.)

If you want to allow the use of CanBuildFrom,

def maxBy[A, Repr, That, B : Ordering](elements: TraversableLike[A, Repr])(f: A => B)(implicit bf: CanBuildFrom[Repr, A, That]): That = {
  val b = bf()
  elements.foldLeft(Option.empty[B]) { (best, element) =>
    val current = f(element)
    val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
    if (result > 0) {
      b.clear()
    }
    if (result >= 0) {
      b += element
      Some(current)
    } else {
      best
    }
  }
  b.result
}

If you want to work with TraversableOnce,

def maxBy[A, B : Ordering](elements: TraversableOnce[A])(f: A => B): Seq[A] = {
  elements.foldRight((Option.empty[B], List.empty[A])) { case (element, (best, elements)) =>
    val current = f(element)
    val result = best.fold(0)(implicitly[Ordering[B]].compare(current, _))
    if (result > 0) {
      (Some(current), List(element))
    } else {
      (best, if (result == 0) element +: elements else elements)
    }
  }._2
}
Community
  • 1
  • 1
Paul Draper
  • 78,542
  • 46
  • 206
  • 285
  • The `bf` parameter should be implicit, right ? And `TraversableOnce#maxBy` doesn't seem to call f multiple times : https://github.com/scala/scala/blob/v2.11.7/src/library/scala/collection/TraversableOnce.scala#L232. BTW, thanks for such a complete answer ! – Yann Moisan Jan 15 '16 at 13:35
  • @YannMoisan, yes implicit. And, yes, `maxBy` appears to have been changed in 2.11. https://github.com/scala/scala/commit/ece18346582ffccb6d05b48b647ba5439aa2cca3. I will edit the answer. – Paul Draper Jan 15 '16 at 15:54
0

if the dataset is small the performance shouldn't concern you that much
and you can just sort, reverse, and takeWhile.

def maxBy[A,B:Ordering](l:List[A], f: A => B): List[A] = {
   l.sortBy(f).reverse match {
      case Nil    => Nil
      case h :: t => h :: t.takeWhile(x => f(x) == f(h))
   }
}

where the f should be an isomorphism on A.
and you can also save f(h) before comparison

Danny Mor
  • 1,143
  • 12
  • 12