6

Following up this question I wonder why maxBy of Traversable[T] returns a single value T instead of a sequence of T (list or similar). It looks like a pretty common case. For example (from the previous question):

For the list of students with their grades

List(Student("Mike", "A"), Student("Pete", "B"), Student("Paul", A))"

I want to get

List(Student("Mike", "A"), Student("Paul", A))

Does anybody know about any standard implementation of maxBy, which returns a sequence of found items?

Community
  • 1
  • 1
Michael
  • 10,185
  • 12
  • 59
  • 110
  • 1
    Wouldn't that be (descending) sorting? –  Nov 22 '11 at 19:34
  • @delnan No, I need O(N) algorithm. – Michael Nov 22 '11 at 19:42
  • Please clarify. I read your current question as asking for a method providing the n largest items from a collection of size m. That's obviously not possible in O(n) and generally pretty much like sorting. –  Nov 22 '11 at 19:45
  • @delnan I have added an example to the question. – Michael Nov 22 '11 at 19:54
  • 2
    So you actually want to get all items equal to the largest item? Then just state that unambiguously! Anyway, I think it's not possible in O(N) worst-case (which is often all people care about) as the number of largest items isn't constant. –  Nov 22 '11 at 19:59
  • 2
    @delnan sure it is. See my answer, which traverses the list exactly once. (Assuming of course that the comparison function is *O(1)* time complexity) – Dan Burton Nov 22 '11 at 20:39
  • @DanBurton complexity isn't dependent on how many times you traverse the list. It's how the computation scales with list length. – Luigi Plinge Nov 22 '11 at 21:50
  • @LuigiPlinge: yes, that's correct. I was implicitly referring to the fact that O(1) work was performed at each step of the traversal in my (Haskell...whoops :P) implementation, hence the total time complexity was O(n), where n is the length of the list. – Dan Burton Nov 22 '11 at 22:25
  • Yes, the complexity thing was a brain fart on my side. –  Nov 23 '11 at 13:33
  • Useful. Would like to see this is the standard library, `maxsBy` perhaps.. – Ben Hutchison Feb 22 '15 at 05:51

4 Answers4

7

There is no single command. The shortest I know of--which will group everything not just the maximum value as an intermediate--is

xs.groupBy(f).maxBy(_._1)._2

For greater efficiency, folds are good general-purpose tools for finding sums and maxima and various similar things. Basically, any time you need to run over your collection while accumulating some answer, use a fold. In this case,

(xs.head /: xs.tail) {
  (biggest, next) => if (f(biggest) < f(next)) next else biggest
}

will perform maxBy(f) if you don't mind re-evaluating the function twice for each element, while

((xs.head, f(xs.head)) /: xs.tail) {
  case (scored, next) =>
    val nextscore = f(next)
    if (scored._2 < nextscore) (next, nextscore)
    else scored
}._1

will do it with only one evaluation per element. If you want to keep a sequence, you can modify this to

(Seq(xs.head) /: xs.tail) {
  (bigs, next) =>
    if (f(bigs.head) > f(next)) bigs
    else if (f(bigs.head) < f(next)) Seq(next)
    else bigs :+ next
}

to keep the list (the corresponding single-evaluation form is left as an exercise to the reader).

Finally, even the near-maximum efficiency version isn't all that hard to manage, if you're willing to use a few mutable variables (hopefully well-hidden in a code block like I have here)

val result = {
  var bigs = xs.take(0).toList
  var bestSoFar = f(xs.head)
  xs.foreach{ x =>
    if (bigs.isEmpty) bigs = x :: bigs
    else {
      val fx = f(x)
      if (fx > bestSoFar) {
        bestSoFar = fx
        bigs = List(x)
      }
      else if (fx == bestSoFar) bigs = x :: bigs
    }
  }
  bigs
}

(this will return in reverse order, incidentally).

Rex Kerr
  • 166,841
  • 26
  • 322
  • 407
3

There is no function in the standard libraries that I am aware of.

maxBy' :: (a -> a -> Ordering) -> [a] -> [a]
maxBy' _ [] = undefined
maxBy' f (x:xs) = foldr step [x] xs
  where step y acc@(z:_) = case f y z of
          GT -> [y]
          EQ -> y:acc
          LT -> acc

[edit] Whoops, this is a Scala question :)

Translated to Scala, given a list xs and a comparator compare:

(List(xs.head) /: xs.tail) { (acc, y) =>
  y compare acc.head match {
    case 1  => List(y)
    case 0  => y :: acc
    case -1 => acc
  }
}
Dan Burton
  • 53,238
  • 27
  • 117
  • 198
  • 4
    Why are you writing a Haskell answer to a Scala question? There are some important differences in what is available in the library and how easy or hard it is to write a function operating on collections in various different ways and in which operators one uses to do comparisons and so on. – Rex Kerr Nov 22 '11 at 20:16
  • @RexKerr: whoa I had a brainfart. I saw Traversible and maxBy, and my brain went into Haskell mode. I accessed this from my feed reader, and thought it was a Haskell question for some reason. – Dan Burton Nov 22 '11 at 22:20
3

If you have

case class Student(name: String, grade: String)
val students = List(Student("Mike", "A"), Student("Pete", "B"), Student("Paul", "A"))

then this is a pretty simple O(N) solution, which doesn't involve building any intermediate lists:

val bestGrade = students.minBy(_.grade).grade
students.filter(_.grade == bestGrade)    //List(Student(Mike,A), Student(Paul,A))

We use minBy here because of the ordering of Strings.

As a method:

def multiMinBy[A,B](xs: Traversable[A])(f: A => B)(implicit ord: Ordering[B]) = {
  val minVal = f(xs minBy f)
  xs filter (f(_) == minVal)
}

scala> multiMinBy(students)(_.grade)
res26: Traversable[Student] = List(Student(Mike,A), Student(Paul,A))
Luigi Plinge
  • 50,650
  • 20
  • 113
  • 180
1

Student and List of Student:

class Student (val name: String, val grade: String) {
  override def toString = grade + "::" + name
}
val students = List (new Student ("Mike", "A"), new Student ("Pete", "B"), new Student ("Paul", "A"))

Functional, tairecursive solution, parametrized over List of T and a method to compare 2 T's:

// ext: extreme, o: other, s:sample(student)
@tailrec
def collectExtreme [T](l: List[T], ext: ((T, T) => Int), carry: List[T]=List.empty) : List[T] =
  l match {
    case Nil => carry
    case s :: xs => carry match {
      case Nil => collectExtreme (xs, ext, List (s))
      case o :: _ => ext (s, o) match {
        case 0 => collectExtreme (xs, ext, s :: carry)
        case -1=> collectExtreme (xs, ext, l)
        case 1 => collectExtreme (xs, ext, carry)
      }
    }
  }
def cmp (s: Student, o: Student): Int = s.grade(0) - o.grade(0) 

collectExtreme (students, cmp) 

Runs only 1 times over the collection too.

user unknown
  • 35,537
  • 11
  • 75
  • 121