2

I am looking for a collections method which splits at a given pairwise condition, e.g.

val x = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5)

implicit class RichIterableLike[A, CC[~] <: Iterable[~]](it: CC[A]) {
  def groupWith(fun: (A, A) => Boolean): Iterator[CC[A]] = new Iterator[CC[A]] {
    def hasNext: Boolean = ???

    def next(): CC[A] = ???
  }
}

assert(x.groupWith(_._1 != _._1).toList == 
  List(List("a" -> 1, "a" -> 2), List("b" -> 3), List("c" -> 4, "c" -> 5))
)

So this is sort of a recursive span.

While I'm capable of implementing the ???, I wonder

  • if something already exists in collections that I'm overseeing
  • what that method should be called; groupWith doesn't sound right. It should be concise, but somehow reflect that the function argument operates on pairs. groupWhere would be a bit closer I guess, but still not clear.
  • actually I guess when using groupWith, the predicate logic should be inverted, so I would use x.groupWith(_._1 == _._1)
  • thoughts about the types. Returning an Iterator[CC[A]] looks reasonable to me. Perhaps it should take a CanBuildFrom and return an Iterator[To]?
0__
  • 66,707
  • 21
  • 171
  • 266
  • 1
    Looks like more or less duplicate of http://stackoverflow.com/questions/7293617/split-up-a-list-at-each-element-satisfying-a-predicate-scala – om-nom-nom Oct 13 '13 at 17:36
  • @om-nom-nom with the exception that my predicate needs to compare two successive elements.. – 0__ Oct 13 '13 at 17:39
  • Conceptually it seems more similar to `span` than `groupBy` so why not just call it `spans` (as in computing every span of the list)? – Dave L. Oct 13 '13 at 18:10

3 Answers3

1

So here is my suggestion. I sticked to groupWith, because spans is not very descriptive in my opinion. It is true that groupBy has very different semantics, however there is grouped(size: Int) which is similar.

I tried to create my iterator purely based on combining existing iterators, but this got messy, so here is the more low level version:

import scala.collection.generic.CanBuildFrom
import scala.annotation.tailrec
import language.higherKinds

object Extensions {
  private final class GroupWithIterator[A, CC[~] <: Iterable[~], To](
      it: CC[A], p: (A, A) => Boolean)(implicit cbf: CanBuildFrom[CC[A], A, To])
    extends Iterator[To] {

    private val peer      = it.iterator
    private var consumed  = true
    private var elem      = null.asInstanceOf[A]

    def hasNext: Boolean = !consumed || peer.hasNext

    private def pop(): A = {
      if (!consumed) return elem
      if (!peer.hasNext)
        throw new NoSuchElementException("next on empty iterator")

      val res   = peer.next()
      elem      = res
      consumed  = false
      res
    }

 

    def next(): To = {
      val b = cbf()

      @tailrec def loop(pred: A): Unit = {
        b       += pred
        consumed = true
        if (!peer.isEmpty) {
          val succ = pop()
          if (p(pred, succ)) loop(succ)
        }
      }

      loop(pop())
      b.result()
    }
  }

 

  implicit final class RichIterableLike[A, CC[~] <: Iterable[~]](val it: CC[A])
    extends AnyVal {
    /** Clumps the collection into groups based on a predicate which determines
      * if successive elements belong to the same group.
      *
      * For example:
      * {{
      *   val x = List("a", "a", "b", "a", "b", "b")
      *   x.groupWith(_ == _).to[Vector]
      * }}
      *
      * produces `Vector(List("a", "a"), List("b"), List("a"), List("b", "b"))`.
      *
      * @param p    a function which is evaluated with successive pairs of
      *             the input collection. As long as the predicate holds
      *             (the function returns `true`), elements are lumped together.
      *             When the predicate becomes `false`, a new group is started.
      *
      * @param cbf  a builder factory for the group type
      * @tparam To  the group type
      * @return     an iterator over the groups.
      */
    def groupWith[To](p: (A, A) => Boolean)
                     (implicit cbf: CanBuildFrom[CC[A], A, To]): Iterator[To] =
      new GroupWithIterator(it, p)
  }
}

That is, the predicate is inverted as opposed to the question.

import Extensions._
val x = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5)
x.groupWith(_._1 == _._1).to[Vector]
// -> Vector(List((a,1), (a,2)), List((b,3)), List((c,4), (c,5)))
0__
  • 66,707
  • 21
  • 171
  • 266
1

You could achieve it with a fold too right? Here is an unoptimized version:

  def groupWith[A](ls: List[A])(p: (A, A) => Boolean): List[List[A]] = 
    ls.foldLeft(List[List[A]]()) { (acc, x) => 
      if(acc.isEmpty)
        List(List(x))
      else
        if(p(acc.last.head, x))
          acc.init ++ List(acc.last ++ List(x))
        else 
          acc ++ List(List(x))
    }

  val x = List("a" -> 1, "a" -> 2, "b" -> 3, "c" -> 4, "c" -> 5, "a" -> 4)    
  println(groupWith(x)(_._1 == _._1))
  //List(List((a,1), (a,2)), List((b,3)), List((c,4), (c,5)), List((a,4)))
Dave L.
  • 9,595
  • 7
  • 43
  • 69
1

You can also write a version that uses tailrec/pattern matching:

  def groupWith[A](s: Seq[A])(p: (A, A) => Boolean): Seq[Seq[A]] = {
    @tailrec
    def rec(xs: Seq[A], acc: Seq[Seq[A]] = Vector.empty): Seq[Seq[A]] = {
      (xs.headOption, acc.lastOption) match {
        case (None, _) => acc
        case (Some(a), None) => rec(xs.tail, acc :+ Vector(a))
        case (Some(a), Some(group)) if p(a, group.last) => rec(xs.tail, acc.init :+ (acc.last :+ a))
        case (Some(a), Some(_)) => rec(xs.tail, acc :+ Vector(a))
      }
    }

    rec(s)
  }
Ryan LeCompte
  • 4,281
  • 1
  • 14
  • 14