Benchmark
Because I had too much time ;-), I asked myself, as it is with the runtime of the different approaches to get a feeling for whether or not a heavy construct is lurking behind a lightweight syntax.
So I created a micro-benchmark for measuring the running time for the three seqs
(1, 3, 3, 3, 1, 2, 2, 1)
(1, 2, 3, 4, 5, 6, 7, 8, 8, 8, 8, 8, 7, 6, 5, 4, 3, 3, 3, 2, 1, 2, 3)
(2, 2, 3, 4, 5, 6, 7, 8, 8, 8, 8, 8, 8, 8, 8, 7, 6, 5, 4, 4, 4, 4, 3, 3, 3, 2, 1)
and got the following result:
Summary
EDIT: new results (begin):
On incorporation of results in my real project I came across inconsistencies regarding the benchmarks. So I repeated the benchmark again with more warm-up laps (now 1000), so the JIT compiler gets the most out of the code. Thus was shuffling the results and born a new favorite for me: X7 (pimp my lib) = pleasure without remorse. And the List
version X8 (reverse.foldLeft) are now very fast too.
Nr (Approach) Running time (ns) Contributor
X2 (poor.reference.impl) in 15.202 ns
X1 (original while loop) in 8.166 ns
X3 (tail recursion) in 7.473 ns
X4 (recursion with ++) in 6.671 ns Peter Schmitz
X5 (simplified recursion with ++) in 6.161 ns Peter Schmitz
X6 (foldRight) in 4.083 ns tenshi
X7 (pimp my lib) in 1.677 ns Rex Kerr
X8 (reverse.foldLeft) in 1.349 ns tenshi
EDIT: new results (end)
old results:
Nr (Approach) Running time (ns) Contributor
X2 (poor.reference.impl) in 2.972.015 ns
X7 (pimp my lib) in 1.185.599 ns Rex Kerr
X3 (tail recursion) in 1.027.008 ns
X8 (reverse.foldLeft) in 643.840 ns tenshi
X6 (foldRight) in 608.112 ns ""
X1 (original while loop) in 564.726 ns
X4 (recursion with ++) in 468.478 ns Peter Schmitz
X5 (simplified recursion with ++) in 447.699 ns ""
Details
X2 (poor.reference.impl)
// in 15.202 ns
import collection.mutable.ArrayBuffer
def join2(seq: Seq[X]): Seq[Seq[X]] = {
var pp = Seq[ArrayBuffer[X]](ArrayBuffer(seq(0)))
for (i <- 1 until seq.size) {
if (seq(i) canJoin seq(i - 1)) {
pp.last += seq(i)
} else {
pp :+= ArrayBuffer(seq(i))
}
}
pp
}
X1 (while loop)
// in 8.166 ns
def join(xs: Seq[X]): Seq[Seq[X]] = {
var xss = Seq.empty[Seq[X]]
var s = xs
while (!s.isEmpty) {
val (p, r) = split(s)
xss :+= p
s = r
}
xss
}
This is the original imperative approach at the beginning of the question.
X3 (tail recursion)
// in 7.473 ns
def join(xs: Seq[X]): Seq[Seq[X]] = {
@annotation.tailrec
def jointr(xss: Seq[Seq[X]], rxs: Seq[X]): Seq[Seq[X]] = {
val (g, r) = split(rxs)
val xsn = xss :+ g
if (r.isEmpty) xsn else jointr(xsn, r)
}
jointr(Seq(), xs)
}
X4 (recursion with ++)
// in 6.671 ns
def join(seq: Seq[X]): Seq[Seq[X]] = {
if (seq.isEmpty) return Seq()
val (p, r) = split(seq)
Seq(p) ++ join(r)
}
X5 (simplified recursion with ++)
// in 6.161 ns
def join(xs: Seq[X]): Seq[Seq[X]] = if (xs.isEmpty) Seq() else {
val (p, r) = split(xs)
Seq(p) ++ join(r)
}
The simplifying is nearly the same but still a little bit faster.
X6 (foldRight)
// in 4.083 ns
def join(xs: Seq[X]) = xs.foldRight(Nil: List[List[X]]) {
case (x, (top :: group) :: rest) if x canJoin top => (x :: top :: group) :: rest
case (x, list) => (x :: Nil) :: list
}
Tried to avoid reverse
but foldRight
seems to be even a little bit worser than reverse.foldLeft
for list's.
X7 (pimp my lib)
// in 1.677 ns
import collection.generic.CanBuildFrom
class GroupingCollection[A, C, D[C]](ca: C)(
implicit c2i: C => Iterable[A],
cbf: CanBuildFrom[C, C, D[C]],
cbfi: CanBuildFrom[C, A, C]) {
def groupedWhile(p: (A, A) => Boolean): D[C] = {
val it = c2i(ca).iterator
val cca = cbf()
if (!it.hasNext) cca.result
else {
val as = cbfi()
var olda = it.next
as += olda
while (it.hasNext) {
val a = it.next
if (p(olda, a)) as += a
else { cca += as.result; as.clear; as += a }
olda = a
}
cca += as.result
}
cca.result
}
}
implicit def collections_have_grouping[A, C[A]](ca: C[A])(
implicit c2i: C[A] => Iterable[A],
cbf: CanBuildFrom[C[A], C[A], C[C[A]]],
cbfi: CanBuildFrom[C[A], A, C[A]]) = {
new GroupingCollection[A, C[A], C](ca)(c2i, cbf, cbfi)
}
// xs.groupedWhile(_ canJoin _)
X8 (reverse.foldLeft)
// in 1.349 ns
def join(xs: Seq[X]) = xs.reverse.foldLeft(Nil: List[List[X]]) {
case ((top :: group) :: rest, x) if x canJoin top => (x :: top :: group) :: rest
case (list, x) => (x :: Nil) :: list
}
Conclusion
The different approaches (X1, X3, X4, X5, X6) all play in the same league.
Because X7 (pimp my lib) allows a very concise usage xs.groupedWhile(_ canJoin _)
and cause the much necessary code can be hidden in the own util lib's, I've decided to use this in my real project.