0

I want to make some optimzation to foldLeft using in-memmory functions using some collections Considering the following Code:


val buffer = List.fill(10000)(Random.nextInt(10))

def `with list appending`() = buffer.foldLeft(List[Int](), List[Int]()) { case (_@(even, odd), currNum) => {
  if (currNum % 2 == 0) (even :+ currNum, odd)
  else (even, odd :+ currNum)
}
}

def `with list pre appending` = {
  val (evenList,oddList) = buffer.foldLeft(List[Int](), List[Int]()) { case (_@(even, odd), currNum) => {
    if (currNum % 2 == 0) (currNum :: even, odd)
    else (even, currNum :: odd)
  }
  }
  (evenList.reverse, oddList.reverse)
}

def `with seq appending` = buffer.foldLeft(Seq[Int](), Seq[Int]()) { case (_@(even, odd), currNum) => {
  if (currNum % 2 == 0) (even :+ currNum, odd)
  else (even, odd :+ currNum)
}
}

def `with mutable list appending` = buffer.foldLeft(mutable.MutableList[Int](), mutable.MutableList[Int]()) { case (_@(even, odd), currNum) => {
  if (currNum % 2 == 0) (even :+ currNum, odd)
  else (even, odd :+ currNum)
}
}
  1. Does each time that result is aggregated with List, the whole collection is copied?
  2. Does :+ on Seq copying to new Seq or just appended element in the end? probably - O(1)?
  3. Does :+ on List copying to new List or just appending element in the end? probably - O(n)?
  4. Does mutableList is faster then foldLeft with List since there no copying on each aggregation?
  5. Do you recommend to foldLeft in another way for better performance? Thanks!
Zvi Mints
  • 1,072
  • 1
  • 8
  • 20

3 Answers3

3

.foldLeft is implemented using a while loop or tail recursion - so its performance depends on iteration speed.

List copies everything on appending (:+) and just creates a cons instance on prepending (O(1)).

Seq might be everything so you have no guarantees about the performance.

For building a List ListBuffer would a good option, it should be similar to a mutable list.

Using .foldLeft to build a List of any sort is almost always a bad idea, it almost always could be replaced with .map, .filter, .groupBy, .flatMap and if you want to avoid intermediate representations then .view would also some handy.

All your examples could be replaced with

buffer.partition(_ % 2 == 0)

and I would expect that it would have a better at at worst case comparable performance to a hand-rolled function.

If the example would be more complex I would suggest

  • using a specific collection (never Seq)
  • checking if there is some build-in what does what you want as it probably is optimized to do what is says and fast
  • using Java Microbenchmark Harness
Mateusz Kubuszok
  • 24,995
  • 4
  • 42
  • 64
  • Regarding the parition, In my other application the case is a little bit harder, with more tuples so its not a such a good solution that can fit my product - I updated times for the ListBuffer in the comment before, its still run slower - im missing somthing? – Zvi Mints Jan 13 '21 at 09:36
3

First you need to read the documentation which will tell you that appending to a list is linear time and therefore slow.

Then you need to remember that the JVM has a JIT compiler so you won't get useful performance figures without warming up the code first.

Finally, if performance is really that critical then write your own recursive routine for this, rather than using a library method, so that you avoid overheads.

def recursive(buffer: List[Int]) = {
  @annotation.tailrec
  def loop(rem: List[Int], even: List[Int], odd: List[Int]): (List[Int], List[Int]) =
    rem match {
      case Nil =>
        (even.reverse, odd.reverse)
      case i :: tail =>
        if (i % 2 == 0) {
          loop(tail, i :: even, odd)
        } else {
          loop(tail, even, i :: odd)
        }
    }

  loop(buffer, Nil, Nil)
}
Tim
  • 26,753
  • 2
  • 16
  • 29
0

Update: I see that

import scala.collection.mutable
import scala.collection.mutable.ListBuffer
import scala.util.Random
def measureTime[T](block: => T) = {
  val t0 = System.currentTimeMillis
  block
  val t1 = System.currentTimeMillis
  Console println (s"Operation took ${t1 - t0} mills")
}

val buffer = List.fill(10000)(Random.nextInt(10))

def `with list appending`() = buffer.foldLeft(List[Int](), List[Int]()) { case (_@(even, odd), currNum) => {
  if (currNum % 2 == 0) (even :+ currNum, odd)
  else (even, odd :+ currNum)
}
}

def `with list pre appending` = {
  val (evenList,oddList) = buffer.foldLeft(List[Int](), List[Int]()) { case (_@(even, odd), currNum) => {
    if (currNum % 2 == 0) (currNum :: even, odd)
    else (even, currNum :: odd)
  }
  }
  (evenList.reverse, oddList.reverse)
}


def `with mutable list appending` = buffer.foldLeft(mutable.MutableList[Int](), mutable.MutableList[Int]()) { case (_@(even, odd), currNum) => {
  if (currNum % 2 == 0) (even :+ currNum, odd)
  else (even, odd :+ currNum)
}
}

def `with list buffer` =  buffer.foldLeft(ListBuffer[Int](), ListBuffer[Int]()) { case (_@(even, odd), currNum) => {
  if (currNum % 2 == 0) (even += currNum, odd)
  else (even, odd += currNum)
}
}

def `with partition` = buffer.partition(_ % 2 == 0)



measureTime(`with mutable list appending`)
measureTime(`with list appending`)
measureTime(`with list pre appending`)
measureTime(`with list buffer`)
measureTime(`with partition`)

Outputs:

Operation took 4356 mills
Operation took 2375 mills
Operation took 7 mills
Operation took 5 mills
Operation took 9 mills

So the conclusion with list pre appending? how its faster than mutable list / list buffer? Question: List not copping the whole list each time foldLeft occurred?

Zvi Mints
  • 1,072
  • 1
  • 8
  • 20
  • 1
    If you care about performance you need to read the performance documentation. [this article](https://docs.scala-lang.org/overviews/collections-2.13/performance-characteristics.html) gives the relative performance of different collections and operations, from which it is obvious why you get these figures. – Tim Jan 13 '21 at 11:41
  • 1
    `measureTime` will not give you reliable results; benchmarking on the JVM isn't so easy. You should use JMH if you care whether the numbers you're getting mean anything. – Seth Tisue Jan 14 '21 at 00:45