I'm writing a simple, short bit of code to demonstrate to a novice programmer how a series of random numbers converges in the mean. To do this, I've generated an array of tuples that store the size and mean of a randomly-sized and filled array. Here is the code I use to do that:

val random = new scala.util.Random()

def gen(random:scala.util.Random) = {
    val array = Array.fill(2 + random.nextInt(999)) { random.nextInt(100) }
    val sum = array.reduceLeft(_ + _)
    val mean = sum.toDouble / array.size
    (array.size, mean)
}

val array = Array.fill(10000) { gen(random) }

I then want to calculate the mean of the means of equal-sized arrays, put that in an array, and sort it by size of the original array. So, if I had an array of tuples: (2, 57), (2, 22), (2, 40), I would like a single entry of (2, (57+22+40)/3), and so on for each entry in the array.

I'm stuck on how to do this in an elegant, idiomatic, and clear way in Scala. Would someone be able to help with that? And if you have any constructive criticism of the above code, that would also help.

Thanks.

Bill Lear

1 Answer

Assuming:

def mean(xs: Iterable[Double]) = xs.sum / xs.size

Then:

array
  .groupBy(_._1)
  .mapValues(xs => mean(xs.map(_._2)))
  .toArray
  .sortBy(_._1)
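
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same grouping idea, run on the example tuples from the question plus two made-up entries of size 3. The object name `MeanBySize` and the function name `meansBySize` are hypothetical, chosen only for illustration. It swaps `mapValues` for `map` with a pattern match, because `mapValues` is deprecated on `Map` as of Scala 2.13; the result is intended to be the same.

```scala
object MeanBySize {
  // Arithmetic mean of a non-empty collection of Doubles.
  def mean(xs: Iterable[Double]): Double = xs.sum / xs.size

  // Group (size, mean) pairs by size, average the means within each group,
  // and return one (size, meanOfMeans) pair per size, sorted by size.
  def meansBySize(pairs: Array[(Int, Double)]): Array[(Int, Double)] =
    pairs
      .groupBy(_._1)
      .map { case (size, group) => (size, mean(group.map(_._2))) }
      .toArray
      .sortBy(_._1)

  def main(args: Array[String]): Unit = {
    // The three size-2 entries come from the question; the size-3 entries
    // are invented here to show that groups stay separate.
    val sample = Array((2, 57.0), (2, 22.0), (2, 40.0), (3, 10.0), (3, 20.0))
    meansBySize(sample).foreach(println)
  }
}
```

The size-2 group collapses to a single entry whose value is (57 + 22 + 40) / 3, as the question asks.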
Seth Tisue
  • Isn't `xs.map(_._2)` the same as `xs.values`? – The Archetypal Paul May 24 '11 at 08:50
  • @Paul: xs is an Array, not a Map. The type of `array.groupBy(_._1)` is `Map[Int,Array[(Int, Double)]]`. – Seth Tisue May 24 '11 at 13:23
  • I have one follow-up question. How do I use the 'mean' function that you defined on arrays of integers? I tried using it in place of the inline mean calculation in the 'gen' method, and the Scala compiler complained of a type mismatch. I tried making the 'mean' function "more generic" to accept an Array of any type, but I couldn't figure out the magic. Is there an easy way to do that? – Bill Lear May 29 '11 at 13:18
  • Um, sort of, but it involves the Numeric trait, and it's not that straightforward because the result type isn't always the same as the input type. For example, you obviously want the average of some Doubles to be a Double, but if you're averaging Ints, what answer do you want: Int, Float, or Double? I'd suggest asking it as a separate question. – Seth Tisue May 30 '11 at 15:48
  • I'd like the result type to be Double, I suppose. I posted a new question on this topic here: http://stackoverflow.com/questions/6188990/writing-a-generic-mean-function-in-scala – Bill Lear May 31 '11 at 14:21
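
As a rough illustration of the Numeric-based approach mentioned in the comments, here is a minimal sketch that always returns a Double, which is the result type the asker settled on. The object name `GenericMean` is hypothetical, and this is not necessarily the solution given in the linked question. It converts each element to Double before summing, an assumption made here to avoid integer overflow on large inputs.

```scala
object GenericMean {
  // Mean of any collection whose element type has a Numeric instance
  // (Int, Long, Float, Double, BigDecimal, ...), returned as a Double.
  // Assumes xs is non-empty; an empty input yields NaN (0.0 / 0).
  def mean[T](xs: Iterable[T])(implicit num: Numeric[T]): Double =
    xs.map(num.toDouble(_)).sum / xs.size
}
```

With this definition, `GenericMean.mean(List(1, 2, 3, 4))` evaluates to 2.5, and the same function accepts a `List[Double]` unchanged.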