2

I'm fairly new to Scala, so I hope you'll tolerate this question if you find it noobish :)

I wrote a function that returns a Seq of elements using yield syntax:

def calculateSomeMetrics(names: Seq[String]): Seq[Long] = {
  for (name <- names) yield {
    // some auxiliary actions
    val metrics = somehowCalculateMetrics()
    metrics
  }
}

Now I need to modify it to return a Map, so that each calculated value stays associated with its original name:

def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = { ... }

I've attempted to use the same yield-syntax but to yield a tuple instead of a single element:

def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = {
  for (name <- names) yield {
    // Everything is the same as before
    (name, metrics)
  }
}

However, the compiler infers the result type as Seq[(String, Long)], as the error message shows:

type mismatch;
  found   : Seq[(String, Long)]
  required: Map[String, Long]

So I'm wondering, what is the "canonical Scala way" to implement such a thing?

Chirlo
Vasiliy Galkin

3 Answers

8

An efficient way to create a different collection type is scala.collection.breakOut (available up to Scala 2.12). It works with Maps and with for comprehensions too:

import scala.collection.breakOut

val x: Map[String, Int] = (for (i <- 1 to 10) yield i.toString -> i)(breakOut)

x: Map[String,Int] = Map(8 -> 8, 4 -> 4, 9 -> 9, 5 -> 5, 10 -> 10, 6 -> 6, 1 -> 1, 2 -> 2, 7 -> 7, 3 -> 3)

In your case it should work too:

import scala.collection.breakOut

def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = {
  (for (name <- names) yield {
    // Everything is the same as before
    (name, metrics)
  })(breakOut)
}

Comparison with the toMap solutions: the expression before toMap first creates an intermediate Seq of Tuple2s (which incidentally might be a Map too in certain cases), and toMap then builds the Map from it. breakOut skips the intermediate Seq and builds the Map directly.

Usually this is not a huge difference in memory or CPU usage (+ GC pressure), but sometimes these things matter.
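
To make the comparison concrete, here is a minimal sketch that also compiles on Scala 2.13, where breakOut no longer exists. It swaps in a different technique, mapping over an iterator, to avoid the intermediate Seq; it is not the breakOut approach itself. `metricFor` is a hypothetical stand-in for the real calculation.

```scala
object DirectMapSketch {
  // Hypothetical stand-in for the real metric calculation (an assumption, not from the question).
  def metricFor(name: String): Long = name.length.toLong

  // Builds an intermediate Seq[(String, Long)] first, then copies it into a Map.
  def viaIntermediateSeq(names: Seq[String]): Map[String, Long] =
    names.map(name => name -> metricFor(name)).toMap

  // Maps lazily over an iterator, so toMap consumes the pairs one by one
  // and no intermediate Seq is materialized.
  def viaIterator(names: Seq[String]): Map[String, Long] =
    names.iterator.map(name => name -> metricFor(name)).toMap

  def main(args: Array[String]): Unit = {
    val names = Seq("cpu", "memory", "disk")
    println(viaIntermediateSeq(names) == viaIterator(names)) // prints true
  }
}
```

Both functions return the same Map; they differ only in whether a temporary Seq of pairs is allocated along the way.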

Gábor Bakos
  • I would stay away from using `breakOut` until this specific code becomes a bottleneck. It is counter-intuitive to read and comprehend. – Yuval Itzchakov Nov 29 '17 at 13:37
  • @YuvalItzchakov I agree, that is why I added the last sentence. – Gábor Bakos Nov 29 '17 at 13:56
  • In my case the gain is not big enough to outweigh the possible "WTFs" that using `breakOut` can cause. On the other hand, I've learned something new today, so thanks for pointing it out :) – Vasiliy Galkin Nov 29 '17 at 14:18
  • Note that [breakout is going away](http://www.scala-lang.org/blog/2017/05/30/tribulations-canbuildfrom.html#breakout-escape-hatch) in the new collections design. – Karl Bielefeldt Nov 29 '17 at 15:27
6

Either:

def calculateSomeMetrics(names: Seq[String]): Map[String, Long] = {
  (for (name <- names) yield {
    // Everything is the same as before
    (name, metrics)
  }).toMap
}

Or:

names.map { name =>
  // doStuff
  (name, metrics)
}.toMap
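
For reference, a self-contained sketch of the first variant that can be compiled as-is. `computeMetric` is a hypothetical placeholder for the calculation elided in the question, so treat its body as an assumption.

```scala
object MetricsByName {
  // Hypothetical placeholder for the calculation elided in the question (an assumption).
  def computeMetric(name: String): Long = name.length.toLong

  def calculateSomeMetrics(names: Seq[String]): Map[String, Long] =
    (for (name <- names) yield {
      val metrics = computeMetric(name)
      (name, metrics) // each yielded element is a (key, value) pair
    }).toMap // converts the Seq[(String, Long)] into a Map[String, Long]

  def main(args: Array[String]): Unit = {
    println(calculateSomeMetrics(Seq("alpha", "be")))
  }
}
```

One thing to keep in mind: if the same name appears more than once in the input, toMap keeps only the pair from the last occurrence, so the resulting Map can be smaller than the input Seq.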
Yuval Itzchakov
1

Here are several links that other people pointed me to or that I found later on; I'm just assembling them in a single answer for future reference.

Vasiliy Galkin
  • Thanks for the mention but Gábor's answer is older. I was just trying to quickly look for a duplicate and found the linked related question. – Michał Politowski Nov 29 '17 at 15:18