
I have a list of strings as shown below, which lists fruits and the cost associated with each. In case of no value, it is assumed to be 5:

val stringList: List[String] = List("apples 20", "oranges", "pears 10")

Now I want to split each string to get a tuple of the fruit and its cost. What is the Scala way of doing this?

stringList.map(query => query.split(" ")) 

is not what I want.

I found this which is similar. What is the correct Scala way of doing this?

navinpai

2 Answers


You could use a regular expression and pattern matching:

val Pat = """(.+)\s(\d+)""".r  // word followed by whitespace followed by number

def extract(in: String): (String, Int) = in match {
  case Pat(name, price) => (name, price.toInt)
  case _                => (in, 5)
}

val stringList: List[String] = List("apples 20", "oranges", "pears 10")

stringList.map(extract) // List((apples,20), (oranges,5), (pears,10))

You have two capturing groups in the pattern. These will be extracted as strings, so you have to convert explicitly using .toInt.
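As a variation, the default can be handled inside the pattern by making the cost group optional. The sketch below is only an illustration under assumptions: the names `FruitRegex`, `Item`, `DefaultCost`, and `parse` are hypothetical, and returning `Option` for input that does not match at all is a design choice that the question does not ask for.

```scala
import scala.util.matching.Regex

object FruitRegex {
  // A name without whitespace, optionally followed by whitespace and digits.
  // When the optional part is absent, the second group is extracted as null.
  val Item: Regex = """(\S+)(?:\s+(\d+))?""".r

  // Default cost taken from the question.
  val DefaultCost = 5

  // Returns None for input that does not match the pattern as a whole.
  def parse(in: String): Option[(String, Int)] = in.trim match {
    case Item(name, cost) =>
      // Option(null) is None, so a missing cost falls back to the default.
      Some(name -> Option(cost).map(_.toInt).getOrElse(DefaultCost))
    case _ => None
  }
}
```

With this, `stringList.flatMap(s => FruitRegex.parse(s))` keeps only the entries that match. Note that `.toInt` can still throw for a digit string too large for an `Int`.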

0__
  • Here is a bit more on pattern matching: https://stackoverflow.com/questions/4636610/how-to-pattern-match-using-regular-expression-in-scala – 0__ Apr 25 '16 at 20:04
  • `val r = """(\S+)\s*(\d+)?""".r ; def f(s: String) = s match { case r(n, null) => (n, 42) case r(n, i) => (n, i.toInt) case _ => (0,0) }` to show handling empty second group. Regex helps so much with verifying. – som-snytt Apr 26 '16 at 03:06
  • @som-snytt Thanks, my regex foo is rather limited – 0__ Apr 26 '16 at 09:06

You almost have it:

stringList.map(query => query.split(" ")) 

is what you want; just add another map to it to turn the arrays that split returns into tuples:

.map { list => list.head -> list.lift(1).getOrElse("5").toInt }

or this instead, if you prefer:

.collect {
    // split returns Array[String], so match with the Array extractor
    case Array(a, b) => a -> b.toInt
    case Array(a) => a -> 5
 }

(.collect will silently drop any entry that does not split into exactly one or two elements. You can replace it with .map if you would prefer a MatchError to be thrown in such cases.)
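Put together, the split approach might look like the sketch below. It matches on `Array` because `split` returns an `Array[String]`. The names `FruitSplit`, `DefaultCost`, and `parseAll` are hypothetical, and trimming the input and splitting on `\s+` rather than a single space are assumptions added to tolerate stray whitespace.

```scala
object FruitSplit {
  // Default cost taken from the question.
  val DefaultCost = 5

  def parseAll(items: List[String]): List[(String, Int)] =
    items
      .map(_.trim.split("\\s+")) // each element becomes an Array[String]
      .collect {
        case Array(name, cost) => name -> cost.toInt // throws if cost is not a number
        case Array(name)       => name -> DefaultCost
      }
}
```

Calling `FruitSplit.parseAll(stringList)` on the list from the question gives `List((apples,20), (oranges,5), (pears,10))`. Entries with three or more tokens are dropped by `collect`.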

Dima