19

From 2 lists of the form List[(Int, String)]:

val l1 = List((1,"a"),(3,"b"))
val l2 = List((3,"a"),(4,"c"))

how can I combine the Integers where the Strings are the same to get this third list:

l3 = List((4,"a"),(3,"b"),(4,"c"))

Right now I'm traversing both of the lists and adding if the strings are the same, but I think there should be a simple solution with pattern matching.

Xavier Guihot
Samuel Thomas
  • similar question: http://stackoverflow.com/questions/7076128/best-way-to-merge-two-maps-and-sum-the-values-of-same-key – Infinity Oct 01 '11 at 17:45
  • Is it just me, or does this problem seem easier to solve when you have List[(String, Int)] rather than List[(Int, String)]? – Duncan McGregor Oct 01 '11 at 20:20

8 Answers

22
val l = l1 ::: l2
val m = Map[String, Int]()
(m /: l) {
  case (map, (i, s)) => { map.updated(s, i + (map.get(s) getOrElse 0))}
}.toList // Note: Tuples are reversed.

But I suppose there is a more elegant way to do the updated part.
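
One possible way to tidy the update step, as a minimal sketch: `getOrElse` on the map replaces `map.get(s) getOrElse 0`, and `foldLeft` spells out the `/:` operator. The names `FoldMerge` and `sumByString` are made up for illustration.

```scala
object FoldMerge {
  // Hypothetical helper: sums the Ints that share a String across both lists.
  def sumByString(l1: List[(Int, String)], l2: List[(Int, String)]): Map[String, Int] =
    (l1 ::: l2).foldLeft(Map.empty[String, Int]) {
      // acc is the running Map; (i, s) is the current (Int, String) pair.
      case (acc, (i, s)) => acc.updated(s, acc.getOrElse(s, 0) + i)
    }
}
```

Calling `FoldMerge.sumByString(l1, l2).toList.map(_.swap)` would give the pairs back in `(Int, String)` form; the order of the resulting list is not guaranteed.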

Debilski
20

How about,

(l1 ++ l2).groupBy(_._2).mapValues(_.unzip._1.sum).toList.map(_.swap)

Unpacking this a little on the REPL helps to show what's going on,

scala> l1 ++ l2
res0: List[(Int, java.lang.String)] = List((1,a), (3,b), (3,a), (4,c))

scala> res0.groupBy(_._2)
res1: ... = Map(c -> List((4,c)), a -> List((1,a), (3,a)), b -> List((3,b)))

scala> res1.mapValues(_.unzip)
res2: ... = Map(c -> (List(4),List(c)), a -> (List(1, 3),List(a, a)), b -> (List(3),List(b)))

scala> res1.mapValues(_.unzip._1)
res3: ... = Map(c -> List(4), a -> List(1, 3), b -> List(3))

scala> res1.mapValues(_.unzip._1.sum)
res4: ... = Map(c -> 4, a -> 4, b -> 3)

scala> res4.toList
res5: List[(java.lang.String, Int)] = List((c,4), (a,4), (b,3))

scala> res5.map(_.swap)
res6: List[(Int, java.lang.String)] = List((4,c), (4,a), (3,b))
Miles Sabin
  • While I tend to give myself a pat on the back every time I can implement a function without a newline, that's pretty opaque! Can you give some names to the intermediates that would make it obvious that it is correct? – Duncan McGregor Oct 01 '11 at 19:45
  • I'd recommend working through each intermediate result from beginning to the end on the REPL, starting with (l1 ++ l2) then (l1 ++ l2).groupBy(_._2) ... etc. – Miles Sabin Oct 02 '11 at 08:05
  • As it happens I had - but I'm interested - would you really leave a line of code like that in a source-base, or would you split it with explaining variable names? – Duncan McGregor Oct 04 '11 at 11:59
  • @DuncanMcGregor That's a very context dependent question ... maybe yes, maybe no. – Miles Sabin Oct 05 '11 at 08:10
  • The group by was a little confusing... I prefer use `.groupBy(x => x._2)`, it does the same thing. – Jaider May 02 '13 at 03:21
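
For readers who want the intermediates named, as asked in the comments, the pipeline could be split up roughly like this. It is only a sketch: the names (`NamedSteps`, `combine`, `allPairs`, `pairsByString`) are invented here, and the `mapValues(_.unzip._1.sum)` step is swapped for a plain `map` over the grouped entries, which should give the same totals.

```scala
object NamedSteps {
  // Hypothetical name; same idea as the one-liner, with each step named.
  def combine(l1: List[(Int, String)], l2: List[(Int, String)]): List[(Int, String)] = {
    // Step 1: put both lists together.
    val allPairs: List[(Int, String)] = l1 ++ l2
    // Step 2: collect the pairs under their String component.
    val pairsByString: Map[String, List[(Int, String)]] = allPairs.groupBy(_._2)
    // Step 3: sum the Int components of each group and restore (Int, String) order.
    pairsByString.toList.map { case (s, pairs) => (pairs.map(_._1).sum, s) }
  }
}
```

The result comes out of a Map, so the order of the final list should not be relied on.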
10

With Scalaz, this is a snap.

import scalaz._
import Scalaz._

val l3 = (l1.map(_.swap).toMap |+| l2.map(_.swap).toMap).toList

The |+| method is exposed on all types T for which there exists an implementation of Semigroup[T]. And it just so happens that the semigroup for Map[String, Int] is exactly what you want.
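
To make that behaviour concrete without the library, here is a plain-Scala sketch of merging two Map[String, Int] values by summing. It illustrates the idea only and is not Scalaz's implementation; `MapMerge` and `mergeSum` are hypothetical names.

```scala
object MapMerge {
  // Hypothetical helper: keys present in both maps get their values added;
  // keys present in only one map are carried over unchanged.
  def mergeSum(a: Map[String, Int], b: Map[String, Int]): Map[String, Int] =
    b.foldLeft(a) { case (acc, (k, v)) => acc.updated(k, acc.getOrElse(k, 0) + v) }
}
```

Note that calling `toMap` on a list of pairs keeps only the last value for a repeated key, so duplicates inside a single list would not be summed by this route.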

Ven
Apocalisp
1
for ( (k,v) <- (l1++l2).groupBy(_._2).toList ) yield ( v.map(_._1).sum, k )
AmigoNico
0
val a = List(1,1,1,0,0,2)
val b = List(1,0,3,2)

scala> List.concat(a,b)
res31: List[Int] = List(1, 1, 1, 0, 0, 2, 1, 0, 3, 2)

(or) 

scala> a.:::(b)
res32: List[Int] = List(1, 0, 3, 2, 1, 1, 1, 0, 0, 2)

(or) 

scala> a ::: b
res28: List[Int] = List(1, 1, 1, 0, 0, 2, 1, 0, 3, 2)
KARTHIKEYAN.A
0

An alternative to Miles Sabin's answer uses Scala 2.13's new groupMapReduce method, which is (as its name suggests) a more efficient equivalent of a groupBy followed by mapValues and a reduce step:

(l1 ::: l2).groupMapReduce(_._2)(_._1)(_ + _).toList.map(_.swap)
// List[(Int, String)] = List((3,b), (4,a), (4,c))

This:

  • prepends l1 to l2

  • groups elements based on their second tuple part (group part of groupMapReduce)

  • maps grouped values to their first tuple part (map part of groupMapReduce)

  • reduces values (_ + _) by summing them (reduce part of groupMapReduce)

  • and finally swaps tuples' parts.

The group/map/reduce part runs in one pass through the List and is equivalent to:

(l1 ::: l2).groupBy(_._2).mapValues(_.map(_._1).reduce(_ + _)).toList.map(_.swap)
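
To illustrate what grouping, mapping and reducing in a single traversal can look like, here is a hand-written sketch that keeps a running sum per String. It is not how the standard library implements groupMapReduce; `OnePass` and `sumByString` are hypothetical names, and a LinkedHashMap is used only so that the output order follows first appearance.

```scala
import scala.collection.mutable

object OnePass {
  // Hypothetical helper: one traversal, accumulating a running sum per String.
  def sumByString(pairs: List[(Int, String)]): List[(Int, String)] = {
    val totals = mutable.LinkedHashMap.empty[String, Int]
    pairs.foreach { case (i, s) =>
      // Add the Int to whatever has been accumulated for this String so far.
      totals.update(s, totals.getOrElse(s, 0) + i)
    }
    // Restore the (Int, String) order of the input pairs.
    totals.toList.map(_.swap)
  }
}
```

With the lists from the question, `OnePass.sumByString(l1 ::: l2)` yields List((4,a), (3,b), (4,c)).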
Xavier Guihot
0

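Note that with this solution, the lists are traversed twice: once by zip and once by foldRight. It also only combines pairs that sit at the same index in both lists, since zip pairs elements by position.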

val l3 = (l1 zip l2).foldRight(List[(Int, String)]()) {
  case ((firstPair @ (firstNumber, firstWord),
        secondPair @ (secondNumber, secondWord)),
        result) =>
    if (firstWord == secondWord)
      ((firstNumber + secondNumber), firstWord) :: result
    else
      firstPair :: secondPair :: result
}
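
A quick way to see how this fold behaves is to wrap it in a function and try inputs where the equal strings are at different positions. The sketch below does that; `ZipFold` and `combine` are hypothetical names, and the logic is otherwise the fold shown above.

```scala
object ZipFold {
  // The zip-and-fold approach, wrapped in a (hypothetically named) function.
  def combine(l1: List[(Int, String)], l2: List[(Int, String)]): List[(Int, String)] =
    (l1 zip l2).foldRight(List.empty[(Int, String)]) {
      case (((n1, w1), (n2, w2)), result) =>
        // Only the two pairs at the same index are ever compared.
        if (w1 == w2) (n1 + n2, w1) :: result
        else (n1, w1) :: (n2, w2) :: result
    }
}
```

For the lists in the question this gives List((4,a), (3,b), (4,c)), but `ZipFold.combine(List((1,"a"),(3,"b")), List((4,"c"),(3,"a")))` leaves the two "a" entries uncombined.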
agilesteel
0

Another opaque two-liner of questionable efficiency yet indubitable efficacy:

val lst = l1 ++ l2
lst.map(_._2).distinct.map(i => (lst.filter(_._2 == i).map(_._1).sum, i))
Luigi Plinge