
I want to merge multiple maps that are in a list. Each map has two key-value pairs.

What I have...

val input: List[Map[String, String]] = List(
  Map("a" -> "b", "c" -> "d"),
  Map("a" -> "b", "c" -> "e"),
  Map("a" -> "f", "c" -> "h")
)

What I want...

val output: Map[String, List[String]] =
  Map("b" -> List("d", "e"), "f" -> List("h"))

I've researched but the closest I could find was this (Scala: Merge maps by key), which is not the scenario I am looking for. Ideally, I would appreciate an explanation rather than just a line of code. I know this can be done with for-loops, but I am trying to learn the Scala way of merging maps.

EDIT: After some discussion in the comments, I decided to simplify the question a bit. The keys 'a' and 'c' are static/not relevant/can be hard coded.

The goal is to make new maps, where the value associated with key 'a' becomes the key, and the value associated with key 'c' becomes the value. Once all the new maps are made, the ones with the same key can be grouped together, and all their values can be placed in a list.

  • Why does, for example, `f` become the key and `h` the value and not the other way around? `Maps` are, by definition, unordered so you need to specify what rules are used to determine which value becomes a key and which value becomes a member of the value list. – jwvh Mar 19 '17 at 03:47
  • Many of the 'f' values (or b) are repeated, and hence it is cleaner to make them the key. All of the 'h' values (or d/e) are unique, hence why they can be grouped in a list. – D Williams Mar 19 '17 at 03:54
  • Sorry for deleting comment, that was an accident. Okay I understand now. Thanks for the explanation. – Tanjin Mar 19 '17 at 04:00
  • @Tanjin no problem – D Williams Mar 19 '17 at 04:03
  • I still don't see it. `d`, `e`, `f`, and `h` are all unique in the example but, for some reason, `f` is supposed to become a key and the others are collected into `List` values. What's the logic? – jwvh Mar 19 '17 at 04:04
  • @jwvh look at the original Maps, the keys are irrelevant, the ultimate goal is to merge based on values. Focus on the values. – Tanjin Mar 19 '17 at 04:06
  • It is important to note that in the input, we are looking at 3 independent maps. The value 'b' appears in separate maps (associated with the key 'a'), and as such, the values associated with the second key 'c' are all to be grouped together. When we look at the third map, there is no value 'b' for the key 'a', and as such, the value associated with key 'c' does not belong in the same grouping as the previous ones. – D Williams Mar 19 '17 at 04:09
  • @Tanjin correct. The keys are not relevant. – D Williams Mar 19 '17 at 04:09
  • @DWilliams, "associated with the second key", but `Map` keys are unordered. There is no 1st or 2nd key unless you specify that they are to be retrieved in some sorted/ordered manner. If there's a logic why `f` becomes a key and `h` does not, you haven't stated it. – jwvh Mar 19 '17 at 04:13
  • @jwvh I see your argument. I understand that map keys are unordered, and in this case the value for the key 'a' will always be fetched first, it is hard coded. – D Williams Mar 19 '17 at 04:18

2 Answers


The idea is to first extract all the (key, value) pairs before using groupBy and finally mapping the values:

val input: List[Map[String, String]] = ...

val res: Map[String, List[String]] =
  input
    .flatten                        // List[(String, String)]
    .groupBy { case (k, _) => k }   // Map[String, List[(String, String)]]
    .mapValues(_.map { case (_, v) => v })   // Map[String, List[String]]
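
As a rough check, this is what the pipeline yields on the sample input from the question. It is only a sketch: the names `sample` and `byOriginalKey` are made up for illustration, and the last step uses `.map` on the grouped map instead of `.mapValues`, because `mapValues` is deprecated from Scala 2.13 onward and returns a view there. Note that flattening first groups by the original keys "a" and "c", not by the value stored under "a".

```scala
// Sample input copied from the question.
val sample: List[Map[String, String]] = List(
  Map("a" -> "b", "c" -> "d"),
  Map("a" -> "b", "c" -> "e"),
  Map("a" -> "f", "c" -> "h")
)

val byOriginalKey: Map[String, List[String]] =
  sample
    .flatten                        // every (key, value) pair from every map
    .groupBy { case (k, _) => k }   // grouped by the ORIGINAL key: "a" or "c"
    .map { case (k, pairs) => k -> pairs.map { case (_, v) => v } }

// byOriginalKey == Map("a" -> List("b", "b", "f"), "c" -> List("d", "e", "h"))
```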
    
Jean Logeart

OK, try this.

val input: List[Map[String, String]] = List( Map("a" -> "b", "c" -> "d")
                                           , Map("a" -> "b", "c" -> "e")
                                           , Map("a" -> "f", "c" -> "h")
                                           )

input.map(m => (m("a"), m("c"))) //List((b,d), (b,e), (f,h))
     .groupBy(_._1)              //Map(b -> List((b,d), (b,e)), f -> List((f,h)))
     .mapValues(_.map(_._2))     //Map(b -> List(d, e), f -> List(h))
  1. retrieve the values and put them in a tuple
  2. make the 1st element a key to the tuple(s)
  3. un-tuple by extracting the 2nd element
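
The three steps above can also be sketched as a small function that skips any map missing one of the two keys, instead of throwing as `m("a")` would. The name `mergeByA` is hypothetical, the keys "a" and "c" are the hard-coded ones from the question, and the final `.map` stands in for `.mapValues` so that the sketch should compile on both Scala 2.12 and 2.13.

```scala
// Hypothetical helper: pair up the values under "a" and "c", then group by the first.
def mergeByA(maps: List[Map[String, String]]): Map[String, List[String]] =
  maps
    .flatMap { m =>
      // Step 1: build the (key, value) tuple; a map lacking "a" or "c" yields None and is dropped.
      for {
        k <- m.get("a")
        v <- m.get("c")
      } yield (k, v)
    }
    .groupBy { case (k, _) => k }                                   // Step 2: group tuples by their 1st element
    .map { case (k, pairs) => k -> pairs.map { case (_, v) => v } } // Step 3: keep only the 2nd element

val merged = mergeByA(List(
  Map("a" -> "b", "c" -> "d"),
  Map("a" -> "b", "c" -> "e"),
  Map("a" -> "f", "c" -> "h")
))
// merged == Map("b" -> List("d", "e"), "f" -> List("h"))
```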
jwvh
  • This works! I understand what is happening in each step (thanks to your in-line comments), but where can I learn more about the notation that you've used inside the parentheses? For instance, I do not understand what this means: (_._1) – D Williams Mar 19 '17 at 16:23
  • @DWilliams, The 1st underscore is shorthand for the argument passed to the anonymous function (lambda) i.e. a stand-in for each element of the current collection (a `List` or `Map` for example). The 2nd underscore w/ number is the mechanism for accessing/extracting tuple elements. It's mentioned [here](http://stackoverflow.com/documentation/scala/4971/tuples#t=201703192204320234954) and [here](http://stackoverflow.com/documentation/scala/930/extractors/3070/tuple-extractors#t=201703192159581456976). – jwvh Mar 19 '17 at 22:26
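
To make the placeholder notation concrete, here is a small sketch of three ways to write the same grouping function. The value names (`pairs`, `viaPlaceholder`, `viaLambda`, `viaPattern`) are made up for illustration; all three should produce the same map.

```scala
val pairs: List[(String, String)] = List(("b", "d"), ("b", "e"), ("f", "h"))

// Placeholder syntax: the first underscore stands for each element, ._1 takes its first component.
val viaPlaceholder = pairs.groupBy(_._1)

// The same function with the argument named explicitly.
val viaLambda = pairs.groupBy(pair => pair._1)

// The same function written as a pattern match on the tuple.
val viaPattern = pairs.groupBy { case (first, _) => first }
```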