5

Is there a way to declaratively initialize a dictionary from an array is swift? I'm looking for something like this:

struct MyStruct {

   var key: Int
   var value: String
}

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 1, value: "b")]
let dict = array.someTransform { // Some arguments
   // Some transformation
}

So that dict is of type [Int: String]?

Note: I'm not looking for a solution including forEach since from this task's point of view it's just a more sophisticated version of for loop.

user3581248
  • 964
  • 10
  • 22
  • Compare https://stackoverflow.com/questions/24116271/whats-the-cleanest-way-of-applying-map-to-a-dictionary-in-swift for a similar question. As of Swift 3, there is no built-in method to create a dictionary from an array (or sequence) of key/value pairs. That might change with Swift 4. – Martin R May 29 '17 at 14:52
  • `forEach` really isn't a more sophisticated version of a `for` loop – in fact it's *less* sophisticated in a number of different ways (e.g can't do pattern matching or `where` clauses, difficult to break out of). – Hamish May 29 '17 at 14:55
  • 1
    Now I found it: [SE-0165 Dictionary & Set Enhancements](https://github.com/apple/swift-evolution/blob/master/proposals/0165-dict.md). – Martin R May 29 '17 at 14:57

3 Answers3

16

Dictionary's sequence initialiser

In Swift 4, assuming that the keys are guaranteed to be unique, you can simply say:

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 1, value: "b")]

let dict = Dictionary(uniqueKeysWithValues: array.lazy.map { ($0.key, $0.value) })

print(dict) // [0: "a", 1: "c"]

This is using the init(uniqueKeysWithValues:) initialiser from SE-0165. It expects a sequence of key-value tuples, where the keys are guaranteed to be unique (you'll get a fatal error if they aren't). So in this case, we're applying a lazy transform to the elements in your array in order to get a lazy collection of key-value pairs.

If the keys aren't guaranteed to be unique, you'll need some way of deciding which of the possible values to use for the given key. To do this, you can use the init(_:uniquingKeysWith:) initialiser from the same proposal, and pass a given function to determine which value to use for a given key upon a duplicate key arising.

The first argument to the uniquingKeysWith: function is the value that's already in the dictionary, the second is the value attempting to be inserted.

For example, here we're overwriting the value each time a duplicate key occurs in the sequence:

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 0, value: "b"),
             MyStruct(key: 1, value: "c")]

let keyValues = array.lazy.map { ($0.key, $0.value) }
let dict = Dictionary(keyValues, uniquingKeysWith: { _, latest in latest })

print(dict) // [0: "b", 1: "c"]

To keep the first value for a given key, and ignore any subsequent values for the same key, you'd want a uniquingKeysWith: closure of { first, _ in first }, giving a result of [0: "a", 1: "c"] in this case.


Reduce with an inout accumulator

Another possible option in Swift 4, assuming you wish to merge any duplicate keys by overwriting the value at each occurrence of the given key is to use reduce(into:_:), introduced in SE-0171.

Unlike reduce(_:_:), this method uses an inout parameter for the accumulator in the combination function. This allows it to avoid the unnecessary copying of the accumulator that would otherwise occur at each iteration of reduce(_:_:) when populating a dictionary accumulator. This therefore allows us to populate it in linear, rather than quadratic time.

You can use it like so:

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 0, value: "b"),
             MyStruct(key: 1, value: "c")]

let dict = array.reduce(into: [:]) { $0[$1.key] = $1.value }

print(dict) // [0: "b", 1: "c"]


// with initial capacity to avoid resizing upon populating.
let dict2 = array.reduce(into: Dictionary(minimumCapacity: array.count)) { dict, element in
    dict[element.key] = element.value
}

print(dict2) // [0: "b", 1: "c"]
Hamish
  • 78,605
  • 19
  • 187
  • 280
1

using reduce

let dict = array.reduce([:]) { (d, s) -> [Int:String] in
    var d = d
    d[s.key] = s.value
    return d
}

as mentioned by @Martin R, this is not the best performer, but very easy to use. @Hamish's extension is nice, at least the same performance give you a little bit simpler

var dict:[Int:String] = [:]
for s in array {
    dict[s.key] = s.value
}

Yes, I see, that you would like to avoid forEach version, but in reality, it is a good and powerful solution.

var dict:[Int:String] = [:]
array.forEach {
    dict[$0.key] = $0.value
}

Making things as simple as possible reducing a chance to make unwanted side effect (bug)

Defining minimum capacity

var dict = Dictionary<Int,String>(minimumCapacity: array.count)
array.forEach {
    dict[$0.key] = $0.value
}

you have the best performer.

to compare the solutions

do {
    let start = Date()
    let dict = Dictionary(uniqueKeysWithValues: array.lazy.map { ($0.key, $0.value) })
    let time = start.timeIntervalSince(Date())
    print(1,time, dict.count)
}

do {
    let start = Date()
    var dict = Dictionary<Int,String>(minimumCapacity: array.count)
    array.forEach {
        dict[$0.key] = $0.value
    }
    let time = start.timeIntervalSince(Date())
    print(2,time, dict.count)
}

it prints on my computer

1 -1.93269997835159 10000000
2 -1.80712699890137 10000000

I like the idea of @Hamish using inout parameter for the accumulating function. I tested it with the same data set

do {
    let start = Date()
    let dict = array.reduce(into: Dictionary(minimumCapacity: array.count)) { dict, element in
        dict[element.key] = element.value
    }
    let time = start.timeIntervalSince(Date())
    print(3,time, dict.count)
}

I expected the same performance as the others above but unfortunately, it prints

3 -3.80046594142914 10000000

It looks like it needs cc twice a time to perform the same job.

user3441734
  • 16,722
  • 2
  • 40
  • 59
  • 1
    That is actually the least efficient solution: Temporary dictionaries are created for every iteration. Compare https://stackoverflow.com/a/28502842/1187415 and the following comments, or this article: https://airspeedvelocity.net/2015/08/03/arrays-linked-lists-and-performance/. – Martin R May 29 '17 at 16:19
  • @MartinR you absolutely right. For the same performance as Hamish's solution, I prefer simpler solutions. – user3441734 May 29 '17 at 17:19
  • That is interesting that `reduce(into:_:)` performs slower – I can't immediately think where the exact cause of the slowdown is. Although, unless to be used specifically in a high performance application, I don't think a difference by a factor of 2x is really that bad for most normal use cases. – Hamish May 30 '17 at 09:36
  • @Hamish absolutely – user3441734 May 30 '17 at 10:25
  • @Hamish most likely the reason is because inout isn't a pass-by-reference, it's just a mutable shadow copy of the parameter that's written back to the caller when the function exits – user3441734 Jun 02 '17 at 03:58
  • @user3441734 That was true in an earlier version of the language (Swift 2 I believe), however now `inout` parameters are compiled to a pass-by-reference (this can be seen by examining the SIL or IR emitted). But semantically, they should still be treated as copy-in copy-out. – Hamish Jun 02 '17 at 06:35
  • Did you test this in a release build with optimizations enabled? – Alexander Feb 04 '19 at 18:30
1

You could also extend the Dictionary type to add an initializer that accepts an array of tuples:

extension Dictionary
{
  init(_ keyValues:[(Key,Value)] )
  {
     self.init(minimumCapacity: keyValues.underestimatedCount) // self.init()
     for (key,value) in keyValues { self[key] = value }
  }
}

struct MyStruct {

   var key: Int
   var value: String
}

let array = [MyStruct(key: 0, value: "a"), MyStruct(key: 1, value: "b")]
let dict = Dictionary(array.map{($0.key,$0.value)})

The for loop would still be there but only within the Dictionary type and would not require boiler plate code to build it form an array in the various places where you need such an initialization.

[EDIT] changed init to use minimum capacity as suggested by 3441734. This should make it as fast as #1. I feel, however that this optimization sacrifices a bit of simplicity for the sake of a very rare use case where such initializations would be a key performance factor.

Alain T.
  • 40,517
  • 4
  • 31
  • 51