Here's what I managed:
var map: [String : (bef: Int?, aft: Int?)] = [:]
for (idx, (bef, aft)) in zipWithPadding(before, after).enumerate()
where bef?.id != aft?.id {
bef.map{map[$0.id] = (idx, map[$0.id]?.aft)}
aft.map{map[$0.id] = (map[$0.id]?.bef, idx)}
}
for (val, id) in map {
switch id {
case (_, nil): print("\(val): del at \(id.bef!)")
case (nil, _): print("\(val): ins at \(id.aft!)")
default: print("\(val): mov from \(id.bef!) to \(id.aft!)")
}
}
//D: del at 3
//E: del at 4
//F: ins at 0
//B: mov from 1 to 3
//A: mov from 0 to 2
//C: mov from 2 to 1
This method is pretty much the same as the other map answers, except it has one fewer loop, and it skips values that are the same in each array. The map
here is a dictionary of Strings (the id
s in your array) and tuples. The tuples are Int
s, corresponding to the index of a given id
in your first array, and the index of that same id
in the second. The Int
s are optional: this is how we figure out what happened to each id
. If the first is nil, and the second is not, then that id
was inserted. If the second is nil, though, it was removed. And if both Int
s are not nil, then that id
has been moved from the first to the second.
The way to fill up the map is by looping through the output of the zipWithPadding
function, which is here:
func zipWithPadding <
S0: SequenceType, S1: SequenceType, E0, E1 where
S0.Generator.Element == E0, S1.Generator.Element == E1
> (s0: S0, _ s1: S1) -> AnyGenerator<(E0?, E1?)> {
var (g0, g1) :
(S0.Generator?, S1.Generator?) =
(s0.generate(), s1.generate())
return anyGenerator {
let e0: E0? = g0?.next() ?? {g0 = nil; return nil}()
let e1: E1? = g1?.next() ?? {g1 = nil; return nil}()
return (e0 != nil || e1 != nil) ? (e0, e1) : nil
}
}
I got it from here. The reason you can't use the standard library zip
is that it would terminate once either of the underlying sequences did. Here, though, the sequences are of different length. This zip function returns a generator of tuples of successive elements from its two sequence arguments. If either sequence finishes before the other, the subsequent tuples returned will have that sequence's value as nil. Here's an example:
Array(zipWithPadding([1, 2, 3], [1, 2]))
//[({Some 1}, {Some 1}), ({Some 2}, {Some 2}), ({Some 3}, nil)]
Because generators aren't guaranteed to continually return nil after they've returned nil once (a generator returns nil to signal that it's finished), you can't just keep calling the same generator for your tuple value. That's why the generator itself is set to nil once it returns nil: so that you don't call it anymore.
However, Array generators seem to return nil after the last value anyway. So, if you didn't mind undefined behaviour:
func zipWithPadding <
S0: SequenceType, S1: SequenceType, E0, E1 where
S0.Generator.Element == E0, S1.Generator.Element == E1
> (s0: S0, s1: S1) -> AnyGenerator<(E0?, E1?)> {
var (g0, g1) = (s0.generate(), s1.generate())
return anyGenerator {
let (e0, e1) = (g0.next(), g1.next())
return e0 != nil || e1 != nil ? (e0, e1) : nil
}
}
Once you have your generator to loop through, the rest of the idea is easy. The before and after ids are put into a dictionary, and if they weren't in the dictionary already, the corresponding index in the tuple is set to nil. (map[$0.id]?.aft
will return nil if $0.id
isn't in the dictionary).
In terms of efficiency, there are a few places I think this approach would work. It seems better that it's using only one loop and not two, but the custom zipWithPadding
function adds so much overhead the single loop is actually less efficient than two sequential loops. Similarly, using only one enumerate()
seems efficient, but again, the overhead isn't worth it. (it's worth noting that if the two arrays were the same length, the standard library zip
would give you a very fast option here)
This method does allow you to skip elements that are the same in both arrays, though, something that you can't do with two for loops. In some quick testing, it looks like arrays with more than a quarter of their elements in the same positions are more quickly handled by this method. This would also speed up the looping through the map afterwards: it would only contain differences, so it would be shorter, and you wouldn't need to check for changes.
UPDATE:
I was trying to figure out how to get rid of some of the overhead, especially with regards to the generator. I made a custom struct:
struct PaddedZipGenerator<G0: GeneratorType, G1: GeneratorType> : GeneratorType {
typealias E0 = G0.Element
typealias E1 = G1.Element
typealias Element = (E0?, E1?)
private var (g0, g1): (G0?, G1?)
mutating func next() -> PaddedZipGenerator.Element? {
let e0: E0? = g0?.next() ?? {g0 = nil; return nil}()
let e1: E1? = g1?.next() ?? {g1 = nil; return nil}()
return (e0 != nil || e1 != nil) ? (e0, e1) : nil
}
}
struct PaddedZip<S0: SequenceType, S1: SequenceType> : SequenceType {
typealias Generator = PaddedZipGenerator<S0.Generator, S1.Generator>
private let (s0, s1): (S0, S1)
func generate() -> PaddedZip.Generator {
return PaddedZipGenerator(g0: s0.generate(), g1: s1.generate())
}
}
func zipWithPadding<S0: SequenceType, S1: SequenceType>(s0: S0, _ s1: S1) -> PaddedZip<S0, S1> {
return PaddedZip(s0: s0, s1: s1)
}
And it seems like it worked! With some basic testing it seems like this zipWithPadding
function works quite fast. It seems to work faster than two for loops, even when both lists contain none of the same elements.