This question is a small twist on a commonly asked one.
The goal is to filter a 2D array to remove element pairs whose first-column values are duplicates. For example:
[[1, 1, 1, 2, 2, 3, 3, 3, 3], [0.1, 0.15, 0.2, 0.05, 0.1, 0.2, 0.25, 0.3, 0.35]]
-> [[1, 2, 3],[0.2, 0.1, 0.35]]
Since the second-column values vary, some policy has to decide which pair survives the filtering: here, the last value in each run of duplicates is kept.
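For concreteness, that "keep the last value for each key" rule could be written imperatively as a reference point (dedupeKeepingLast is just an illustrative name, not part of either solution below):
// Imperative baseline: later duplicates overwrite earlier ones,
// so each key ends up mapped to its last value.
func dedupeKeepingLast(_ raw: [[Float]]) -> [[Float]] {
    var lastValue: [Float: Float] = [:]
    for (key, value) in zip(raw[0], raw[1]) {
        lastValue[key] = value
    }
    let keys = lastValue.keys.sorted()
    return [keys, keys.map { lastValue[$0]! }]
}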
One of the myriad answers to this related question -- a functional programming solution by Tim MB -- can be adapted to the task:
// Use FP-style filtering to eliminate repeated elements:
// keep a pair only if its index is the last occurrence of its key.
let rawArray: [[Float]] = [...]
let filteredArray = rawArray
    .transpose
    .enumerated()
    .filter { rawArray[0].lastIndex(of: $0.1[0]) == $0.0 }
    .map { $0.1 }
    .transpose
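Note that transpose is not in the standard library; the snippets here assume an extension on nested arrays, something like this minimal sketch for rectangular input:
// Assumed helper (not in the standard library): transpose a rectangular
// 2D array by reading each column across every row.
extension Array where Element == [Float] {
    var transpose: [[Float]] {
        guard let firstRow = self.first else { return [] }
        return firstRow.indices.map { column in
            self.map { row in row[column] }
        }
    }
}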
However, elegant as it is, this lastIndex-based solution is rather slow.
A faster solution that keeps the FP spirit is to use dictionary hashing:
// Use the array -> dictionary -> array trick to remove repeated elements:
// later duplicates overwrite earlier ones, so the last value wins.
let rawArray: [[Float]] = [...]
let filteredArray = rawArray
    .transpose
    .reduce(into: [Float: Float]()) { dict, elements in
        dict[elements[0]] = elements[1]
    }
    .sorted { $0.key < $1.key }
    .map { [$0.key, $0.value] }
    .transpose
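For anyone who wants to reproduce the timing comparison, a crude harness along these lines should do; the input shape (runs of ten duplicate keys) is my own assumption here, not the data I originally measured with:
import Foundation

// Crude timing harness (illustrative): build a large input with many
// duplicate keys and time one run of a filtering closure.
func timeRun(_ label: String, _ body: ([[Float]]) -> [[Float]]) {
    let keys = (0..<100_000).map { Float($0 / 10) }    // runs of 10 duplicate keys
    let values = keys.indices.map { Float($0) / 100 }
    let start = CFAbsoluteTimeGetCurrent()
    _ = body([keys, values])
    print(label, CFAbsoluteTimeGetCurrent() - start, "seconds")
}
Each pipeline can then be passed in as a closure, e.g. timeRun("dictionary version") { raw in ... }.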
My questions are:
- Is this dictionary trick a good idea, given that it uses Floats as keys?
- Why is the FP solution slow? Can it be sped up?
- Are there better alternatives?