Say I have an array of strings:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
How would I get rid of the duplicates?
Say I have an array of strings:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
How would I get rid of the duplicates?
You can use the array function contains(_:)
to check if an element is already part of the array, but that is fairly slow, and for large arrays it won’t perform well. (1.) Better to copy the entries into a Set
and use Set
operations to find and remove the duplicates. Sets are optimized to make testing for set membership fast, so if aSet.contains(item)
is a lot faster than if anArray.contains(item)
.
If you don't care about preserving the order of your items, you can simply copy your array into a set and then back to an array. However, that does mean that the items in the resulting array will be in a different order.
A function to remove duplicates from an array of strings, while preserving the order, might look like this:
func uniqueElementsFrom(array: [String]) -> [String] {
//Create an empty Set to track unique items
var set = Set<String>()
let result = array.filter {
guard !set.contains($0) else {
//If the set already contains this object, return false
//so we skip it
return false
}
//Add this item to the set since it will now be in the array
set.insert($0)
//Return true so that filtered array will contain this item.
return true
}
return result
}
If you call it with code like this:
let arrayOfStrings = ["a", "b", "a", "c", "a", "d"]
let uniqueStrings = uniqueElementsFrom(array:arrayOfStrings)
print("Unique elements from \(arrayOfStrings) = \n” +
“\(uniqueStrings)")
The output would be
Unique elements from ["a", "b", "a", "c", "a", "d"] =
[“a”, "b", "c", "d"]
However, that function only works with arrays of strings. It would be good if we could write a function that could remove duplicates from any kind of array.
This is a job for Generics. There is a catch however. Sets can only contain objects that conform to the Hashable
protocol, since Sets use hashes to make testing for set membership faster.
We can rewrite the uniqueElementsFrom(array:)
function to take any array that conforms to the Hashable
protocol using Generics. That code looks like this:
func uniqueElementsFrom<T: Hashable>(array: [T]) -> [T] {
var set = Set<T>()
let result = array.filter {
guard !set.contains($0) else {
return false
}
set.insert($0)
return true
}
return result
}
The <T: Hashable>
bit after the function name says "The rest of this function will refer to a type T which is unspecified. The only thing you can be sure of is that the type T will conform to the Hashable protocol."
This form of the uniqueElementsFrom(array:)
function will work on any array who’s elements are Hashable
.
(1.) For arrays, contains(_:)
has O(n)
performance, and so looping through an array, testing the array to see if it contains each new element with contains(_:)
has performance that's almost O(n^2)
, which is really, really bad for anything but a small array. I'm pretty sure that Set
's contains(_:)
function has constant time performance, so the whole process would have O(n)
performance.