41

The question here involves removing duplicate objects from an array:

Removing duplicate elements from an array in Swift

I instead need to remove objects that are not themselves duplicates, but have specific duplicate properties such as id.


I have an array containing my Post objects. Every Post has an id property.

Is there a more effective way to find duplicate Post ID's in my array than

for post1 in posts {
    for post2 in posts {
        if post1.id == post2.id {
            posts.removeObject(post2)
        }
    }
}
pkamb
  • 33,281
  • 23
  • 160
  • 191
Oscar Apeland
  • 6,422
  • 7
  • 44
  • 92
  • 2
    http://stackoverflow.com/questions/25738817/does-there-exist-within-swifts-api-an-easy-way-to-remove-duplicate-elements-fro – dfrib Jan 10 '16 at 18:21
  • First result in google search for "Remove duplicate objects in an array swift" goes to @dfri's link... – Cristik Jan 10 '16 at 18:23
  • 2
    @Cristik The linked reply did not answer my question, as I want to check for duplicate properties in an object, not duplicate objects. – Oscar Apeland Jan 10 '16 at 18:28
  • @OscarApeland please then update the question, nor from the title or the contents cleary results that you need the array filtered by a property, not by the object itself. – Cristik Jan 10 '16 at 18:39
  • @Cristik "way to find duplicate Post ID's" and its obviously what the code tries to accomplish. – Oscar Apeland Jan 10 '16 at 18:44
  • @OscarApeland Have a look at Daniel Krom:s answer in the duplicate link. Just make your `Post` object hashable (implicitly equatable via `id` preoperty) and function `uniq` will be your solution. – dfrib Jan 10 '16 at 18:44
  • Then that should have been your answer, not a close vote. I thought this was a site to help other programmers, not yell at them for not being as good as you at Swift. – Oscar Apeland Jan 10 '16 at 18:51
  • @OscarApeland edited questions go to the reopen queue, that's why I asked you to update also the title to reflect better what you're seeking for. BTW, nobody yelled at you on your Swift skills. – Cristik Jan 10 '16 at 18:56
  • 2
    @OscarApeland That was never my intention, my apologies if you perceived it as such. I simply marked this question (in it's current form) as a duplicate, mostly due to it's title. As Cristik writes, edited questions can possibly be re-opened again. Next time, possibly be more specific, e.g. asking how to use the linked approach (duplicates of "regular" array) to apply to your array of objects, specifically their property `id`. Finally, we're all here to learn and teach each other, never be afraid to asks questions, we also learn from how we ask, how to ask next time. – dfrib Jan 10 '16 at 18:58

16 Answers16

61

I am going to suggest 2 solutions.

Both approaches will need Post to be Hashable and Equatable

Conforming Post to Hashable and Equatable

Here I am assuming your Post struct (or class) has an id property of type String.

struct Post: Hashable, Equatable {
    let id: String
    var hashValue: Int { get { return id.hashValue } }
}

func ==(left:Post, right:Post) -> Bool {
    return left.id == right.id
}

Solution 1 (losing the original order)

To remove duplicated you can use a Set

let uniquePosts = Array(Set(posts))

Solution 2 (preserving the order)

var alreadyThere = Set<Post>()
let uniquePosts = posts.flatMap { (post) -> Post? in
    guard !alreadyThere.contains(post) else { return nil }
    alreadyThere.insert(post)
    return post
}
Luca Angeletti
  • 58,465
  • 13
  • 121
  • 148
  • 2
    Thank you for your answer. Sadly, order matters to me, as I want the posts to show up chronologically. – Oscar Apeland Jan 10 '16 at 18:27
  • @OscarApeland: No problem, I added another section (Solution 2) where the original order is preserved. – Luca Angeletti Jan 10 '16 at 18:29
  • when creating Set, the Set internally check if it contains(post). than you do the same again. I don't see any advantage of this approach. Even though my own answer was down-voted, conformance to Hashable and creating the Set is not necessary and finally less effective. please, check my answer ... – user3441734 Jan 12 '16 at 13:44
  • @user3441734: I didn't downvote your question. I think you are talking about my `Solution 2`. The `alreadyThere.contains` in my code is mandatory because I need to know if the `post` I am evaluating is already present in the array I am building and returning. You cannot remove that part. if you want unique values. – Luca Angeletti Jan 12 '16 at 13:57
  • @user3441734: Furthermore this sentence is wrong `when creating Set, the Set internally check if it contains(post). than you do the same again` because I am building an **empty Set** (Solution 2). – Luca Angeletti Jan 12 '16 at 13:58
  • @appzYourLife i see. internally, because the Set doesn't allow duplicates, it checks the same again, doesn't it? – user3441734 Jan 12 '16 at 15:00
  • @appzYourLife i don't take care about down- voting! I ask you, to be that clear enough for myself. – user3441734 Jan 12 '16 at 15:02
  • @user3441734: I am using the `Set` to check if the `i-th` post is already present in the array I am building. I the `i-th` is present in the `Set alreadyThere` then I discard it, otherwise I add it both to `alreadyThere` and to the array I am building with the `flatMap` function. – Luca Angeletti Jan 12 '16 at 15:03
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/100487/discussion-between-user3441734-and-appzyourlife). – user3441734 Jan 12 '16 at 15:04
  • 1
    Ive been coding in Swift for almost 2 years and I only understand Hashable and Equatable with your explanation :) I thought it was a super function I will not able to understand. thanks – J. Goce Jan 25 '18 at 11:32
  • If anyone is still looking at this in 2021, I don't see where HashValue is used and why it's necessary. Help pls. Still learning. I just saw a warning "'Hashable.hashValue' is deprecated as a protocol requirement; conform type 'Post' to 'Hashable' by implementing 'hash(into:)' instead" was it because it used to be a requirement? – Jayson Jul 20 '21 at 04:03
31

You can create an empty array "uniquePosts", and loop through your array "Posts" to append elements to "uniquePosts" and every time you append you have to check if you already append the element or you didn't. The method "contains" can help you.

func removeDuplicateElements(posts: [Post]) -> [Post] {
    var uniquePosts = [Post]()
    for post in posts {
        if !uniquePosts.contains(where: {$0.postId == post.postId }) {
            uniquePosts.append(post)
        }
    }
    return uniquePosts
}
skymook
  • 3,401
  • 35
  • 39
marouan azizi
  • 387
  • 3
  • 5
  • 3
    Code-only answers are considered low quality: make sure to provide an explanation what your code does and how it solves the problem. – help-info.de Oct 01 '19 at 21:05
7

A generic solution which preserves the original order is:

extension Array {
    func unique(selector:(Element,Element)->Bool) -> Array<Element> {
        return reduce(Array<Element>()){
            if let last = $0.last {
                return selector(last,$1) ? $0 : $0 + [$1]
            } else {
                return [$1]
            }
        }
    }
}

let uniquePosts = posts.unique{$0.id == $1.id }
Mike Neilens
  • 91
  • 2
  • 2
4

My solution on Swift 5:

Add Extension:

extension Array where Element: Hashable {

    func removingDuplicates<T: Hashable>(byKey key: (Element) -> T)  -> [Element] {
         var result = [Element]()
         var seen = Set<T>()
         for value in self {
             if seen.insert(key(value)).inserted {
                 result.append(value)
             }
         }
         return result
     }

}

Class Client, important have the class like Hashable :

struct Client:Hashable {

   let uid :String
   let notifications:Bool

   init(uid:String,dictionary:[String:Any]) {
       self.uid = uid
       self.notifications = dictionary["notificationsStatus"] as? Bool ?? false
   }

   static func == (lhs: Client, rhs: Client) -> Bool {
    return lhs.uid == rhs.uid
   }

}

Use:

arrayClients.removingDuplicates(byKey: { $0.uid })

Have a good day swift lovers ♥️

Danielvgftv
  • 547
  • 7
  • 6
4

There is a good example from this post

Here is an Array extension to return the unique list of objects based on a given key:

extension Array {
    func unique<T:Hashable>(map: ((Element) -> (T)))  -> [Element] {
        var set = Set<T>() //the unique list kept in a Set for fast retrieval
        var arrayOrdered = [Element]() //keeping the unique list of elements but ordered
        for value in self {
            if !set.contains(map(value)) {
                set.insert(map(value))
                arrayOrdered.append(value)
            }
        }

        return arrayOrdered
    }
}

for your example do:

let uniquePosts = posts.unique{$0.id ?? ""}
Max Niagolov
  • 684
  • 9
  • 26
3

my 'pure' Swift solutions without Post conformance to Hashable (required by Set )

struct Post {
    var id: Int
}

let posts = [Post(id: 1),Post(id: 2),Post(id: 1),Post(id: 3),Post(id: 4),Post(id: 2)]

// (1)
var res:[Post] = []
posts.forEach { (p) -> () in
    if !res.contains ({ $0.id == p.id }) {
        res.append(p)
    }
}
print(res) // [Post(id: 1), Post(id: 2), Post(id: 3), Post(id: 4)]

// (2)
let res2 = posts.reduce([]) { (var r, p) -> [Post] in
    if !r.contains ({ $0.id == p.id }) {
        r.append(p)
    }
    return r
}

print(res2) // [Post(id: 1), Post(id: 2), Post(id: 3), Post(id: 4)]

I prefer (1) encapsulated into function (aka func unique(posts:[Post])->[Post] ), maybe an extension Array ....

user3441734
  • 16,722
  • 2
  • 40
  • 59
3

(Updated for Swift 3)

As I mentioned in my comment to the question, you can make use of a modified Daniel Kroms solution in the thread we previously marked this post to be duplicate of. Just make your Post object hashable (implicitly equatable via id property) and implement a modified (using Set rather than Dictionary; the dict value in the linked method is not used anyway) version of Daniel Kroms uniq function as follows:

func uniq<S: Sequence, E: Hashable>(_ source: S) -> [E] where E == S.Iterator.Element {
    var seen = Set<E>()
    return source.filter { seen.update(with: $0) == nil }
}

struct Post : Hashable {
    var id : Int
    var hashValue : Int { return self.id }
}

func == (lhs: Post, rhs: Post) -> Bool {
    return lhs.id == rhs.id
}

var posts : [Post] = [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 1), Post(id: 3), Post(id: 5), Post(id: 7), Post(id: 9)]
print(Posts)
/* [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 1), Post(id: 3), Post(id: 5), Post(id: 7), Post(id: 9)] */


var myUniquePosts = uniq(posts)
print(myUniquePosts)
/* [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 3), Post(id: 5), Post(id: 9)] */

This will remove duplicates while maintaining the order of the original array.


Helper function uniq as a Sequence extension

Alternatively to using a free function, we could implement uniq as a constrained Sequence extension:

extension Sequence where Iterator.Element: Hashable {
    func uniq() -> [Iterator.Element] {
        var seen = Set<Iterator.Element>()
        return filter { seen.update(with: $0) == nil }
    }
}

struct Post : Hashable {
    var id : Int
    var hashValue : Int { return self.id }
}

func == (lhs: Post, rhs: Post) -> Bool {
    return lhs.id == rhs.id
}

var posts : [Post] = [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 1), Post(id: 3), Post(id: 5), Post(id: 7), Post(id: 9)]
print(posts)
/* [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 1), Post(id: 3), Post(id: 5), Post(id: 7), Post(id: 9)] */


var myUniquePosts = posts.uniq()
print(myUniquePosts)
/* [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 3), Post(id: 5), Post(id: 9)] */
dfrib
  • 70,367
  • 12
  • 127
  • 192
3

Preserving order, without adding extra state:

func removeDuplicates<T: Equatable>(accumulator: [T], element: T) -> [T] {
    return accumulator.contains(element) ?
        accumulator :
        accumulator + [element]
}

posts.reduce([], removeDuplicates)
mbdavis
  • 3,861
  • 2
  • 22
  • 42
3

This works for multidimensional arrays as well:

for (index, element) in arr.enumerated().reversed() {
    if arr.filter({ $0 == element}).count > 1 {
        arr.remove(at: index)
    }
}
IKavanagh
  • 6,089
  • 11
  • 42
  • 47
vitas168
  • 31
  • 2
2

In swift 3 refer below code:

let filterSet = NSSet(array: orignalArray as NSArray as! [NSObject])
let filterArray = filterSet.allObjects as NSArray  //NSArray
 print("Filter Array:\(filterArray)")
Richard Telford
  • 9,558
  • 6
  • 38
  • 51
Kiran K
  • 919
  • 10
  • 17
2

based on Danielvgftv's answer, we can rewrite it by leveraging KeyPaths, as follows:

extension Sequence {
    func removingDuplicates<T: Hashable>(withSame keyPath: KeyPath<Element, T>) -> [Element] {
        var seen = Set<T>()
        return filter { element in
            guard seen.insert(element[keyPath: keyPath]).inserted else { return false }
            return true
        }
    }
}

Usage:

struct Car {
    let id: UUID = UUID()
    let manufacturer: String
    // ... other vars
}

let cars: [Car] = [
    Car(manufacturer: "Toyota"),
    Car(manufacturer: "Tesla"),
    Car(manufacturer: "Toyota"),
]

print(cars.removingDuplicates(withSame: \.manufacturer)) // [Car(manufacturer: "Toyota"), Car(manufacturer: "Tesla")]
1

use a Set

To use it, make your Post hashable and implement the == operator

import Foundation

class Post: Hashable, Equatable {
    let id:UInt
    let title:String
    let date:NSDate
    var hashValue: Int { get{
            return Int(self.id)
        }
    }

    init(id:UInt, title:String, date:NSDate){
        self.id = id
        self.title = title
        self.date = date

    }

}
func ==(lhs: Post, rhs: Post) -> Bool {
    return lhs.id == rhs.id
}



let posts = [Post(id: 11, title: "sadf", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 1; c.month = 1; c.year = 2016; return c}())!),
             Post(id: 33, title: "sdfr", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 3; c.month = 1; c.year = 2016; return c}())!),
             Post(id: 22, title: "sdfr", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 1; c.month = 12; c.year = 2015; return c}())!),
             Post(id: 22, title: "sdfr", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 1; c.month = 12; c.year = 2015; return c}())!)]

Create set from array with duplicates

let postsSet = Set(posts)

This is unordered, create a new array, apply order.

let uniquePosts = Array(postsSet).sort { (p1, p2) -> Bool in
    return p1.date.timeIntervalSince1970 < p2.date.timeIntervalSince1970
}

Instead of making your Post model hashable, you could also use a wrapper class. This wrapper class would use the post objects property to calculate the hash and equality.
this wrapper could be configurable through closure:

class HashableWrapper<T>: Hashable {
    let object: T
    let equal: (obj1: T,obj2: T) -> Bool
    let hash: (obj: T) -> Int

    var hashValue:Int {
        get {
            return self.hash(obj: self.object)
        }
    }
    init(obj: T, equal:(obj1: T, obj2: T) -> Bool, hash: (obj: T) -> Int) {
        self.object = obj
        self.equal = equal
        self.hash = hash
    }

}

func ==<T>(lhs:HashableWrapper<T>, rhs:HashableWrapper<T>) -> Bool
{
    return lhs.equal(obj1: lhs.object,obj2: rhs.object)
}

The Post could be simply

class Post {
    let id:UInt
    let title:String
    let date:NSDate

    init(id:UInt, title:String, date:NSDate){
        self.id = id
        self.title = title
        self.date = date
    }
}

Let's create some post as before

let posts = [
    Post(id: 3, title: "sadf", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 1; c.month = 1; c.year = 2016; return c}())!),
    Post(id: 1, title: "sdfr", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 3; c.month = 1; c.year = 2016; return c}())!),
    Post(id: 2, title: "sdfr", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 1; c.month = 12; c.year = 2015; return c}())!),
    Post(id: 2, title: "sdfr", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 1; c.month = 12; c.year = 2015; return c}())!),
    Post(id: 1, title: "sdfr", date: NSCalendar.currentCalendar().dateFromComponents({let c = NSDateComponents(); c.day = 3; c.month = 1; c.year = 2016; return c}())!)
]

Now we create wrapper objects for every post with closure to determine equality and the hash. And we create the set.

let wrappers = posts.map { (p) -> HashableWrapper<Post> in
    return HashableWrapper<Post>(obj: p, equal: { (obj1, obj2) -> Bool in
            return obj1.id == obj2.id
        }, hash: { (obj) -> Int in
            return Int(obj.id)
    })
}

let s = Set(wrappers)

Now we extract the wrapped objects and sort it by date.

let objects = s.map { (w) -> Post in
    return w.object
}.sort { (p1, p2) -> Bool in
    return p1.date.timeIntervalSince1970 > p2.date.timeIntervalSince1970
}

and

print(objects.map{$0.id})

prints

[1, 3, 2]
vikingosegundo
  • 52,040
  • 14
  • 137
  • 178
  • Thank you for the massive effort. I chose to accept the other answer because your solution used Date to sort, and my post objects didn't have a date property initially. (Stupid API..). I wish I could accept yours too. – Oscar Apeland Jan 11 '16 at 00:07
  • That is the point in my last example: by using a configurable equality you could also check the index in the array. Also the wrapper object could have a property to store the index in the array. – vikingosegundo Jan 11 '16 at 00:10
1

This is a less-specialized question than the more popular variant found here: https://stackoverflow.com/a/33553374/652038

Use that answer of mine, and you can do this:

posts.firstUniqueElements(\.id)
0

Instead of using a hashable object, you could just use a set. Take an attribute value that you want to remove duplicates for and use that as your test value. In my example, I am checking for duplicate ISBN values.

do {
    try fetchRequestController.performFetch()
    print(fetchRequestController.fetchedObjects?.count)
    var set = Set<String>()
    for entry in fetchRequestController.fetchedObjects! {
        if set.contains(entry.isbn!){
            fetchRequestController.managedObjectContext.delete(entry)
        }else {
            set.insert(entry.isbn!)
        }
    }
    try fetchRequestController.performFetch()
    print(fetchRequestController.fetchedObjects?.count) 
    } catch {
    fatalError()
}
0

Swift 3.1 Most Elegant Solution (Thanx dfri)

Apple Swift version 3.1 (swiftlang-802.0.51 clang-802.0.41)

func uniq<S: Sequence, E: Hashable>(source: S) -> [E] where E==S.Iterator.Element {
    var seen: [E:Bool] = [:]
    return source.filter({ (v) -> Bool in
        return seen.updateValue(true, forKey: v) == nil
    })
}

struct Post : Hashable {
    var id : Int
    var hashValue : Int { return self.id }
}

func == (lhs: Post, rhs: Post) -> Bool {
    return lhs.id == rhs.id
}

var Posts : [Post] = [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 1), Post(id: 3), Post(id: 5), Post(id: 7), Post(id: 9)]
print(Posts)
/* [Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 1), Post(id: 3), Post(id: 5), Post(id: 7), Post(id: 9)] */


var myUniquePosts = uniq(source: Posts)

print(myUniquePosts)
/*[Post(id: 1), Post(id: 7), Post(id: 2), Post(id: 3), Post(id: 5), Post(id: 9)]*/
Community
  • 1
  • 1
maslovsa
  • 1,589
  • 1
  • 14
  • 15
  • 1
    Consider adding a comment to [the original answer](http://stackoverflow.com/a/34710228/4573247) next time (prompting for a Swift X.X update) _prior_ to copying and re-posting the answer with only a minor update. In case the original answer's author doesn't answer the prompt for an update (and you do not have sufficient rights to update the answer yourself), at that time it could be appropriate to duplicate the answer with an update for a newer language version. As this Q&A currently stands, we now have two exactly duplicated answers, as I've now updated my original answer to Swift 3. – dfrib Apr 15 '17 at 09:30
0
struct Post {
    var id: Int
}

extension Post: Hashable {
    var hashValue: Int {
        return id
    }

    static func == (lhs: Post, rhs: Post) -> Bool {
        return lhs.id == rhs.id
    }
}

and additional extension

public extension Sequence {
    func distinct<E: Hashable>() -> [E] where E == Iterator.Element {
        return Array(Set(self))
    }
}
GSerjo
  • 4,725
  • 1
  • 36
  • 55