problems with set on swift language

Question

I have created a class that adopt Hashable protocol.

So I created some instances of this class with different properties and added them to the Set.

Then I change a property of an object.

After this change the Set sometimes fail .contains (and also .remove). In the debugger inspector I see that the object has the same memory address of the element inside the Set. So why fail at random? Note that I can always found the index of the element inside the set.

I tested with playground (on xcode10) several times and on every execution the results change.

class Test: Hashable {
    // MARK: Equatable protocol
    static func == (lhs: Test, rhs: Test) -> Bool {
        return lhs === rhs || lhs.hashValue == rhs.hashValue
    }

    var s: String
    func hash(into hasher: inout Hasher) {
        hasher.combine(s.hashValue)

    }
    init(s: String) {
        self.s = s
    }
}

func test(s: Set<Test>, u: Test) -> Bool {
    if s.contains(u) {
        print("OK")
        return true
    } else {
        print("FAIL") // SOMETIMES FAIL
        if !s.contains(u) {
            if let _ = s.firstIndex(where: { $0 == u }) {
                print("- OK index") // ALWAYS OK
                return true
            } else {
                print("- FAIL index") // NEVER FAIL
                return false
            }
        } else {
            return true
        }
    }
}

var s: Set<Test> = []
let u1 = Test(s: "a")
s.insert(u1)
let u2 = Test(s: "b")
s.insert(u2)
test(s: s, u: u2)

u2.s = "c"
test(s: s, u: u2)
u2.s = "d"
test(s: s, u: u2)
u2.s = "b"
test(s: s, u: u2)

Why have you used a nested `if !s.contains(u)` inside the `else` clause? Can you please explain what are you trying to achieve/ — PGDev, Oct 24 '18 at 12:38
sorry, the nested if is a mistake but it does not affect the problem — user1409904, Oct 24 '18 at 14:24
Probably duplicate, mutatis mutandis, of https://stackoverflow.com/questions/28461675/why-dont-nsset-nsmutableset-nscountedset-force-immutable-objects-as-entries — matt, Oct 24 '18 at 15:42
A workaround could be: Store the hash in a private property `private var _hash:Int?`, and in `hashValue` it only once: `var hashValue: Int { if (_hash == nil) { _hash = s.hashValue }; return _hash! }` (to be threadsafe, use GCD `once` stuff) — Andreas Oetjen, Oct 26 '18 at 05:59

Sweeper · Answer 1 · 2018-10-24T13:00:02.930

2

Things in a set should be immutable.

You should never have put Test objects into a set because Test is completely mutable. This is exactly why you get these "strange and random" behaviour.

When you call contains, the set (or rather, the underlying hash table) evaluates the hash code of the parameter and see if the hash code matches any of the hash codes in the set. (Note that this is an oversimplification and it makes it sounds like this is a O(n) operation. It's not.)

Before you change u2, it has a hash code of x. The set remembers that u2 has a hash code of x. Now you change u2. It now has a different hash code y. The set thus can't find an element in it that has a hash code of y.

So basically, you should make sure that everything you put into a set has a constant hash code.

You can make Test immutable by doing:

let s: String

If you want to learn more you can look up how the set data structure is implemented in Swift. I found this post which might help as well.

edited Oct 24 '18 at 13:00

answered Oct 24 '18 at 12:53

Sweeper

213,210
22
193
313

Your answer might be correct (at least, for the Java part, it is correct), but I would definetly expect the NS(Set) documentation to contain a hint about this behavior... At least, I would expect the behavior not to be indeterministic. – Andreas Oetjen Oct 24 '18 at 13:07
@AndreasOetjen I looked at the Swift `Set` documentation and couldn't find anything. And if you look at [`NSSet.contains`](https://developer.apple.com/documentation/foundation/nsset/1414555-contains), it even suggests that the operation is O(n). Apple documentation is just not good enough... – Sweeper Oct 24 '18 at 13:11
Interestingly, it seems to rely on the seed of the `Hasher` somehow: If you run the whole thing in a loop, let's say 100 times (e.g. 100x create a new Set, add new Objects, modify them, test them), then one program run you get 100x "OK", and the next run, you get 100x "FAIL" (The `Hasher` changes it's seed each time the program runs), but never a mixture between OK and FAIL. – Andreas Oetjen Oct 24 '18 at 13:21
Ah OK. I thought that the objects being managed by reference were synchronized in the Set. Evidently, in order to optimize the code, somehow Set store in a cache the hash of its elements without constantly updating it. The strange thing is that sometimes the method works and sometimes not. In my tests it happened that in the same execution some tests failed and others did not... – user1409904 Oct 24 '18 at 14:01
@Sweeper Found documentation that shows you're absolutely spot on. – matt Oct 24 '18 at 15:50

matt · Accepted Answer · 2018-10-24T15:47:24.960

From the docs on NSSet:

If mutable objects are stored in a set, either the hash method of the objects shouldn’t depend on the internal state of the mutable objects or the mutable objects shouldn’t be modified while they’re in the set.

I think that exactly covers this case. True, it's about Cocoa NSSet, but I would expect Swift Set to correspond to NSSet in this regard.

Just for the record, I was able to reproduce the behavior you describe, eliminating some of the skankier or more questionable code - not in a playground, with a legal implementation of == and using hashValue, and without the unnecessary call to a test function:

class Test: Equatable, Hashable, CustomStringConvertible {
    var s: String
    static func == (lhs: Test, rhs: Test) -> Bool {
        return lhs.s == rhs.s
    }
    var hashValue: Int {
        return s.hashValue
    }
    init(s: String) {
        self.s = s
    }
    var description: String {
        return "Test(\(s))"
    }
}

class ViewController: UIViewController {

    override func viewDidLoad() {
        super.viewDidLoad()

        var s: Set<Test> = []
        let u1 = Test(s: "a")
        s.insert(u1)
        let u2 = Test(s: "b")
        s.insert(u2)

        print(s.contains(u2))
        u2.s = "c"
        print(s.contains(u2))
    }
}

To test it, run the project over and over. Sometimes you'll get true true and sometimes you'll get true false. The inconsistency is the indication that you shouldn't be mutating an object in a set.

I try to implement hashValue instead of hash method returning the hash of s property (so don't use Haser) but don' teseolve `class Test: Hashable { static func == (lhs: Test, rhs: Test) -> Bool { return lhs.hashValue == rhs.hashValue } var hashValue: Int { return s.hashValue } var s: String init(s: String) { self.s = s } }` — user1409904, Oct 24 '18 at 14:29
OK, sorry. I found documentation that shows Sweeper's answer is right. — matt, Oct 24 '18 at 15:41

score -1 · Answer 3 · answered Oct 24 '18 at 12:47

Since you are using the s property of Test class to create the hash values, try comparing s values instead of comparing the objects, i.e.

static func == (lhs: Test, rhs: Test) -> Bool
{
    return lhs.s == rhs.s
}

This will resolve your issue. Also, as I mentioned in the comment, there is no need to use an extra if-else in the failure case. You can simply use the below code:

func test(s: Set<Test>, u: Test) -> Bool
{
    if s.contains(u)
    {
        print("OK")
        return true
    }
    else
    {
        print("FAIL: \(u.s)") // SOMETIMES FAIL
        return false
    }
}

no, testing only the value of s don't resolve the problem. – user1409904 Oct 24 '18 at 14:25 — user1409904, Oct 24 '18 at 14:25

problems with set on swift language

3 Answers3