The implementation in question can be found here. The algorithm goes something like this, given that we want to check if two sets
v
and w
are equal:
- Check if their sizes are equal. If they are not, return
false
.
This references the size
attribute of the PySetObject
, so it's a constant time operation.
- Check if both
sets
are frozensets
. If so, and their hashes are different, return false
.
Again, this references the hash
attribute of the PySetObject
, which is a constant time operation. This is possible because frozensets
are immutable objects and hence have their hashes calculated when they are created.
- Check if
w
is a subset of v
.
This is done by iterating through each element of v
and checking if it is present in w
.
The iteration must be performed over all of v
, and is therefore at worst linear in its size (if a differing element is found in the last position).
Membership testing for sets
in general takes constant time in the average case and linear time in the worst case; combining the two gives time linear in the size of v
in the average case and time proportional to len(v) * len(w)
in the worst case.
This is, in a sense, the average case of the worst case, since it assumes that the first two checks always return true
. If your data only rarely tends to have equal-size sets
which also have the same hashes but contain different objects, then in the average case the comparison will still run in constant time.