2

The frozenset docs says:

The frozenset type is immutable and hashable — its contents cannot be altered after it is created; it can therefore be used as a dictionary key or as an element of another set.

However, the docs for for python sets says:

Since sets only define partial ordering (subset relationships), the output of the list.sort() method is undefined for lists of sets.

This makes me ask: why is the case? And, if I wanted to sort a list of sets by set content, how could I do this? I know that the extension intbitset: https://pypi.python.org/pypi/intbitset/2.3.0 , has a function for returning a bit sequence that represents the set contents. Is there something comparable for python sets?

Travis Black
  • 705
  • 7
  • 18
  • Please provide a code example of what you can't do and then we can provide a code example of how to do it. – Callam Delaney Mar 04 '18 at 03:06
  • Okay, so I understand this request in some cases. But what in the world do I need a code example for here. What is confusing about this question that a code example would clear up? I was pretty specific. – Travis Black Mar 04 '18 at 03:16
  • 1
    Possible duplicate of [How does Python sort a list of sets?](https://stackoverflow.com/questions/42956966/how-does-python-sort-a-list-of-sets) – Brad Solomon Mar 04 '18 at 03:29

2 Answers2

3

Tuples, lists, strings, etc. have a natural lexicographic ordering and can be sorted because you can always compare two elements of a given collection. That is, either a < b, b < a, or a == b.

A natural comparison between two sets is having a <= b mean a is a subset of b, which is what the expression a <= b actually does in Python. What the documentation means by "partial ordering" is that not all sets are comparable. Take, for example, the following sets:

a = {1, 2, 3}
b = {4, 5, 6}

Is a a subset of b? No. Is b a subset of a? No. Are they equal? No. If you can't compare them at all, you clearly can't sort them.

The only way you can sort a collection of sets is if your comparison function actually can compare any two elements (a total order). This means you can still sort a collection of sets using the above subset relation, but you will have to ensure that all of the sets are comparable (e.g. [{1}, {1, 2, 4}, {1, 2}]).

The easiest way to do what you want is to transform each individual set into something that you actually can compare. Basically, you do f(a) <= f(b) (where <= is obvious) for some simple function f. This is done with the key keyword argument:

In [10]: def f(some_set):
   ...       return max(some_set)
   ...

In [11]: sorted([{1, 2, 3, 999}, {4, 5, 6}, {7, 8, 9}], key=f)
Out[11]: [{4, 5, 6}, {7, 8, 9}, {1, 2, 3, 999}]

You're sorting [f(set1), f(set2), f(set3)] and applying the resulting ordering to [set1, set2, set3].

Blender
  • 289,723
  • 53
  • 439
  • 496
3

Take an example: say you wanted to sort a list of sets by the "first element" of each set. The issue is that Python sets or frozensets don't have a "first element." They have no sense of their own ordering. A set is an unordered collection with no duplicate elements.

Furthermore, list.sort() sorts the list in place, using only the < operator between items.

If you just use a.sort() without passing any key parameter, saying set_a < set_b (or set_a.__lt__(set_b)) is insufficient. By insufficient, I mean that set_a.__lt__(set_b) is a subset operator. (Is a a subset of b?). As mentioned by @Blender and referenced in your question, this provides for partial rather than total ordering, which is insufficient for defining what ever sequence holds the sets.

From the docs:

set < other: Test whether the set is a proper subset of other, that is, set <= other and set != other.

You could pass a key to sort(), it just couldn't refer to anything to do with the "ordering" of the sets internally, because remember--there is none.

>>> a = {2, 3, 1}
>>> b = {6, 9, 0, 1}
>>> c = {0}
>>> i = [b, a, c]
>>> i.sort(key=len)
>>> i
[{0}, {1, 2, 3}, {0, 9, 6, 1}]
Brad Solomon
  • 38,521
  • 31
  • 149
  • 235