13

How to test if a python Counter is contained in another one using the following definition:

A Counter a is contained in a Counter b if, and only if, for every key k in a, the value a[k] is less or equal to the value b[k]. The Counter({'a': 1, 'b': 1}) is contained in Counter({'a': 2, 'b': 2}) but it is not contained in Counter({'a': 2, 'c': 2}).

I think it is a poor design choice but in python 2.x the comparison operators (<, <=, >=, >) do not use the previous definition, so the third Counter is considered greater-than the first. In python 3.x, instead, Counter is an unorderable type.

enrico.bacis
  • 30,497
  • 10
  • 86
  • 115
  • 2
    You should properly define "contained" to avoid confusion. – Shashank Apr 11 '15 at 08:26
  • `Counter` doesn't actually support comparison operators. – user2357112 Apr 11 '15 at 08:30
  • 1
    @JimDennis: We're supposed to be considering the Counter as a multiset, and that attempt doesn't take into account the multiplicity of elements. – user2357112 Apr 12 '15 at 08:10
  • @JimDennis: No, I don't want to check if all the keys are present, I want to check also the multiplicity as user2347112 said: *A Counter a is contained in a Counter b if, and only if, for every key k in a, the value a[k] is less or equal to the value b[k].* – enrico.bacis Apr 12 '15 at 08:27

5 Answers5

16

Update 2023: Counter supports rich comparison operators as of python 3.10, so this works:

container <= contained

Historical answer for python < 3.10:

The best I came up with is to convert the definition i gave in code:

def contains(container, contained):
    return all(container[x] >= contained[x] for x in contained)

But if feels strange that python don't have an out-of-the-box solution and I have to write a function for every operator (or make a generic one and pass the comparison function).

enrico.bacis
  • 30,497
  • 10
  • 86
  • 115
  • 3
    Agreed, given that `Counter` gives a couple of multiset-like operations, it's surprising it doesn't offer a `subset of` equivalent. I'd expect to just be able to do `counter_a <= counter_b` - Either way, this seems like the best solution possible to me. – Gareth Latty Apr 11 '15 at 08:39
15

While Counter instances are not comparable with the < and > operators, you can find their difference with the - operator. The difference never returns negative counts, so if A - B is empty, you know that B contains all the items in A.

def contains(larger, smaller):
    return not smaller - larger
Blckknght
  • 100,903
  • 11
  • 120
  • 169
  • 4
    I considered giving this answer, but I decided against it because this doesn't short-circuit and wastes a bunch of time and space building the Counter. Also, it has to iterate through *both* Counters due to how `-` handles negative counts in the second argument. The solution based on `all` is much more efficient. – user2357112 Apr 12 '15 at 07:51
1

For all the keys in smaller Counter make sure that no value is greater than its counterpart in the bigger Counter:

def containment(big, small):
    return not any(v > big[k] for (k, v) in small.iteritems())

>>> containment(Counter({'a': 2, 'b': 2}), Counter({'a': 1, 'b': 1}))
True
>>> containment(Counter({'a': 2, 'c': 2, 'b': 3}), Counter({'a': 2, 'b': 2}))
True
>>> print containment(Counter({'a': 2, 'b': 2}), Counter({'a': 2, 'b': 2, 'c':1}))
False
>>> print containment(Counter({'a': 2, 'c': 2}), Counter({'a': 1, 'b': 1})
False
Saksham Varma
  • 2,122
  • 13
  • 15
  • 1
    Which version of python are you using? In Python 2 it returns `True` for both the invocations, while in python 3 it throws an error because Counter is unorderable. Moreover you don't really need a function for that. – enrico.bacis Apr 12 '15 at 08:24
  • @enrico.bacis I copied it wrong before. It works fine on Py 2.7 I'm on. Thanks for catching that! – Saksham Varma Apr 12 '15 at 08:27
  • As I said in the question, the comparison operator only check the total number of elements in the counter. Try with my example: `Counter({'a': 2, 'b': 2}) >= Counter({'a': 1, 'b': 1})` gives `True` as expected, but `Counter({'a': 2, 'c': 2}) >= Counter({'a': 1, 'b': 1})` also gives `True` even if the right hand side is not *contained* in the left hand side. – enrico.bacis Apr 12 '15 at 08:32
  • @enrico.bacis I've added another check get around the case you mentioned. – Saksham Varma Apr 12 '15 at 08:42
  • Now it works but keep in mind that *not any( predicate )* is exactly the same as *all( not predicate )*, so the answer is semantically the same as my answer above ;) – enrico.bacis Apr 12 '15 at 09:29
  • @enrico.bacis True :D – Saksham Varma Apr 12 '15 at 09:39
  • Another couple of suggestions, you are using `.items()` to get also the value `v` while you should prefer `.iteritems()` which does not create an intermediate list but a generator, since you are not storing the values. You are also not using `v`. To be more efficient, you should use `not any(v > big[k] for (k, v) in small.iteritems())` or `not any(small[k] > big[k] for k in small)`. – enrico.bacis Apr 12 '15 at 09:45
1

Another, fairly succinct, way to express this:

"Counter A is a subset of Counter B" is equivalent to (A & B) == A.

That's because the intersection (&) of two Counters has the counts of elements common to both. That'll be the same as A if every element of A (counting multiplicity) is also in B; otherwise it will be smaller.

Performance-wise, this seems to be about the same as the not A - B method proposed by Blckknght. Checking each key as in the answer of enrico.bacis is considerably faster.

As a variation, you can also check that the union is equal to the larger Counter (so nothing was added): (A | B) == B. This is noticeably slower for some largish multisets I tested (1,000,000 elements).

Nick Matteo
  • 4,453
  • 1
  • 24
  • 35
0

While of historical interest, all these answers are obsolete. Counter class objects are in fact comparable enter image description here

user2872645
  • 21
  • 1
  • 3
  • Thanks for the answer! I would write the actual code into your answer though instead of pasting a screenshot, otherwise it forces people to have to hand-type everything you've done here (versus just copy and pasting your code). – Shawn Hemelstrand Feb 26 '23 at 03:26