Python: How can one verify that dict1 is only a subset of dict2? Values are all int and within scope

Question

I'm trying to write efficient code that can tell whether one dict is a subset of another. Both dicts have string keys and int values. For dict1 to be considered a subset, it cannot contain any keys that dict2 lacks, and each of its values must be less than or equal to the value for the same key in dict2.

This almost worked: `test_dict.items() <= test_dict2.items()`, until I tested it with these dicts:

test_dict = {
    'a':1,
    'c':2
}

test_dict2 = {
    'a':1,
    'b':2,
    'c':3
}

test_dict.items() <= test_dict2.items()

False

I did get something working, but I don't know how efficient it really is:

def test(request_totals, mongo_totals, max_limit=100):
    outdated = dict()
    
    sharedKeys = set(request_totals.keys()).intersection(mongo_totals.keys())
    unsharedKeys = set(request_totals) - set(mongo_totals)
    
    # Verifies MongoDB contains no unique collections
    if set(mongo_totals) - set(request_totals) != set():
        raise AttributeError(f'''mongo_totals does not appear to be a subset of request_totals. 
                            Found: {set(mongo_totals) - set(request_totals)}''')
    
    # Updates outdated dict with outdated key-value pairs representing MongoDB collections
    for key in sharedKeys:
        if request_totals[key] > mongo_totals[key]:
            outdated.update({key : range(mongo_totals[key], request_totals[key])})
        elif request_totals[key] < mongo_totals[key]:
            raise AttributeError(
                f'mongo_total for {key}: {mongo_totals[key]} exceeds request_totals for {key}: {request_totals[key]}')
    
    return outdated

test(request_totals, mongo_totals)

It seems like a lot to do my comparison before creating an object that manages updates. Is there a better way to do this?

Some possible solutions [here](https://stackoverflow.com/questions/9323749/how-to-check-if-one-dictionary-is-a-subset-of-another-larger-dictionary) that may or may not meet your exact needs. — sj95126, Aug 05 '22 at 20:42

score 2 · Answer 1 · answered Aug 05 '22 at 21:07

all(test_dict2.get(k, v-1) >= v
    for k, v in test_dict.items())
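
The one-liner above relies on the default passed to `get`: for a key missing from `test_dict2`, the fallback `v - 1` can never satisfy `>= v`, so missing keys fail without a separate membership test. A minimal sketch wrapping it in a function (the name `is_bounded_subset` is hypothetical, chosen only for illustration):

```python
def is_bounded_subset(small, big):
    """Return True if every key of `small` is in `big` with a value >= small's."""
    # For a missing key, .get falls back to v - 1, which is never >= v,
    # so the check fails for keys that exist only in `small`.
    return all(big.get(k, v - 1) >= v for k, v in small.items())


test_dict = {'a': 1, 'c': 2}
test_dict2 = {'a': 1, 'b': 2, 'c': 3}

print(is_bounded_subset(test_dict, test_dict2))  # True
print(is_bounded_subset(test_dict2, test_dict))  # False: 'b' exists only in test_dict2
print(is_bounded_subset({'a': 2}, test_dict2))   # False: 2 > 1
```

Because `all` short-circuits, the loop stops at the first offending key.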

answered Aug 05 '22 at 21:07

Kelly Bundy


Daniel Hao · Accepted Answer · 2022-08-05T21:06:55.563

0

You could try `collections.Counter`; it's efficient and clear. Note: `Counter` itself is not new, but its rich comparison operators (such as `<=`) were added in Python 3.10.

`Counter` is a dict subclass for counting hashable objects.


from collections import Counter
cd1 = Counter(test_dict)
cd2 = Counter(test_dict2)
print(cd1 <= cd2)
# True

# Another example:
cd3 = Counter({'a': 2, 'b': 2, 'c': 3})
print(cd3 <= cd2)
# False
print(cd2 <= cd3)
# True
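
One caveat: `Counter` comparison treats a missing key as a count of zero, so a key with value 0 (or a negative value) that appears only in the first dict would still pass. If that matters for your data, here is a minimal sketch that checks keys explicitly and runs on older Python versions too (the name `is_subset_strict` is hypothetical):

```python
def is_subset_strict(small, big):
    """True if `small` has no keys outside `big` and no value above big's."""
    # dict.keys() views support set comparison, so this rejects unique keys.
    if not small.keys() <= big.keys():
        return False
    # Every key is now known to exist in `big`, so direct indexing is safe.
    return all(small[k] <= big[k] for k in small)


print(is_subset_strict({'a': 1, 'c': 2}, {'a': 1, 'b': 2, 'c': 3}))  # True
print(is_subset_strict({'x': 0}, {'a': 1}))                          # False: 'x' is a unique key
```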

edited Aug 05 '22 at 21:06

answered Aug 05 '22 at 20:44

Daniel Hao


Thanks for your reply! What version of python are you using? For me, this returns `TypeError: '<=' not supported between instances of 'Counter' and 'Counter'` I'm on 3.9 – MrChadMWood Aug 05 '22 at 20:51
I would have loved to upgrade, but I was reading there's no support for BeautifulSoup just yet. Now that I think about it though, I didn't check if BS4 was available... whoops. Have you noticed any good reason to stay at 3.9 for production? – MrChadMWood Aug 05 '22 at 20:59
Py 3.10 has been around for about a year (since Oct 2021), so there's no reason not to upgrade, though some places or projects might be slower to move. Does this post help you? – Daniel Hao Aug 05 '22 at 21:05
Good point. I was going to wait until 3.11 official release and just stay 1 version behind, but maybe I'll go ahead and upgrade over the weekend. Thanks for answering my questions! – MrChadMWood Aug 05 '22 at 21:06
Yes, I'm accepting this as the answer. Your solution appears to be valid as long as I'm utilizing current software, and I did not request code be compatible with older versions. So it makes sense to me that I should accept your answer. I'll upgrade as well. Thanks again. – MrChadMWood Aug 05 '22 at 21:09
They didn't downvote mine, so maybe it's about efficiency? (I suspect yours is faster if it is a subset but mine might be much faster if it isn't). Btw in <3.10, you could subtract and check if something is left over. – Kelly Bundy Aug 06 '22 at 11:13
Thanks for the feedback. I will check the *timing* if I get some time this weekend... But again, *premature optimization is evil...* The first concern should be making it work, right? – Daniel Hao Aug 06 '22 at 11:16
Yeah, I don't really see it as an issue, I just don't see any other issue with it, either. – Kelly Bundy Aug 06 '22 at 11:18
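
The subtraction idea mentioned in the comments for Python versions before 3.10 could look like the sketch below. It assumes, as in the question, that values are non-negative ints; the name `counter_subset` is hypothetical. `Counter` subtraction keeps only positive results, so an empty result means no value in the first dict exceeds its counterpart.

```python
from collections import Counter


def counter_subset(small, big):
    """True if subtracting `big` from `small` leaves no positive counts."""
    # Counter.__sub__ drops zero and negative results, so anything left over
    # is a key whose count in `small` exceeds its count in `big`.
    return not (Counter(small) - Counter(big))


test_dict = {'a': 1, 'c': 2}
test_dict2 = {'a': 1, 'b': 2, 'c': 3}

print(counter_subset(test_dict, test_dict2))                   # True
print(counter_subset({'a': 2, 'b': 2, 'c': 3}, test_dict2))    # False: 'a' exceeds by 1
```

Unlike the `all(...)` approach, this builds two `Counter` objects and the difference before answering, so it cannot stop early at the first mismatch.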
