Depending on the size of your dictionary, and the length of your key, neither checking all the keys in the dict nor enumerating all the subsets and checking those is a good solution. Instead, you could restructure your "flat" dictionary to something like a Trie, or Prefix Tree. Here, each element in the set will point to another branch of the tree and/or an actual value:
dictionary = {
frozenset([1]): 1,
frozenset([2]): 2,
frozenset([3]): 3,
frozenset([3, 4]): 34
}
def totree(d):
tree = {}
for key in d:
t = tree
for x in sorted(key):
t = t.setdefault(x, {})
t["value"] = d[key]
return tree
tree = totree(dictionary)
# {1: {'value': 1}, 2: {'value': 2}, 3: {'value': 3, 4: {'value': 34}}}
Now, you can recursively check those trees and yield
each key that has a value. Other than enumerating all the subsets, this will only expand those branches where all the elements so far are in the tree.
def check_subsets(tree, key, prefix=[]):
if "value" in tree:
yield prefix, tree["value"]
for i, x in enumerate(key):
if x in tree:
yield from check_subsets(tree[x], key[i+1:], prefix+[x])
biglist= [3, 4, 5]
res = list(check_subsets(tree, sorted(biglist)))
# [([3], 3), ([3, 4], 34)]
Note that it's important that both the keys in the tree and the key for lookup are added/checked in sorted order, otherwise relevant subtrees could be missed.
Addendum 1: This should be clear, but just to make sure: Of course this approach will not help if you construct the tree anew for each lookup, otherwise you can just as well do a linear scan of all the keys. Instead, you'd have to create the tree once, and then reuse it for multiple lookups, and possibly update it with new elements added to the set.
Addendum 2: Instead of "value"
, you can of course use any key for the actual value at that node in the prefix tree. You could use None
, or a very long string or large random number guaranteed not to be an element in any of your key-sets. You could, with a few adaptions to the totree
and check_subtree
functions, also define a custom Tree
class...
class Tree:
def __init__(self, value=None, children=None):
self.value = value
self.children = children or {}
def __repr__(self):
return "Tree(%r, %r)" % (self.value, self.children)
... but IMHO just using nested dictionaries with some special value keys is simpler.