I am trying to implement a kind of sorted dictionary as a Binary Search Tree. The idea is that no matter what operation I do on this new dictionary-like object, the elements of the dictionary are always sorted with respect to the keys. My implementation is not complete - there are some issues of performance that I would like to fix before completing the code.
The idea is to create two classes, Node
and SortedDict()
.
The class Node
has the __getitem__
and __setitem__
methods to insert and get elements in the tree (for the moment I am not implementing a delete operation).
The class SortedDict()
takes a sorted dictionary (for the moment I am not implementing cases where the argument is not a sorted dictionary), and it starts inserting the elements of the dictionary starting from the leaves of the tree. First, it processes the left child until it reaches the median of the keys of the dictionary. Then, it processes the right child, and then it glues the two children trees to the root. The root is the median of the sorted dictionary' keys.
If the dictionary is very large, I get a RecursionError: maximum recursion depth exceeded in comparison
error if I try to access an element that is very far from the root. Why do I get this error? How can I avoid it?
The traceback is:
Traceback (most recent call last):
File "main.py", line 96, in <module>
print(dic_obj[5])
File "main.py", line 91, in __getitem__
return self.root.__getitem__(item)
File "main.py", line 26, in __getitem__
return self.left.__getitem__(item)
File "main.py", line 26, in __getitem__
return self.left.__getitem__(item)
File "main.py", line 26, in __getitem__
return self.left.__getitem__(item)
[Previous line repeated 994 more times]
File "main.py", line 18, in __getitem__
if item > self.root:
RecursionError: maximum recursion depth exceeded in comparison
To reproduce this error you can use the following code:
dic = {k:k for k in range(1,10000)}
dic_obj = SortedDict(dic)
print(dic_obj[5])
where the definition of SortedDict
is given as follow:
class Node():
def __init__(self, root=None, value=None, left=None, right=None, parent=None):
self.parent = parent
self.root, self.value = root, value
self.left = left
self.right = right
def __str__(self):
return f'<Node Object - Root: {self.root}, Value: {self.value}, Parent: {self.parent}>'
def __getitem__(self, item):
if self.root is None:
raise KeyError(item)
if item > self.root:
if self.right is None:
raise KeyError(item)
return self.right.__getitem__(item)
if item < self.root:
if self.left is None:
raise KeyError(item)
return self.left.__getitem__(item)
if item == self.root:
return self.value
def __setitem__(self, key, value):
if self.root is None:
self.root, self.value = key, value
else:
if key > self.root:
if self.right is not None:
self.right.__setitem__(key,value)
else:
self.right = Node(root=key, value=value)
self.right.parent = self
elif key < self.root:
if self.left is not None:
self.left.__setitem__(key,value)
else:
self.left = Node(root=key, value=value)
self.left.parent = self
elif key == self.root:
self.root = value
class SortedDict():
def __init__(self, array: dict):
self.root = Node()
if array:
keys = list(array.keys())
for key in range(len(keys)//2):
self.__setitem__(keys[key],array[keys[key]])
root = Node(root=keys[key+1],value=array[keys[key+1]])
self.root.parent = root
root.left = self.root
self.root = Node()
for key in range(len(keys)//2+1,len(keys)):
self.__setitem__(keys[key],array[keys[key]])
self.root.parent = root
root.right = self.root
self.root = root
def __setitem__(self, key, value):
try:
if key > self.root.root:
if self.root.right is None and self.root.left is None:
node = Node(root=key, value=value)
self.root.parent = node
node.left = self.root
self.root = node
else:
if self.root.right is None:
self.root.right = Node(root=key, value=value, parent=self.root)
else:
node = Node(root=key, value=value)
self.root.parent = node
node.left = self.root
self.root = node
except:
self.root.root = key
self.root.value = value
def __getitem__(self, item):
return self.root.__getitem__(item)