
I'm currently working on implementing a Fibonacci heap in Python for my own personal development. While writing the class for a circular, doubly linked list, I ran into an issue I wasn't sure about.

For fast membership testing of the linked list (in order to perform operations like 'remove' and 'merge' faster), I was thinking of adding a hash table (a Python 'set' object) to my linked-list class. See my admittedly very imperfect code below for how I did this:

class Node:
    def __init__(self, value):
        self.value = value
        self.degree = 0      # number of children (for the Fibonacci heap)
        self.p = None        # parent pointer
        self.child = None    # pointer to one of the node's children
        self.mark = False    # 'marked' flag used by decrease-key
        self.next = self     # circular doubly linked list: a lone node
        self.prev = self     # points to itself in both directions

    def __lt__(self, other):
        return self.value < other.value


class Linked_list:
    def __init__(self):
        self.root = None        # minimum node of the circular list
        self.nNodes = 0
        self.members = set()    # hash table for O(1) membership tests

    def add_node(self, node):
        # Splice the node in just after the root, then update the
        # root if the new node is smaller.
        if self.root is None:
            self.root = node
        else:
            self.root.next.prev = node
            node.next = self.root.next
            self.root.next = node
            node.prev = self.root
            if node < self.root:
                self.root = node
        self.members.add(node)
        self.nNodes = len(self.members)

    def find_min(self):
        # Linear scan over the members set for the smallest node.
        minimum = None
        for element in self.members:
            if minimum is None or element < minimum:
                minimum = element
        return minimum

    def remove_node(self, node):
        if node not in self.members:
            raise ValueError('node not in Linked List')
        # Unlink the node, then re-find the minimum if the root was removed.
        node.prev.next, node.next.prev = node.next, node.prev
        self.members.remove(node)
        if self.root not in self.members:
            self.root = self.find_min()
        self.nNodes -= 1

    def merge_linked_list(self, LL2):
        # Drop any nodes shared with LL2 so nothing gets linked twice,
        # then splice the two circular lists together. Assumes both
        # lists are still non-empty at the splice.
        for element in self.members & LL2.members:
            self.remove_node(element)
        self.root.prev.next = LL2.root
        LL2.root.prev.next = self.root
        self.root.prev, LL2.root.prev = LL2.root.prev, self.root.prev
        if LL2.root < self.root:
            self.root = LL2.root
        self.members = self.members | LL2.members
        self.nNodes = len(self.members)

    def print_values(self):
        # Walk the circle once, starting from the root.
        if self.root is None:
            return
        print self.root.value
        j = self.root.next
        while j is not self.root:
            print j.value
            j = j.next

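For what it's worth, here is roughly how I exercise the class (a quick sketch with made-up values):

ll = Linked_list()
for v in [7, 3, 9]:
    ll.add_node(Node(v))

ll.print_values()          # 3 first (the root), then the rest of the circle
smallest = ll.find_min()
ll.remove_node(smallest)   # root is recomputed via find_min
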
My question is, does the hash table take up double the space compared to implementing the linked list without it? When I look at the Node objects in the hash table, they seem to be at the exact same memory location as the independent node objects. For example, if I create a node:

In:  n1 = Node(5)    
In:  print n1
Out: <__main__.Node instance at 0x1041aa320>

and then put this node in a set:

In:   s1 = set()
In:   s1.add(n1)
In:   print s1
Out:  set([<__main__.Node instance at 0x1041aa320>])

which is the same memory location. So it seems like the set doesn't copy the node.
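
To double-check this, here is a small experiment (reusing the Node class above); if the set held a copy, mutating n1 would not be visible through the set:

n1 = Node(5)
s1 = set()
s1.add(n1)

member = next(iter(s1))    # the only element in the set
print member is n1         # True: the set holds the very same object
n1.value = 42
print member.value         # 42: the change is visible through the set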

My question is, what is the space complexity of a linked list of size n with a hash table that keeps track of its elements? Is it n or 2n? And is there anything fundamentally wrong with using a hash table to keep track of elements?

I hope this isn't a duplicate. I tried searching for a post that answered this question, but didn't find anything satisfactory.

lstbl
    If you're planning on using this for a Fibonacci heap, are you sure you want that hash table? That will prevent you from quickly merging two heaps together, something you need to be able to do to implement most operations efficiently. – templatetypedef Jan 03 '16 at 04:35
  • Good point. I hadn't thought of that--I'll definitely remove it in my final implementation. However, I am still curious about this more generally. – lstbl Jan 03 '16 at 04:47

1 Answer


Check In-memory size of a Python structure and How do I determine the size of an object in Python? for complete answers on determining the size of Python objects.

Here are some quick results on a 64-bit machine with Python 3:

>>> import sys
>>> sys.getsizeof(1)
28
>>> sys.getsizeof(set())
224
>>> sys.getsizeof(set(range(100)))
8416

The results are in bytes. This can give you a hint about how big sets are (they are pretty big).
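
Note that sys.getsizeof is shallow: for a set it measures only the internal hash table of references, not the objects those references point to. The 8416 figure above is consistent with 16-byte slots (a cached hash plus a reference): a table of 512 slots at 16 bytes, plus the 224-byte base object, gives 8416. A small sketch to make the distinction visible (the Node class here is just for illustration):

import sys

class Node(object):
    def __init__(self, value):
        self.value = value

nodes = [Node(i) for i in range(100)]
s = set(nodes)

# The set's reported size covers its internal table of references...
print(sys.getsizeof(s))
# ...while the Node objects themselves are counted separately.
print(sum(sys.getsizeof(n) for n in nodes))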

My question is, what is the space complexity of a linked list of size n with a hash table that keeps track of its elements? Is it n or 2n? And is there anything fundamentally wrong with using a hash table to keep track of elements?

Complexity calculations never distinguish between n and 2n; optimisation does. And it's commonly said that "premature optimisation is the root of all evil", as a warning about potential optimisation pitfalls. So do whatever best supports the operations you need.

Diane M
  • Thanks for the reply! I was more curious about how Python stores Nodes when you copy them into a set. Is it storing each element as a pointer to an already-created node (not very costly), or copying the entire node over (more costly)? I think it is the former, since when you put the nodes in the set they have the same memory location as they do upon creation. However, the set object seems to increase in size to the same degree as if it had been created with new 'Node' objects. My guess is that 'sys.getsizeof' calculates the memory of the objects, not the pointers. – lstbl Jan 03 '16 at 17:12
  • Node fields aren't copied when a node is added to a set. The set only holds references, and `sys.getsizeof` calculates the space used for those references. This is why I said sets are big: it works out to roughly 80 bytes per reference here. The inflation is due to the nature of the set implementation: hash tables are kept partially empty to be efficient in time complexity. – Diane M Jan 03 '16 at 19:03