I've created a linked list in Cython and so far it seems to perform good, but for my specific application, any performance gain is welcome.
The linked is being used be used to construct a binary BVH tree using the following algorithm (pyseudocode):
bvh_root = bvh_node()
bvh_root.subnodes = linked_list(all_leaf_nodes)
list = linked_list()
list.push_front(bvh_root)
while not list.length == 0:
current_node = list.pop_back()
if current_node.subnodes <= 2:
continue
center_a, axis = current_node.calculate_longest_axis_center()
node_a = bvh_node()
node_b = bvh_node()
while not current_node.subnodes.length == 0:
subnode = current_node.subnodes.pop_back()
center_b = subnode.calculate_axis_center(axis)
if center_b < center_a:
node_a.subnodes.push_front(subnode)
else:
node_b.subnodes.push_front(subnode)
if node_a.subnodes.length > 0:
current_node.subnodes.push_front(node_a)
if node_b.subnodes.length > 0:
current_node.subnodes.push_front(node_b)
list.push_front(node_a)
list.push_front(node_b)
I was reading through common bottlenecks in C code and one of them caught my attention, it seems like malloc() is a rather slow function since it has to interact with the kernel and OS. It turns out, I call malloc() each time I insert a element on the list, and free each time I pop a element.
Only showing those two functions that matter more since the code is very big:
cdef void lkdlist_push(LinkedList* lst, void* data) nogil:
cdef ListNode* node = <ListNode*> malloc(sizeof(ListNode))
node.next = NULL
node.data = data
if lst.tail == NULL:
lst.head = node
lst.tail = node
else:
lst.tail.next = node
lst.tail = node
lst.length += 1
cdef void* lkdlist_pop(LinkedList* lst) nogil:
cdef ListNode* node
cdef void* data
if lst.length <= 0:
return NULL
elif lst.length == 1:
node = lst.head
lst.head = NULL
lst.tail = NULL
else:
node = lst.head
lst.head = node.next
data = node.data
free(node)
lst.length -= 1
return data
I was wondering if instead of freeing a node, I just stored it somewhere to use it when I need to push an element could speedup a linked list.
Or all the logistics needed to store and recover unused nodes would outslow malloc?