1

I'm working on Problem Solving with Algorithms and Data Structures and come across this question: Design and implement an experiment that will compare the performance of a Python list with a list implemented as a linked list.

Below is my linked list implementation.

class Node(object):

    def __init__(self, data):
        self.data = data
        self.next = None

    def get_data(self):
        return self.data

    def get_next(self):
        return self.next

    def set_data(self, new_data):
        self.data = new_data

    def set_next(self, new_next):
        self.next = new_next


class UnOrderedList(object):
    def __init__(self):
        self.N = 0
        self.head = None

    def size(self):
        return self.N

    def is_empty(self):
        return self.N == 0

    def add(self, data):
        self.N += 1
        temp = Node(data)
        temp.set_next(self.head)
        self.head = temp

    def search(self, data):
        current = self.head
        found = False
        while current and not found:
            if current.get_data() == data:
                found = True
            current = current.get_next()
        return found

    def remove(self, item):
        current = self.head
        previous = None
        while current.get_data() != item:
            previous = current
            current = current.get_next()
        if not previous:
            self.head = current.get_next()
        else:
            previous.set_next(current.get_next())
        self.N -= 1

test remove method:

for i in range(1000, 100000001, 100000):
    list1 = list(range(i))
    list2 = UnOrderedList()
    for a in range(i):
        list2.add(a)

    a = random.randrange(0, i)

    start_time1 = time.time()
    list1.remove(a)
    end_time1 = time.time()

    start_time2 = time.time()
    list2.remove(a)
    end_time2 = time.time()

    print("List time: {0}. Linked List time: {1}".format(end_time1-start_time1, end_time2-start_time2))

For my test, I test linked list's methods with python list's similar methods and linked list always comes up short. So I read a bit on the internet and find that though python list is better in index/search, linked list is supposed to trump it in add or remove.

So my question again is, is linked list always slower than list or what I am doing wrong?

Manh Nguyen Huu
  • 331
  • 1
  • 4
  • 12
  • "Python-style lists (which are flexible arrays, not linked lists)" - Guido van Rossum –  Feb 03 '17 at 08:50
  • 1
    Linked lists are best in *random* add and removal - if you append and remove from the back a plain amortized-growth array (which is what python list is) is going to be faster. Also, linked lists are better in those operations in terms of asymptotic complexity, which doesn't consider constants. With a small number of elements an array is still going to win (it doesn't have to allocate a new node for each element and has better locality). Finally, you are comparing the builtin list - written in C - with a linked list written in Python; CPython alone is going to give you a ~20x slowdown. – Matteo Italia Feb 03 '17 at 08:52
  • BTW, it would be better to name your `size` method `__len__`. That way you can call the built-in `len` function on your `UnOrderedList` instances to get their current size. – PM 2Ring Feb 03 '17 at 08:59
  • @StefanPochmann I just did. – Manh Nguyen Huu Feb 03 '17 at 09:10

5 Answers5

3

Other answers have skipped a very important detail, Linked list will outperform arrays in say remove() method only when the pointer to node to be deleted is provided as parameter, not the value of node.

Otherwise, you will have to search through list, which has same O(n) complexity as removing element from an indexed-based list.

But there is another slightly less important factor at play here. Python list is actually implemented in C. A pure Python program is unlikely to beat a C counterpart, specially when that is written and optimized by experts for many years.

Shihab Shahriar Khan
  • 4,930
  • 1
  • 18
  • 26
  • Not true, linked list can also outperform array in `remove` when given a value, see second part of my answer. – Stefan Pochmann Feb 03 '17 at 10:04
  • @StefanPochmann, it "can", yes of course. But is complexity any better than O(n)? – Shihab Shahriar Khan Feb 03 '17 at 10:11
  • No, complexity isn't better than O(n) (if n is all you're looking at... you could also say O(index(value)). Btw, "pointer to node to be deleted" won't work, you need the *previous* node (unless you use a value-moving hack). – Stefan Pochmann Feb 03 '17 at 10:18
  • @StefanPochmann Correction 1) Mistake, `list2.add(a)` adds to the start. Then `i-1` removes from completely opposite ends of two lists. Linked list is faster here, because it simply deletes the 2nd element every time, while array list has to search till the end. 2) In reality, on equal arrays (do `list1.reverse()`), array list is faster, while linked list gets closer (and faster later) only within 1% length to the list start. But if you need ~start deletion so much, build array list in reverse, delete from end instead (be faster than linked one). So linked list is quite useless in practice. – Artem Novikov Nov 26 '21 at 13:10
1

Python lists are implemented from arrays. So you are comparing Linked Lists to Arrays.

In Linked list you can insert/delete elements easily , but it takes more time in Arrays to move the other elements once the element is inserted/deleted.

Please refer comparison between array and linkedlist for more details. Also this quora question explains the implementation of list in python.

mc20
  • 1,145
  • 1
  • 10
  • 26
  • when you say "move the elements in arrays" do you mean how the entire list shifts when an element in the middle is removed or add? – Manh Nguyen Huu Feb 03 '17 at 08:53
  • 2
    @ManhNguyenHuu That's correct. The built-in `list` can perform that shifting operation rather quickly because it happens at C speed, which is roughly 20 times faster than a Python loop, as Matteo Italia mentioned in the comment on your question. It's still technically O(n), though. – PM 2Ring Feb 03 '17 at 08:57
1

Well your test doesn't test something where a linked list would have an advantage. Here's a test where it does:

>>> from timeit import timeit

>>> linked_list = UnOrderedList()
>>> timeit(lambda: linked_list.add(0), number=10**5)
0.08297442221169149

>>> python_list = []
>>> timeit(lambda: python_list.insert(0, 0), number=10**5)
1.5988611595369093

Or you could just use your own test but without setting a to a random value (i.e., keep it at i-1):

List time: 0.0. Linked List time: 0.0
List time: 0.00100016593933. Linked List time: 0.0
List time: 0.00200009346008. Linked List time: 0.0
List time: 0.00300002098083. Linked List time: 0.0
List time: 0.00300002098083. Linked List time: 0.0
List time: 0.00399994850159. Linked List time: 0.0
List time: 0.00499987602234. Linked List time: 0.0
List time: 0.00699996948242. Linked List time: 0.0
List time: 0.00699996948242. Linked List time: 0.0
List time: 0.00799989700317. Linked List time: 0.0
List time: 0.00999999046326. Linked List time: 0.0
List time: 0.00899982452393. Linked List time: 0.0
...
Stefan Pochmann
  • 27,593
  • 8
  • 44
  • 107
  • I thought linked list's add method is comparable to python list's append method? In your example, one method has O(1) and the other has O(n) complexity. – Manh Nguyen Huu Feb 03 '17 at 09:29
  • 1
    @ManhNguyenHuu No, your linked list adds to the front of the list. So the corresponding functionality is `insert(0, ...)`. Not `append`. And yes, that being O(1) vs O(n) is exactly the point. – Stefan Pochmann Feb 03 '17 at 09:33
  • @ManhNguyenHuu I added another test that also shows a linked list advantage, it's just a small variation of your own test. – Stefan Pochmann Feb 03 '17 at 10:01
0

Python's lists are based on C arrays. Arrays have the advantage over linked lists during iteration and random access, while linked lists have the advantage in random insertions and deletions.

This goes into detail of the theoretical differences betweent the two.

jesper_bk
  • 503
  • 2
  • 9
0

Think it deserves a dedicated answer, in terms of removal by value.

First, fix your test

In your example, arrays are different. Your linked list is in reversed order (100, 99, ...), because list2.add(a) adds to the start.

Quick fix is to reverse Python range, when populating linked list: range(i-1,-1,-1). This way elements will be in the same order as Python list.

Test

Now, test removals from list end to start (i-1, i-2, ...). Array list is faster (Python list) most of the time, while linked list gets closer (and faster further) only within 1% length to the list start (a = round(i / 99)).

So even though Python list needs to shift all the items following the deleted item, it's still faster.

That's due to internal Python implementation optimizations. As well as memory optimizations that are possible due to array list nature - because elements are sequential in memory, you get all the benefits out of data locality.

So yes, removals around list start (very close to start) are faster. But then, if you need such removals, just build Python array in the reversed order at first, and then delete from end, instead. It will be faster (as you observed above).

P.S. There are probably Python libraries with better Linked List implementation, but still, most of the time array list is faster.

Artem Novikov
  • 4,087
  • 1
  • 27
  • 33