0

I learned from this answer that the fastest method to insert an item at an index would be like this:

a = [1, 2, 3, 4, 5]
a[2:2] = ["12345"]

But I don't know why. And I learned from this answer that the complexity would never be O(1).

I tried the following and verified the statement that it should be larger than O(1).

python3 -m timeit --setup="a = list(range(1000))" "a[500:500] = [3]"
50000 loops, best of 5: 3.89 usec per loop

python3 -m timeit --setup="a = list(range(100000))" "a[500:500] = [3]"
10000 loops, best of 5: 20.1 usec per loop

python3 -m timeit --setup="a = list(range(1000000))" "a[500:500] = [3]"
1000 loops, best of 5: 340 usec per loop

I thought that we can pinpoint the address/pointer in O(1) and then we only need to direct that address to the new item, and it would be O(1). I thought I should be wrong since that would scoot over the addresses of the items on the right side.

I tried to see what is a[2:2] but it turns out to be just an empty list. I thought it would be possible to separate the indexing and the assignment. I mean if we can first get the pointer of a particular index and then make it point to a new item?

In [14]: a = [1, 2, 3, 4, 5]
In [15]: b = a[2:2]
In [16]: b = ["12345"]
In [17]: b
Out[17]: ['12345']
In [18]: a
Out[18]: [1, 2, 3, 4, 5]
In [19]: a[2:2] = ["12345"]
In [20]: a
Out[20]: [1, 2, '12345', 3, 4, 5]

In the code above I want to get the pointer by b=a[2:2] and then redirect that to a new item "12345" by b = ["12345"].

What's going on under the hood? Any suggestions would be very appreciated. Thanks in advance.

Lerner Zhang
  • 6,184
  • 2
  • 49
  • 66
  • What did you find when you tried to measure the time of an insert operation for different sizes of lists? – mkrieger1 Jul 29 '20 at 14:16
  • This answer gives explanations and links to further reading material about what's going on under the hood: [How is Python's List Implemented?](https://stackoverflow.com/questions/3917574/how-is-pythons-list-implemented) – mkrieger1 Jul 29 '20 at 14:16
  • That first question you link, I cannot reproduce the different timings. It reports 8x (!) slower for `list.insert` vs. slice assignment. This seems more than unrealistic (even for older versions of Python) and I obtain the same performance for Python 3.8. Whether or not this operation is O(1) depends on the implementation of the list type. For most implementations it won't be O(1) though. – a_guest Jul 29 '20 at 14:18
  • on average, the time complexity for insertion within a `list` is `O(1)` but within the *worst case* it can be complexity of `O(n)`. – de_classified Jul 29 '20 at 14:19
  • In the [answer](https://stackoverflow.com/a/27073665/) to one of the questions you have linked to it says that the insert operation has linear time complexity. Does that not answer your question? – mkrieger1 Jul 29 '20 at 14:19
  • @FishingCode Where did you find that O(1)? – mkrieger1 Jul 29 '20 at 14:20
  • @mkrieger1 https://stackoverflow.com/questions/60050808/does-insert-at-the-end-of-a-list-have-o1-time-complexity. – de_classified Jul 29 '20 at 14:21
  • 1
    @FishingCode Note that it says *at the end*. – mkrieger1 Jul 29 '20 at 14:23
  • @mkrieger1 You are right. I tried and find that the time increases as the list size increases. – Lerner Zhang Jul 29 '20 at 14:45

3 Answers3

1

This a[2:2] = [123] means that the empty list that was from the second index and upto the second index is being assigned to a non empty list which will henceforth fill that space.

There are other ways to insert elements in list. One of the most common and direct way is list.insert(index, element).

Divyessh
  • 2,540
  • 1
  • 7
  • 24
  • How can I get the pointer of that empty list? Can I do that in another way? Separating the indexing and the assigning? – Lerner Zhang Jul 29 '20 at 14:52
  • I think what I mean is why or how we can insert an item by filling an empty space? What is the empty space? Does it have an address? – Lerner Zhang Jul 29 '20 at 15:37
  • Can we fill an empty space by `[] = [3, 4]`? I don't think it is the same as `a[2:2] = [4]`, is it? – Lerner Zhang Jul 29 '20 at 15:39
  • you should think of it like this: `b = [], b = [3,4]`. here you are filling that empty space with another list right. this is what you are doing with `a[2:2] = [4]`, – Divyessh Jul 29 '20 at 19:00
0

This question requires some basic understanding of how python works. Lets start with assignment of value to a variable. In python the values are given a memory location and variables are just tags to that location. When you change the value of the variable the value in that memory location is not changed, rather a new memory is assigned to the new value and the tag not points to the new memory. When a memory has not tags pointing to it, it is garbage collected.E.g.

a = 5
b = a
b = 6
print(a, b)

this will give you output 5 6. a is still 5 and b is now 6.

Now let's talk about slicing operation in a list in python. The simplest definition is, ar[k:m] gives list starting from index k to m, m not including.

Hence ar[n:n] would be an empty list starting at n.(Python is awesome that way). Now when this statement ar[k:m] is on the left side of an assignment this is reference to the list in that range. E.g.

ar = [0,1,2,3,4,5,6]
ar[1:2] = [13]

this will take the list starting at index 1 till 2 not including index 2, i.e. [1] and replace it with a list [13]. Thus, ar now becomes [0,13,2,3,4,5,6]

now if we do

ar[2:5] = [3]

then ar becomes [0,13,3,5,6]

Now comes the funs part. ar[2:2] is a list starting at index 2 and having no element thus an assignment will not change any existing entry, but rather create a list starting at index 2:

ar[2:2] = [1,2,3]

ar becomes [0,13,1,2,3,3,5,6]

Now, a bit confusing part. When the slicing operating is at the right side of assignment operator then a copy of the list is created and assigned to the variable on the left.

b = ar[2:4]

b becomes [1,2]

since b is a copy of ar any change you do to b will not effect ar.

0

Beware: the first referenced question was about producing a modified version of a list without changing the original.

Using insert or slice notation for adding one single element is more or less the same thing. Slice notation is better is you insert a sublist, and the longer the sublist, the more difference.

As under the hood a Python list is a dynamic sized array (what C++ calls a vector, or Java an ArrayList), inserting something in the list requires:

  • optionnaly allocate a new array and copy the original there if the reserved (and unused) size is not enough for the additional content
  • unless you add at the end, move everything starting at the insertion index to the end of the array
  • put the new content in the free place

That is the reason why:

  • adding an element to the end of the array is O(1)
  • inserting an element at the beginning of the array is O(n) - inserting it at a random position averages to n/2 which is still O(n)
  • inserting m elements in an array with one single slice operation is O(n)
  • inserting m elements in an array with m insert operations is O(n*m)

Any of those operations can get a significant overhead if it requires a new allocation.

Serge Ballesta
  • 143,923
  • 11
  • 122
  • 252