Why is slicing much faster than "manual" assignment?

Question

I was working on an array rotation algorithm, where you rotate an array left by d steps of rotation.

I figured that slicing was a high level abstraction that would be much slower in reality than manually assigning each value in the array to a new location in the rotated array.

It turns out slicing is almost 40 times faster. Why is that?

Here is the code for comparison:

def rot_left_manual(a, d):
    a_length = len(a)
    rot_arr = [0] * a_length
    for i in range(a_length):
        rot_arr[(i-d) % a_length] = a[i]
    return rot_arr


def rot_left_slice(a, d):
    i = d % len(a)
    b = a[i:]
    b += (a[:i])
    return b

I'm using %timeit in Jupyter notebooks to time function speed

"I figured that slicing was a high level abstraction that would be much slower in reality than manually assigning each value in the array to a new location in the rotated array." Because in CPython this abstraction is implemented in C as optimized, compiled code that works with the internals of the object. Your implementation is doing most of the work at the interpreter level. — juanpa.arrivillaga, Oct 16 '20 at 21:29
@juanpa.arrivillaga, that makes sense, but I don't understand how one language use another language in the same code. — jeremy radcliff, Oct 16 '20 at 21:34
See also: [**Efficient way to rotate a list in python**](https://stackoverflow.com/questions/2150108/efficient-way-to-rotate-a-list-in-python) — Christopher Peisert, Oct 16 '20 at 21:41
@jeremyradcliff The underlying hardware of a computer only understands machine code. Python runs at different speeds depending on how the code is translated from the high-level language to low-level machine language. For example, PyPy uses a just-in-time compiler whereas the standard implementation, CPython, uses an interpreter written in C. Since compilation results in faster code than dynamic interpretation, using PyPy is generally faster. It is a common practice to optimize modules by using a lower-level language. — Christopher Peisert, Oct 16 '20 at 21:48
@ChristopherPeisert, thanks for explaining, I have a lot to learn. — jeremy radcliff, Oct 16 '20 at 21:56

score 1 · Accepted Answer · answered Oct 16 '20 at 21:40

Python's binary operations are relatively expensive. Looking at rot_arr[(i-d) % a_length] = a[i] for every execution of the loop

load rot_arr, i, d, a_length and a
call i.__sub__() and create intermediate object
call intermed.__mod__() and create intermediate object
call rot_arr.__setitem__, decrementing and potentially freeing existing obj

With slicing, almost all of the work is done in the list's slice method (implemented in C) which can optimize most of the moves with many fewer calculations and avoiding the expense of looking up or creating all of those python objects.

Why is slicing much faster than "manual" assignment?

1 Answers1