
Can I somehow speed up this simple code in Python? It takes 0.221 seconds to execute:

for some_list in some_list_of_lists:
    i = 1
    while i < len(some_list):
        a = some_list[i-1]
        b = some_list[i]
        
        i += 2
  • some_list_of_lists ~110 000 items
  • some_list ~10 items

The same code in Cython takes only 0.004 seconds; my goal is to achieve 0.02 seconds or less in Python (without using Cython).

full code:

result = []
for some_list in some_list_of_lists:
    t1 = 1
    t2 = 1
    
    i = 1
    while i < len(some_list):
        a = some_list[i-1]  #int
        b = some_list[i]    #int
        
        t1 *= a
        t2 *= b
        
        i += 2 
    if t1 > t2:
        result.append(some_list)
40k-btc
  • Look at that link for a bunch of tips on code efficiency and speed: https://wiki.python.org/moin/PythonSpeed/PerformanceTips – CR130 Sep 16 '22 at 22:45
  • 5
  • This is a silly question - your code doesn't even do anything. It just nests two pointless loops that read values pairwise from lists in a list, only to do nothing with them. Depending on what operation you need, there may be faster solutions - the fastest way to optimise the code you shared is to comment it out, and replace it with `a, b = some_list_of_lists[-1][-2:]` (maybe some simple extra logic for odd inner length of the last list, or maybe you were after `i` as well). Please explain a bit more about the actual problem. – Grismar Sep 16 '22 at 22:55
  • @Grismar I think this is just the skeleton of the loop, and it actually uses `a` and `b` within the loop. But that part isn't relevant. – Barmar Sep 16 '22 at 23:16
  • @Barmar, I think so too, but if performance is key, it's how `a` and `b` would be used / what they are needed for that would determine what the best performance improvement would be. – Grismar Sep 16 '22 at 23:17
  • @Grismar I added more code to the question, now you can see what **a** and **b** are used for – 40k-btc Sep 16 '22 at 23:27
  • If the sublists are big, then using Numpy vectorized functions should be faster (especially if all the sublists are of the same size). The thing is, direct list indexing is pretty slow. If you want speed, then using a Python interpreter is certainly not a good idea. Consider using PyPy (or Pyston), which are designed to be faster for such code. Why doesn't Cython fit your needs? It seems a good idea. – Jérôme Richard Sep 17 '22 at 01:05
  • @JérômeRichard Yeah, Cython is doing great, I just can't find an easy method for debugging Cython code in VS Code – 40k-btc Sep 17 '22 at 01:12
  • You don't want to use cython, which is fine, but you can use common packages like numpy, yes? – Grismar Sep 17 '22 at 11:03
  • Ok, this is another question then. I don't know much about VS Code, but Cython is mainly based on the GNU toolchain (while I expect VS Code to be tied to the MS or LLVM toolchain due to the licenses). I would personally use GDB, which should work well. Note that [there are ways to use GDB directly in VS Code](https://medium.com/@pamirghimire/debugging-with-gdb-on-windows-using-visual-studio-code-81ba70b562f3). This requires the code to be compiled in debug mode (no optimization and the `-g` option). You also need to put a breakpoint before the execution of the function. – Jérôme Richard Sep 17 '22 at 11:34
  • Note that if you create this data structure once and then iterate over it many times in Cython, then using a list of Numpy arrays can be significantly faster (while the creation could be slower). If the size does not change at runtime, you can even build one big array and make the code much faster than the current one. Meanwhile, the pure-CPython code can barely be made faster. – Jérôme Richard Sep 17 '22 at 11:44
  • @Grismar Yes. I tried a numpy array but it's slower than a list. https://stackoverflow.com/questions/35232406/why-is-a-for-over-a-python-list-faster-than-over-a-numpy-array – 40k-btc Sep 17 '22 at 13:07
  • "I try numpy array but its slower than list." - in most cases, this just means that you're not using `numpy` correctly - see my answer below for an example of using `numpy` that's way faster. – Grismar Sep 18 '22 at 01:21

2 Answers


The main problem here is that your implementation is very inefficient.

Compare:

from timeit import timeit
from random import random
import numpy as np  # for a numpy solution
from math import prod  # for a better Python solution

# just some random example data that is "some list of lists" (of numbers)
example_data = [
    [random() * 10 for _ in range(100)]
    for _ in range(100)
]


def f(some_list_of_lists):
    # your code, in a function, so it's easy to time
    result = []
    for some_list in some_list_of_lists:
        t1 = 1
        t2 = 1

        i = 1
        while i < len(some_list):
            a = some_list[i - 1]  # int
            b = some_list[i]  # int

            t1 *= a
            t2 *= b

            i += 2
        if t1 > t2:
            result.append(some_list)
    return result


# the same data again, but in a numpy array
example_array = np.array(example_data)


def numpy_f(some_array):
    # this function does exactly the same, it selects those 'lists' for which
    # the product of the even elements is greater than the product of the odd ones
    return some_array[
        np.prod(some_array[:,::2], axis=1) > np.prod(some_array[:,1::2], axis=1)
    ]


def better_python_f(some_list_of_lists):
    return [
        xs for xs in some_list_of_lists if prod(xs[::2]) > prod(xs[1::2])
    ]


# showing that the results are the same
simple_example = [[2, 1, 1, 1], [1, 2, 1, 1], [2, 1, .5, 1], [.5, 1, 1, 1], [1, .5, 1, 1]]
print(f(simple_example))
print(numpy_f(np.array(simple_example)))
print(better_python_f(simple_example))

# timing running each 10,000 times on the large example data set
print('yours:', timeit(lambda: f(example_data), number=10000))
print('numpy:', timeit(lambda: numpy_f(example_array), number=10000))
print('mine :', timeit(lambda: better_python_f(example_data), number=10000))

Output:

[[2, 1, 1, 1], [1, 0.5, 1, 1]]
[[2.  1.  1.  1. ]
 [1.  0.5 1.  1. ]]
[[2, 1, 1, 1], [1, 0.5, 1, 1]]
yours: 5.2597994999996445
numpy: 0.1987909999988915
mine : 0.7029579999998532

This shows that using just plain Python, without external libraries, you can easily get almost an order of magnitude faster. I'm sure someone can come up with an even faster method just in Python.

However, numpy is perfect for this, and even the trivial solution I wrote is more than 25x faster than yours - and again I think someone more experienced in the application of numpy can probably write an even faster solution using numpy.

The key takeaway here shouldn't be "I need to use numpy because it's fast" though. The key thing is that you should think about what problem you're actually solving and then decide what the most efficient way is to do exactly that.

In your case, what the code actually does is "for a series of number series, decide for each series whether the product of the numbers at even indices is greater than the product of the numbers at odd indices, and keep only those series where this is the case". To me, that quickly points at thinking about slicing, efficient ways to compute a product, and how to select and return the desired result.

Not only does `better_python_f` do exactly what it needs to, the code almost reads like a description of the problem:

[
    xs for xs in some_list_of_lists if prod(xs[::2]) > prod(xs[1::2])
]

"A list of all xs in the original list, where the product of even elements is greater than the product of odd elements."

Of course, it does require that you know about the Python concepts 'list comprehension' (`x for x in .. if <condition on x>`) and 'slicing' (e.g. `xs[::2]`), and that the `math` module has a fairly efficient `prod()` function to compute the product of the elements of an iterable like a list.
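For anyone unfamiliar with those pieces, a short illustration of what the slices and `prod()` evaluate to, on a made-up list (this is plain standard-library behaviour, not part of the answer's benchmark):

from math import prod

xs = [3, 1, 4, 1, 5, 9]
print(xs[::2])    # elements at even indices: [3, 4, 5]
print(xs[1::2])   # elements at odd indices:  [1, 1, 9]
print(prod(xs[::2]), prod(xs[1::2]))  # 60 9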

Grismar

Threading and multiprocessing are both standard-library modules that can be used to speed things up and get more done in less time. I'm not sure how much improvement you'll see in this example, but these modules are great to know about in general.

Multiprocessing vs Threading Python
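For what it's worth, a minimal sketch of how `multiprocessing.Pool` could be applied to the question's filter (the helper function and chunk size here are illustrative, not tuned; as the comment below points out, pickling overhead may well make this slower than the plain loop for data of this shape):

from math import prod
from multiprocessing import Pool

def keep(some_list):
    # same test as in the question: product of even-index items vs odd-index items
    return prod(some_list[::2]) > prod(some_list[1::2])

def parallel_filter(some_list_of_lists, processes=4):
    # farm the per-list test out to worker processes, then keep the lists that passed
    with Pool(processes) as pool:
        flags = pool.map(keep, some_list_of_lists, chunksize=1000)
    return [xs for xs, flag in zip(some_list_of_lists, flags) if flag]

if __name__ == "__main__":
    data = [[2, 1, 1, 1], [1, 2, 1, 1]]
    print(parallel_filter(data))  # [[2, 1, 1, 1]]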

Jackie
  • Pickling will certainly make the code using multiprocessing slower than the serial one, and the threading module is useless here because of the GIL. Thus, it is not very useful. – Jérôme Richard Sep 17 '22 at 00:50