1

I'm trying to improve the speed of some python code a bit and therefore trying to move a standard for loop to either a list comprehension or map call:

    buf = [0 for i in range(self.numLEDs * 3)]
    temp = [0,0,0]
    for x in range(self.numLEDs):
        r = data[x*3]
        g = data[x*3+1]
        b = data[x*3+2]
        temp[self.c_order[0]] = self.gamma[r]
        temp[self.c_order[1]] = self.gamma[g]
        temp[self.c_order[2]] = self.gamma[b]

        buf[x * 3:x * 3 + 3] = temp

c_order is simply another list, in this case [1,2,0]. It controls the channel order for some RGB pixels. gamma is a list 256 elements long that holds gamma corrected values for each of the 8bit channel values.

What'd I'd like to do is somehow completely remove any use of a standard for loop from this bit of code. I've managed to do it without the channel swap, but with the gamma correction and it's twice as fast. Like this:

corrected = [gamma[i] for i in data]
buf[0:len(corrected)] = corrected

How can I swap the order of list elements as I go without a for loop though?

Adam Haile
  • 30,705
  • 58
  • 191
  • 286
  • 2
    Why do you think that using a list comprehension would speed up your code? I think using `map` would actually hurt your performance quite a bit (since you'd have to add some lambdas). List comprehensions are just syntactic sugar for a for loop that builds a new list, so any speedup you can get with a comprehension should also be possible with this version (using a plain for loop). – DaoWen Jun 11 '14 at 04:23
  • 1
    I'm having a hard time following what you're going for here -- Specifically, it's hard to tell how the "speed-up" code relates to the original code. – mgilson Jun 11 '14 at 04:23
  • i don't think list comprehension make sense here because `buf` is not processed in order – Fabricator Jun 11 '14 at 04:24
  • also you are referencing the same `temp` the entire time, wouldn't every element in `buf` get the same value? – Fabricator Jun 11 '14 at 04:26
  • As referenced here (https://wiki.python.org/moin/PythonSpeed/PerformanceTips#Loops) a list comprehension often is faster than a for loop as it is run in C code (or something). And I can confirm that it IS twice as fast. – Adam Haile Jun 11 '14 at 04:42

3 Answers3

3

You can have everything done in numpy in a few lines and slightly faster:

In [69]:

gamma=list(np.random.rand(256))
numLEDs=10
data=list(np.random.randint(0,256,30))
c_order=[0,1,2]
In [70]:

%%timeit 
buf = [0 for i in range(numLEDs * 3)]
temp = [0,0,0]
for x in range(numLEDs):
    r = data[x*3]
    g = data[x*3+1]
    b = data[x*3+2]
    temp[c_order[0]] = gamma[r]
    temp[c_order[1]] = gamma[g]
    temp[c_order[2]] = gamma[b]
    buf[x * 3:x * 3 + 3] = temp
10000 loops, best of 3: 47.3 µs per loop
In [85]:

gamma=np.array(gamma)
data=np.array(data)

In [86]:

%%timeit
data_array=data.reshape(3, -1, order='F')
np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
10000 loops, best of 3: 38.3 µs per loop

When you have a lot of LED's, the numpy version will be much faster than the loop version:

In [98]:

gamma=list(np.random.rand(256))
numLEDs=1000
data=list(np.random.randint(0,256,3000))
c_order=[0,1,2]
In [99]:

%%timeit 
buf = [0 for i in range(numLEDs * 3)]
temp = [0,0,0]
for x in range(numLEDs):
    r = data[x*3]
    g = data[x*3+1]
    b = data[x*3+2]
    temp[c_order[0]] = gamma[r]
    temp[c_order[1]] = gamma[g]
    temp[c_order[2]] = gamma[b]
    buf[x * 3:x * 3 + 3] = temp
100 loops, best of 3: 4.08 ms per loop
In [100]:

gamma=np.array(gamma)
data=np.array(data)

In [101]:

%%timeit
data_array=data.reshape(3, -1, order='F')
np.take(gamma[data_array], c_order, axis=0).ravel(order='F')
1000 loops, best of 3: 244 µs per loop
CT Zhu
  • 52,648
  • 17
  • 120
  • 133
  • Sadly, I'd prefer to avoid any library dependencies. My library so far is very self contained and I'm ok with a little extra runtime if it means keeping it that way. – Adam Haile Jun 11 '14 at 17:03
1

So you need pure python code without any extension library.

To speedup the code:

  1. use local variable in loops.
  2. change for loop to list comprehension.

Here is the code:

class Test(object):

    def __init__(self, n):
        self.numLEDs =  n
        self.c_order = [1, 2, 0]
        self.gamma = [i // 2 for i in range(256)]

    def do1(self, data):
        buf = [0 for i in range(self.numLEDs * 3)]
        temp = [0,0,0]
        for x in range(self.numLEDs):
            r = data[x*3]
            g = data[x*3+1]
            b = data[x*3+2]
            temp[self.c_order[0]] = self.gamma[r]
            temp[self.c_order[1]] = self.gamma[g]
            temp[self.c_order[2]] = self.gamma[b]

            buf[x * 3:x * 3 + 3] = temp
        return buf

    def do2(self, data):
        buf = [0] * (self.numLEDs * 3)
        gamma = self.gamma
        for idx, idx2 in enumerate(self.c_order):
            buf[idx2::3] = [gamma[v] for v in data[idx::3]]
        return buf

import random
random.seed(0)
N = 1000
t = Test(N)
data = [random.randint(0, 255) for i in range(3*N)]
r1 = t.do1(data)
r2 = t.do2(data)
print r1 == r2  # check the result

%timeit t.do1(data)
%timeit t.do2(data)

the output, it's 6x faster:

True
1000 loops, best of 3: 1.1 ms per loop
10000 loops, best of 3: 176 µs per loop
HYRY
  • 94,853
  • 25
  • 187
  • 187
  • PERFECT. That did exactly what I wanted. And now I know about enumerate. That is awesome and very much what I was looking for but had no idea where to look... and the extended slice trick is brilliant :) Thanks! – Adam Haile Jun 12 '14 at 04:02
0

Contrary to popular belief, calling a map function will not give you significant speedup. You may actually see worse performance.

Depending on how long you spend in this section of code, this may be the perfect situation where simply porting this loop to C makes sense. See here.

Make sure that you're actually spending a lot of time in this for-loop, otherwise the overhead of calling your C code will outweigh any potential performance gains.

Read here for some potential alternatives if you decide to use to port this code to C:

  1. ctypes vs C extension
  2. Wrapping a C library in Python: C, Cython or ctypes?
Community
  • 1
  • 1
Martin Konecny
  • 57,827
  • 19
  • 139
  • 159