
I have working code like this, but it is rather slow.

def halfconvolution(g,w,dz):
    convo=np.zeros_like(g)
    for i in range(0,len(g)):
        sum=0
        for j in range(0,i):
            sum+=g[j]*w[(i-j)]*dz
        convo[i] = -sum
    return convo

I am trying to turn it into a list comprehension, but I am struggling.
I tried:

convo=[-g*w[i-j] for i in g for j in w]
StarStrides
  • Why are you under the impression that a list comprehension will "be faster"? Don't obfuscate a loop for the sake of reducing lines of code – OneCricketeer Mar 25 '18 at 12:27
  • You should be using `for i in range(len(g))` and `for j in range(i)` in the comprehension, same as you do in the loop. Don't forget to measure and see if you really get better performance. – orip Mar 25 '18 at 12:31
  • @cricket_007 I for some reason thought list comprehensions were faster... I timed it, and indeed it is slower. Any recommendations on speeding up such code? – StarStrides Mar 25 '18 at 12:58
  • 1
    You are using Numpy, but doing loops to return an array... You ought to use *vectorized operations* over entire columns at once rather than indexing the column and updating individual cells – OneCricketeer Mar 25 '18 at 19:17

3 Answers


I am not sure whether this improves the performance, but here it is as a list comprehension, as you asked:

convo = [-sum(g[j] * w[i - j] * dz for j in range(0, i)) for i in range(0, len(g))]

A faster implementation using NumPy:

# make the matrices square
g = np.repeat(g, g.shape[0]).reshape(g.shape[0], g.shape[0], order='F')
w = np.repeat(w, w.shape[0]).reshape(w.shape[0], w.shape[0], order='F')

# take the lower half of g
g = np.tril(g, k=-1)

# shift each column by its index number
# see: https://stackoverflow.com/questions/20360675/roll-rows-of-a-matrix-independently
rows_w, column_indices_w = np.ogrid[:w.shape[0], :w.shape[1]]
shift = np.arange(w.shape[0])  # all shifts are non-negative, so no wrap-around correction is needed
w = w[rows_w, column_indices_w - shift[:, np.newaxis]].T

convo = -np.sum(g * w, axis=1) * dz  # note the minus sign, as in the original loop

For this to work, w and g need to be the same size, but otherwise I'm sure a workaround can be found.

I hope this is a more acceptable speedup for you. Always try to rewrite your logic/problem into vector/matrix operations.
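As a further sketch (my addition, not part of the original answer, assuming g and w are equal-length 1-D NumPy arrays): the triangular sum is really just a truncated discrete convolution, so `np.convolve` can replace all of the index juggling. The only correction needed is subtracting the j = i term, which the original loop excludes:

```python
import numpy as np

def halfconvolution_np(g, w, dz):
    n = len(g)
    # full linear convolution: c[i] = sum over j <= i of g[j] * w[i-j]
    c = np.convolve(g, w)[:n]
    # the original loop only sums j < i, so drop the j = i term g[i] * w[0]
    return -dz * (c - g * w[0])
```

This keeps the O(n^2) cost but moves it into a single optimized C routine instead of building n-by-n intermediate matrices.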

Maarten-vd-Sande

The inner loop can be replaced by the built-in sum function (don't shadow it with a variable of the same name, as the original code does).

Then you append the outer loop to the end of that:

 [-sum(g[j]*w[i-j]*dz for j in range(i)) for i in range(len(g))]
OneCricketeer

Don't use list comprehensions for performance reasons

Use

  • Numba
  • Cython
  • Vectorized Numpy operations

Numba

import numba as nb
import numpy as np
import time

@nb.njit(fastmath=True)
def halfconvolution(g, w, dz):
    convo = np.empty(g.shape[0], dtype=g.dtype)
    for i in range(g.shape[0]):
        acc = 0.  # avoid shadowing the built-in sum
        for j in range(0, i):
            acc += g[j] * w[i - j] * dz
        convo[i] = -acc
    return convo

g=np.random.rand(1000)
w=np.random.rand(1000)
dz=0.15

t1 = time.time()
for i in range(1000):
    #res = halfconvolution(g, w, dz)
    res = [-sum(g[j]*w[i-j]*dz for j in range(i)) for i in range(len(g))]

print((time.time() - t1) / 1000, "s per iteration")
print("Done")

Performance

List Comprehension: 0.27s per iteration 
Numba Version: 0.6ms per iteration

So there is a factor of roughly 500 between these two versions. If you want to call this function on multiple arrays at once, you can also parallelize the problem easily, and you should get at least another "number of cores" speedup.
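If Numba is not an option, a similar speedup is available from an FFT-based convolution (my sketch, not part of the original answer, using only NumPy): zero-pad both arrays so that the circular convolution equals the linear one, then subtract the j = i term that the original loop excludes. This drops the cost from O(n^2) to O(n log n):

```python
import numpy as np

def halfconvolution_fft(g, w, dz):
    n = len(g)
    m = 2 * n  # pad length >= 2n - 1, so circular convolution equals linear
    # linear convolution via real FFTs: c[i] = sum over j <= i of g[j] * w[i-j]
    c = np.fft.irfft(np.fft.rfft(g, m) * np.fft.rfft(w, m), m)[:n]
    # the original loop only sums j < i, so drop the j = i term g[i] * w[0]
    return -dz * (c - g * w[0])
```

For the 1000-element arrays benchmarked here the FFT route should be competitive with the compiled loop, and it scales better as the arrays grow.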

max9111