DFT in Python taking significantly longer than C

Question

I'm currently working on translating some C code To Python. This code is being used to help identify errors arising from the CLEAN algorithm used in Radio Astronomy. In order to do this analysis the value of the Fourier Transforms of Intensity Maps, Q Stokes Map and U Stokes Map must be found at specific pixel values (given by ANT_pix). These Maps are just 257*257 arrays.

The below code takes a few seconds to run with C but takes hours to run with Python. I'm pretty sure that it is terribly optimized as my knowledge of Python is quite poor.

Thanks for any help you can give.

Update My question is if there is a better way to implement the loops in Python which will speed things up. I've read quite a few answer here for other questions on Python which recommend avoiding nested for loops in Python if possible and I'm just wondering if anyone knows a good way of implementing something like the Python code below without the loops or with better optimised loops. I realise this may be a tall order though!

I've been using the FFT up till now but my supervisor wants to see what sort of difference the DFT will make. This is because the Antenna position will not, in general, occur at exact pixels values. Using FFT requires round to the closest pixel value.

I'm using Python as CASA, the computer program used to reduce Radio Astronomy datasets is written in python and implementing Python scripts in it is far far easier than C.

Original Code

def DFT_Vis(ANT_Pix="",IMap="",QMap="",UMap="", NMap="", Nvis=""):

UV=numpy.zeros([Nvis,6])
Offset=(NMap+1)/2
ANT=ANT_Pix+Offset;

i=0
l=0
k=0
SumI=0
SumRL=0
SumLR=0


z=0

RL=QMap+1j*UMap
LR=QMap-1j*UMap

Factor=[math.e**(-2j*math.pi*z/NMap) for z in range(NMap)]

for i in range(Nvis):
    X=ANT[i,0]
    Y=ANT[i,1]

    for l in range(NMap):

        for k in range(NMap):

            Temp=Factor[int((X*l)%NMap)]*Factor[int((Y*k)%NMap)];

            SumI+=IMap[l,k]*Temp
            SumRL+=RL[l,k]*Temp
            SumLR+=IMap[l,k]*Temp               


        k=1

    UV[i,0]=SumI.real
    UV[i,1]=SumI.imag
    UV[i,2]=SumRL.real
    UV[i,3]=SumRL.imag
    UV[i,4]=SumLR.real
    UV[i,5]=SumLR.imag
    l=1
    k=1
    SumI=0
    SumRL=0
    SumLR=0

return(UV)

Is hours an exaggeration or does this truly take hours to run? — Richard J. Ross III, Apr 08 '12 at 14:54
"The below code takes a few seconds to run with C". Er, the code is Python code. We can't see the C code. But it should hardly be surprising that hand rolled pure Python DFT code is likely to be slow. Why do you think you should be re-implementing this code rather than using one of the known libraries that do it well? — David Heffernan, Apr 08 '12 at 15:07
Unfortunately it is literally taking hours. Pretty much the same code but C formated only takes seconds. I've updated the question with further information. — Eoin Murphy, Apr 08 '12 at 15:32
What values do you pass to this function? Is a single call taking hours or is it being called in a loop? — TryPyPy, Apr 08 '12 at 20:14

score 3 · Answer 1 · answered Apr 08 '12 at 15:03

3

You should probably use numpy's fourier transform code, rather than writing your own: http://docs.scipy.org/doc/numpy/reference/routines.fft.html

answered Apr 08 '12 at 15:03

Marcin

48,559
18
128
201

score 1 · Answer 2 · answered Apr 08 '12 at 17:50

1

If you are interested in boosting the performance of your script cython could be an option.

answered Apr 08 '12 at 17:50

Krzysztof Rosiński

1,488
1
11
24

score 1 · Answer 3 · edited May 23 '17 at 10:08

I am not an expert on the FFT, but my understanding is that the FFT is simply a fast way to compute the DFT. So to me your question sounds like you are trying to write a bubble sort algorithm to see if it gives a better answer than quicksort. They are both sorting algorithms that would give the same result!

So I am questioning your basic premise. I am wondering if you can just change your rounding on your data and get the same result from the SciPy FFT code.

Also, according to my DSP textbook, the FFT can produce a more accurate answer than computing the DFT the long way, simply because floating point operations are inexact, and the FFT invokes fewer floating point operations along the way to finding the correct answer.

If you have some working C code that does the calculation you want, you could always wrap the C code to let you call it from Python. Discussion here: Wrapping a C library in Python: C, Cython or ctypes?

To answer your actual question: as @ZoZo123 noted, it would be a big win to change from range() to xrange(). With range(), Python has to build a list of numbers, and then destroy the list when done; with xrange() Python just makes an iterator that yields up the numbers one at a time. (But note that in Python 3.x, range() makes an iterator and there is no xrange().)

Also, if this code does not have to integrate with the rest of your code, you might try running this code under PyPy. This is exactly the sort of code that PyPy can best optimize. The problem with PyPy is that currently your project must be "pure" Python, and it looks like you are using NumPy. (There are projects to get NumPy and PyPy to work together, but that's not done yet.) http://pypy.org/

If this code does need to integrate with the rest of your code, then I think you need to look at Cython (as noted by @Krzysztof Rosiński).

Actually, PyPy supports a lot of basic Numpy features already, so this code should just work after `import numpypy` and `import numpy` (see http://buildbot.pypy.org/numpy-status/latest.html). — TryPyPy, Apr 08 '12 at 20:20
@TryPyPy, my understanding is that PyPy support for NumPy stuff is still experimental and evolving. So if he has a large body of SciPy code, I doubt he can just drop all of it into PyPy; but he should be able to run the shown example code. Thus my comment about "if this code does not have to integrate with the rest of your code". I look forward to the bright future when PyPy runs all of SciPy and all the scientists use PyPy by default; but we aren't there yet. — steveha, Apr 08 '12 at 20:34

DFT in Python taking significantly longer than C

3 Answers3