Is there any way I can iterate through an array faster than a for loop?

Question

I'm writing a code that compares fluxes of pixels on an astronomical map with the corresponding area on another one. Both maps are numpy arrays of data.

In order to do that, I need to transform pixel indexes on the first map (Av) to their equivalent on sky coordinates, then transform those sky coordinates to their pixel indexes equivalent on the second map (CO). Then, I scale the fluxes of the second map to match the values of the first map. After that, I have to keep handling the data.

The problem is that with thousands of pixels on the first map, the code is taking a really long time to finish doing what it's supposed to do, which is a hassle for troubleshooting. I've figured out that the slowest thing on this part of the code is the for loop.

Is there any way I can iterate through a numpy array, being able to work with the indexes and calculate data from every pixel, faster than a for loop? Is there a better way to do this at all?

In pseudocode, my code is something like this:

for pixel i,j in 1st map:
    sky_x1,sky_y1 = pixel_2_skycoord(i,j)
    i2,j2 = skycoord_2_pixel(sky_x1,sky_y1)

    Avmap.append(Avflux[i,j])
    COmap.append(COflux[i2,j2]*scale)

The actual code is:

for i in xrange(0,sAv_y-1):
    for j in xrange(0,sAv_x-1):
        if not np.isnan(Avdata[i,j]):           

            y,x=wcs.utils.skycoord_to_pixel(wcs.utils.pixel_to_skycoord(i,j,wAv,0),wcs=wCO)

            x=x.astype(int)+0 #the zero is because i don't understand the problem with numpy but it fixes it anyway
            y=y.astype(int)+0 #i couldn't get the number from an array with 1 value but adding zero resolves it somehow
            COflux=COdata[x,y]

            ylist.append(Avdata[i,j])
            xlist.append(COflux*(AvArea/COArea))

BTW: this is incorrect: `if (Avdata[i,j] is not np.nan)` You have to use something like `np.isnan` — juanpa.arrivillaga, Mar 21 '19 at 05:03

score 3 · Accepted Answer · edited Mar 21 '19 at 07:28

3

The culprit here is the two for loops. Numpy has many functions that prevent the use of for loops to allow fast compiled code. The trick is to vectorize your code.

You can look into numpy's meshgrid function to convert this data into a vectorized form that you can then use something like this SO question to apply an arbitrary function to that vector.

Something along the lines of:

x_width = 15
y_width = 10

x, y = np.meshgrid(range(x_width), range(y_width))

def translate(x, y, x_o, y_o):
    x_new = x + x_o
    y_new = y + y_o
    return x_new, y_new

x_new, y_new = translate(x, y, 3, 3)

x_new[4,5], y[4,5]

(8, 4)

edited Mar 21 '19 at 07:28

computersciencestudent

23
1

answered Mar 21 '19 at 04:51

Jurgen Strydom

3,540
1
23
30

This, and the question you shared, was really helpful. It wasn't a solution though, because now that I was not iterating through an array but applying a function over a whole chunk of memory my 8GB of RAM, and some more of Swap Memory, weren't up to the task. I find this solution works when you can spare RAM in order to save time. I will certainly use this in the future. – condosz Mar 24 '19 at 18:27
Sure that is the tradeoff you have most of the time, CPU vs RAM. I'm surprised though that something that maps to pixels on a screen would fill your ram. There may be other optimizations to make (maybe working in place instead of copying arrays). I'm glad this helped. – Jurgen Strydom Mar 25 '19 at 06:56

score 1 · Answer 2 · answered Mar 21 '19 at 23:19

You must avoid loops, and do the heavy computation in the underlying C code, in Numpy or in Astropy for the sky/pixel conversion. There are several options to do this with astropy.wcs.

The first one is with SkyCoord. Let's first create a grid of value for your pixel indices:

In [30]: xx, yy = np.mgrid[:5, :5] 
    ...: xx, yy 
Out[30]: 
(array([[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3],
        [4, 4, 4, 4, 4]]), array([[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]))

Now we can create the SkyCoord object (which is a Numpy array subclass), from the pixels indices, and using the wcs:

In [33]: from astropy.coordinates import SkyCoord 
    ...: sky = SkyCoord.from_pixel(xx, yy, wcs) 
    ...: sky                                                                                   
Out[33]: 
<SkyCoord (FK5: equinox=2000.0): (ra, dec) in deg
    [[(53.17127889, -27.78771333), (53.17127889, -27.78765778),
      (53.17127889, -27.78760222), (53.17127889, -27.78754667),
      (53.17127889, -27.78749111)],
      ....

Note that this is using wcs.utils.skycoord_to_pixel. This object also has a method to project to pixel with a wcs. I will the same here for practical purpose:

In [34]: sky.to_pixel(wcs)                                                                     
Out[34]: 
(array([[ 0.00000000e+00, -1.11022302e-16, -2.22044605e-16,
         -3.33066907e-16,  1.13149046e-10],
        ...
        [ 4.00000000e+00,  4.00000000e+00,  4.00000000e+00,
          4.00000000e+00,  4.00000000e+00]]),
 array([[-6.31503738e-11,  1.00000000e+00,  2.00000000e+00,
          3.00000000e+00,  4.00000000e+00],
        ...
        [-1.11457732e-10,  1.00000000e+00,  2.00000000e+00,
          3.00000000e+00,  4.00000000e+00]]))

We get a tuple of float values for the new x and y indices. So you will need to round these values and convert to int to use that as array indices.

The second option is to use the lower level functions, e.g. wcs.pixel_to_world_values and wcs.world_to_pixel_values, which takes Nx2 arrays and return this as well:

In [37]: wcs.pixel_to_world_values(np.array([xx.ravel(), yy.ravel()]).T)                       
Out[37]: 
array([[ 53.17127889, -27.78771333],
       [ 53.17127889, -27.78765778],
       [ 53.17127889, -27.78760222],
       [ 53.17127889, -27.78754667],
       ...

Thank you, I find these packages hard to understand since I cannot take the time to study how they work before having to use them quickly. — condosz, Mar 24 '19 at 18:29

Is there any way I can iterate through an array faster than a for loop?

2 Answers2