
1. Introduction

Assume there is a 3-D array of shape (365, 100, 100): Prec.

  • It represents the daily precipitation over an area for a whole year.
  • The first dimension represents the time series (365 days).
  • The last two dimensions represent the spatial distribution (for example, 10,000 grid cells of size 1 km × 1 km); a synthetic example follows this list.
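
For concreteness, here is a minimal sketch of such an array, with random values standing in for real data (the name `prec`, the value range, and the units are assumptions for illustration only):

import numpy as np

# Synthetic stand-in: 365 daily fields on a 100 x 100 grid,
# with random values playing the role of precipitation in mm.
prec = np.random.rand(365, 100, 100) * 20  # shape (365, 100, 100)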

2. Attempt

For each grid cell in the area, test whether its precipitation is below a certain threshold Pd that separates dry days from wet days. I want to count the dry days over the whole year.

3. My code

import numpy as np

freq = np.zeros((100, 100), dtype=int)  # np.zeros takes the shape as a tuple
Pd = xxx  # wet/dry threshold

for i in range(prec.shape[0]):          # days
    for j in range(prec.shape[1]):      # grid rows
        for k in range(prec.shape[2]):  # grid columns
            if prec[i, j, k] < Pd:
                freq[j, k] += 1

I think so many loops must waste time. Is there a cleaner way to achieve the same result?
Any advice would be appreciated!

Han Zhengzu
  • 3,694
  • 7
  • 44
  • 94
    Divakar's answer below is excellent. Personally, I think that for this type of stuff, `numpy` is too low level, and `pandas` is the way to go. – Ami Tavory Mar 26 '16 at 11:30
  • I'm only familiar with dataframes in `pandas`. I'll try n-d arrays in `pandas` some day! Thanks! – Han Zhengzu Mar 26 '16 at 11:35

1 Answer


You are comparing against Pd and summing along the first axis of prec. That comparison can be performed with NumPy broadcasting in a vectorized manner, and the result then summed along the first axis with .sum(0), like so -

freq = (prec < Pd).sum(0)
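
As a quick sanity check (a sketch using synthetic data and an assumed threshold, neither taken from the question), the one-liner can be compared against an equivalent counting call:

Pd = 1.0  # assumed wet/dry threshold, for the check only
freq = (prec < Pd).sum(0)  # boolean mask summed along the time axis

# np.count_nonzero gives the same (100, 100) array of dry-day counts
assert np.array_equal(freq, np.count_nonzero(prec < Pd, axis=0))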
Divakar
    No loops - very elegant. – Ami Tavory Mar 26 '16 at 11:29
  • May I ask what the principle behind this kind of code is? I have met some similar code like `array[array > 0.5]`. The efficiency is so much greater than a loop. Why is it so fast? Doesn't it need a loop to traverse all the values? – Han Zhengzu Mar 26 '16 at 11:32
  • @HanZhengzu [This](http://stackoverflow.com/a/8385658/3293881) might shed some light. NumPy is built for performing the same operation on a huge number of elements in a vectorized fast way. I think the philosophy with it is [`SIMD`](https://en.wikipedia.org/wiki/SIMD). – Divakar Mar 26 '16 at 11:34
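
To see the gap described in these comments, here is a rough timing sketch (the setup is an assumption, not from the thread, and the numbers depend on hardware):

import timeit
import numpy as np

prec = np.random.rand(365, 100, 100)
Pd = 0.3

def loop_count():
    # Pure-Python triple loop, as in the question
    freq = np.zeros((100, 100), dtype=int)
    for i in range(prec.shape[0]):
        for j in range(prec.shape[1]):
            for k in range(prec.shape[2]):
                if prec[i, j, k] < Pd:
                    freq[j, k] += 1
    return freq

def vectorized_count():
    # One pass through compiled NumPy code
    return (prec < Pd).sum(0)

print(timeit.timeit(loop_count, number=1))        # triple loop
print(timeit.timeit(vectorized_count, number=1))  # vectorized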