
Can someone explain why I get such strange and different timings during array allocation? It turns out that it is about 25x faster to allocate a slightly bigger array and slice it than to allocate an array of the needed size:

%timeit arr = np.zeros((360, 360))
207 µs ± 4.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
arr = np.zeros((362, 362))
arr = arr[:360]
8.4 µs ± 651 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Is there something general behind this, or is it some Windows-related issue? While this situation is specific to sizes near (360, 360) on my computer, I do not know whether it can arise at other sizes.
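The two allocation strategies above can be reproduced outside IPython with the standard-library `timeit` module; this is a minimal sketch (function names are mine, and exact numbers will vary by machine and OS). Note that the original snippet slices only the first axis (`arr[:360]`), which keeps all 362 columns; here both axes are sliced to get a true (360, 360) view:

```python
import timeit
import numpy as np

def alloc_exact():
    # Allocate exactly the size needed.
    return np.zeros((360, 360))

def alloc_oversized():
    # Allocate slightly bigger, then slice down to a (360, 360) view
    # (no copy is made by the slicing).
    arr = np.zeros((362, 362))
    return arr[:360, :360]

t_exact = timeit.timeit(alloc_exact, number=1000)
t_oversized = timeit.timeit(alloc_oversized, number=1000)
print(f"exact:     {t_exact / 1000 * 1e6:.1f} µs per loop")
print(f"oversized: {t_oversized / 1000 * 1e6:.1f} µs per loop")
```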

EDIT: While this question is marked as a duplicate, the answer to that question does not fully explain the problem:

%timeit -n10 -r10 arr = np.zeros((361, 361))
243 µs ± 56.6 µs per loop (mean ± std. dev. of 10 runs, 10 loops each)

%timeit -n10 -r10 arr = np.zeros((362, 362))
6.82 µs ± 539 ns per loop (mean ± std. dev. of 10 runs, 10 loops each)

With zeros there is a 25-35x regression, but with empty the difference goes in the opposite direction, at 2-5x:

%timeit -n10 -r10 arr = np.empty((361, 361))
2.49 µs ± 1.02 µs per loop (mean ± std. dev. of 10 runs, 10 loops each)

%timeit -n10 -r10 arr = np.empty((362, 362))
11.9 µs ± 1.58 µs per loop (mean ± std. dev. of 10 runs, 10 loops each)
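One way to probe whether the jump is tied to a specific allocation size (for example, the C allocator switching strategies above a threshold) is to sweep sizes around 360 and time both `np.zeros` and `np.empty`. A minimal sketch, assuming only numpy and the standard library:

```python
import timeit
import numpy as np

# Time np.zeros and np.empty for sizes around the suspicious region.
# A sharp jump at one size would suggest an allocator threshold
# (e.g. the C library switching from the heap to mmap-backed memory).
for n in range(356, 368, 2):
    t_zeros = timeit.timeit(lambda n=n: np.zeros((n, n)), number=200)
    t_empty = timeit.timeit(lambda n=n: np.empty((n, n)), number=200)
    print(f"({n}, {n}): zeros {t_zeros / 200 * 1e6:8.2f} µs, "
          f"empty {t_empty / 200 * 1e6:8.2f} µs")
```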

Windows 7
Python 3.6.3
numpy 1.13.3

godaygo
  • I need to allocate a lot of arrays with the size `(360,360)` during my calculations, so this accumulated overhead beats me very hard. – godaygo Dec 21 '17 at 21:28
  • 1
    Stupid question, but are the numbers consistent? Can you reproduce this 25x every time? Or maybe the first needed some time to evict pages from RAM to allocate the required space? – Matt Clark Dec 21 '17 at 21:33
  • @sascha the results are the same. – godaygo Dec 21 '17 at 21:37
  • What happens if you do the `362` case first? – hpaulj Dec 21 '17 at 21:39
  • 1
@MattClark yes, they are consistent; outside this stripped-down version, this difference costs me about *25%* overhead. – godaygo Dec 21 '17 at 21:43
  • @hpaulj the results are the same, but if I run `%timeit -n1 -r1` in the loop the usual difference is about `10x`. – godaygo Dec 21 '17 at 21:45
  • 1
    The post marked as a duplicate seems to explain this behavior. Once you cross a threshold it no longer actually allocates the memory. – Matt Clark Dec 21 '17 at 21:48
  • When I try to reproduce your example in the interpreter I get a slightly longer time for your second case. Same on [TIO](https://tio.run/##jY1LCsMwDET3PoXJygIT8oEuWnyS0oUDJjXEtpDVRXp5VyU5QAQjIWl4gzu/S55biwkLsc6fhLv2VWdU54ljCpGVQoqZzbH1FDB4Np0n0k7M/TdQqcbMt8GKADr7Zy2B3DhIWb1uZfFbdec0AHAVOQlyAngcH@nPu2S8rkS09gM) – import random Dec 21 '17 at 21:59

0 Answers