
I used the element-shifting code from here. It works well. The piece of code I used is as follows:

# https://stackoverflow.com/a/62841583/4272651
import numba as nb
import numpy as np

@nb.njit
def shift(arr, num, fill_value=False):
    result = np.empty_like(arr)
    if num > 0:
        result[:num] = fill_value
        result[num:] = arr[:-num]
    elif num < 0:
        result[num:] = fill_value
        result[:num] = arr[-num:]
    else:
        result[:] = arr
    return result
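To make the behaviour concrete, here is a small sketch of what the function returns for a boolean array. It assumes the decorator is left off so that it runs with NumPy alone; the name `shift_plain` and the sample array are made up for illustration.

```python
import numpy as np


def shift_plain(arr, num, fill_value=False):
    # Same body as `shift` above, minus @nb.njit
    # (hypothetical name, used only for this illustration).
    result = np.empty_like(arr)
    if num > 0:
        # Shift right: pad the first `num` slots, copy the rest.
        result[:num] = fill_value
        result[num:] = arr[:-num]
    elif num < 0:
        # Shift left: pad the last `-num` slots, copy the rest.
        result[num:] = fill_value
        result[:num] = arr[-num:]
    else:
        result[:] = arr
    return result


bits = np.array([True, False, True, True, False])
right = shift_plain(bits, 2)   # [False, False, True, False, True]
left = shift_plain(bits, -2)   # [True, True, False, False, False]
print(right)
print(left)
```

A positive `num` moves elements towards higher indices and a negative one towards lower indices; the vacated slots get `fill_value`.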

So I use it to do some bit operations and get what I want. Everything works great.

Here is the plot of execution time vs. array size:

[Figure: execution time vs. array size]

The graphs are offset by a fixed amount so that the overlap is easier to see.

As you can see, shifting takes a lot of time at the start; the time then drops and increases steadily as the array length grows.

What is the reason behind that initial spike?

Prasanna
  • Well, generally this is a pattern you see with JIT-compiled code, if you are running it sequentially... How is this being profiled? – juanpa.arrivillaga Sep 14 '20 at 14:52
  • The compilation must happen at the first run, so there is a time penalty. But runs after the first run are fast. – jkr Sep 14 '20 at 14:52
  • @juanpa.arrivillaga why is that the case though? Where do I read about it? – Prasanna Sep 14 '20 at 14:52
  • Anyway, I don't think `numba` is really doing anything for you here... you aren't really doing any looping, you are already using vectorized operations. Try it without the `numba.jit`. – juanpa.arrivillaga Sep 14 '20 at 14:54
  • @juanpa.arrivillaga well, that is a good observation. But someone on that thread already benchmarked the code, and it looks like it works pretty well. I will try without numba and see if the output changes. – Prasanna Sep 14 '20 at 14:55
  • Numba accelerates your code by just-in-time compiling it. On the first use of your function, it is compiled and then run. On the next use, the compiled version is used directly, so the first call takes more time than the others. – Niklas Mertsch Sep 14 '20 at 14:58
  • @Prasanna because *it costs time to actually do the compiling/optimizations*. That isn't free. Think about what is actually happening in JIT compilation. [They even talk about this in the numba docs when discussing how to profile your code](https://numba.pydata.org/numba-doc/latest/user/5minguide.html#how-to-measure-the-performance-of-numba). Note, `numba` also has an [ahead-of-time (AOT) compilation mode](https://numba.pydata.org/numba-doc/dev/user/pycc.html#compiling-code-ahead-of-time), although don't expect that to be necessarily better. – juanpa.arrivillaga Sep 14 '20 at 14:59
  • @juanpa.arrivillaga turns out you were right. It was the numba thing. I removed it and now it works well. – Prasanna Sep 14 '20 at 19:33
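
The first-call cost described in the comments can be checked by timing consecutive calls one by one instead of averaging them. The following is only a sketch: the helper `time_calls`, the array size, and the repeat count are assumptions made for illustration. If Numba is not installed, it falls back to the undecorated function, in which case no first-call spike is expected.

```python
import time

import numpy as np

try:
    import numba as nb
    jit = nb.njit
except ImportError:
    # Assumption: without Numba, run the plain function so the sketch still works.
    def jit(fn):
        return fn


@jit
def shift(arr, num, fill_value=False):
    result = np.empty_like(arr)
    if num > 0:
        result[:num] = fill_value
        result[num:] = arr[:-num]
    elif num < 0:
        result[num:] = fill_value
        result[:num] = arr[-num:]
    else:
        result[:] = arr
    return result


def time_calls(fn, arr, num, repeats=5):
    """Return the wall-clock duration of each of `repeats` consecutive calls."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arr, num)
        durations.append(time.perf_counter() - start)
    return durations


arr = np.zeros(1_000, dtype=np.bool_)
durations = time_calls(shift, arr, 3)

# With Numba, the first entry includes compilation; the rest reuse the compiled code.
print("first call :", durations[0])
print("later calls:", min(durations[1:]))
```

When benchmarking across array sizes, calling the function once on a small array before starting the timer keeps the compilation cost out of the measurements.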
