timeit.repeat much faster than tic-toc even for one operation?

Question

I have a simple function fun and I'd like to measure how long it takes and also get the value back. I thought Python's "tic-toc" is the way to go, but I noticed that it's significantly slower than timeit (which doesn't return the value):

import numpy as np
import time
import timeit

np.random.seed(0)
x = np.random.rand(10000)

def fun():
    return np.sqrt(x)


t0_ns = time.time_ns()
fun()
t1_ns = time.time_ns()
print(t1_ns - t0_ns)


out = timeit.repeat(
    stmt=fun,
    repeat=1,
    number=1,
    timer=time.perf_counter_ns,
)
print(out[0])

58002
27037

In fact, timeit.repeat reports less than half the time that the manual measurement reports for the same single function call! What's the reason for this? Is there a way around it?

Doesn't timeit use a repeat-loop and take the average, so it can avoid measurement overhead? That's going to be most of the cost of your timed region for something simple like a square root. — Peter Cordes, Mar 27 '21 at 16:27
@PeterCordes - I agree. OP - if you run this code multiple times, you will get a range of times, and sometimes the "tic-toc" method will show a faster runtime than timeit. — jkr, Mar 27 '21 at 16:28
@jakub: That's not what I meant. I mean that measuring the time for one run is going to include more overhead than measuring the time for 100 runs and dividing by 100, especially if you use some efficient way to get the function called repeatedly. (And/or if timeit tries to measure the measurement overhead, e.g. with an empty timed region). Oh, but I guess `repeat=1` and `number=1` tell it not to actually repeat? Still, it might be trying to account for / subtract measurement overhead. — Peter Cordes, Mar 27 '21 at 16:31
Another factor is possible warm-up effects like CPU frequency not jumping to max turbo right away. ([Idiomatic way of performance evaluation?](https://stackoverflow.com/q/60291987)). (Try doing the timeit test first, and see if the manually-timed call is any faster when timeit has already called fun once.) — Peter Cordes, Mar 27 '21 at 16:34
@PeterCordes That's it. When appending another "tic-toc", it's much faster again. If you promote your reply to an answer, I'll be happy to accept it. — Nico Schlömer, Mar 27 '21 at 16:37
Does this answer your question? [Idiomatic way of performance evaluation?](https://stackoverflow.com/questions/60291987/idiomatic-way-of-performance-evaluation) — Peter Cordes, Mar 27 '21 at 16:47
That warm-up point is one of the key points in my answer on [Idiomatic way of performance evaluation?](https://stackoverflow.com/questions/60291987/idiomatic-way-of-performance-evaluation), including the litmus test of trying the other order, so I think this is a duplicate. — Peter Cordes, Mar 27 '21 at 16:48
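
The point in the comments about timing many runs and dividing by the count can be sketched as follows. This is only an illustration under some assumptions: fun here is a pure-Python stand-in for the np.sqrt(x) function from the question (so the snippet runs without NumPy), and the values number=100 and repeat=5 are arbitrary choices, not recommendations from the thread.

```python
import math
import time
import timeit

# Stand-in data and function (assumption: replaces np.sqrt(x) from the question).
data = [i / 10000 for i in range(10000)]


def fun():
    return [math.sqrt(v) for v in data]


number = 100  # calls per timed batch (arbitrary choice)

# Each entry is the total time in ns for one batch of `number` calls.
totals_ns = timeit.repeat(
    stmt=fun,
    repeat=5,
    number=number,
    timer=time.perf_counter_ns,
)

# Divide by the batch size to get an average per-call time for each batch.
per_call_ns = [total / number for total in totals_ns]
print(min(per_call_ns))
```

With number greater than 1, the cost of reading the clock is spread over many calls, and the first batch absorbs most of any warm-up cost, so taking the minimum over the batches tends to give a more stable figure than a single timed call.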

score 0 · Accepted Answer · answered Apr 25 '21 at 09:23

Promoting Peter Cordes' reply to an answer here.

When reversing the order,

import numpy as np
import time
import timeit

np.random.seed(0)
x = np.random.rand(10000)


def fun():
    return np.sqrt(x)


out = timeit.repeat(
    stmt=fun,
    repeat=1,
    number=1,
    timer=time.perf_counter_ns,
)
print(out[0])

t0_ns = time.time_ns()
fun()
t1_ns = time.time_ns()
print(t1_ns - t0_ns)

36709
12623

the second measurement is again the faster one. This makes clear that the difference comes not from the timing method but from warm-up effects, such as the CPU not yet running at full clock speed during the first call.
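
Since the question also asks for the return value, one possible way to combine a warm-up with manual timing is sketched below. The helper timed_call and its warmup and runs parameters are hypothetical names made up for this illustration, and fun is again a pure-Python stand-in for the np.sqrt(x) function above, so treat it as a sketch rather than a fixed recipe.

```python
import math
import time

# Stand-in data and function (assumption: replaces np.sqrt(x) from the question).
data = [i / 10000 for i in range(10000)]


def fun():
    return [math.sqrt(v) for v in data]


def timed_call(func, warmup=3, runs=5):
    """Hypothetical helper: return (result, best elapsed time in ns)."""
    # Untimed warm-up calls, so the timed runs do not pay the first-call cost.
    for _ in range(warmup):
        func()

    best_ns = None
    result = None
    for _ in range(runs):
        t0 = time.perf_counter_ns()
        result = func()
        t1 = time.perf_counter_ns()
        elapsed = t1 - t0
        # Keep the fastest run as the least disturbed measurement.
        if best_ns is None or elapsed < best_ns:
            best_ns = elapsed
    return result, best_ns


result, best_ns = timed_call(fun)
print(best_ns)
```

This uses time.perf_counter_ns rather than time.time_ns for the manual measurement, which matches the timer passed to timeit.repeat above and is the clock the standard library documents for measuring short durations.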
