Very similar to this question, I was wondering what is both more idiomatic and faster when using Cython: a `try`/`except KeyError` block, or alternative solutions based on `pop` and/or `in`?
Test data:
    from random import shuffle, randint

    datal = list(map(float, range(1_000)))
    shuffle(datal)
    data = {
        a * 10 ** (randint(0, 2) * 3): b
        for a, b in zip(datal[:-1], datal[1:])
    }
    data[datal[-1] * 10 ** (randint(0, 2) * 3)] = datal[0]
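As far as I can tell, the construction above encodes a single cycle: each value in `datal` maps to its successor, and it is stored under a key that is the value itself scaled by 1, 1,000, or 1,000,000. The sketch below checks that reading in plain Python. The helper names `build_data` and `scales_present` are hypothetical, and the seeded `Random` instance is an assumption added only to make the run repeatable.

```python
from random import Random


def build_data(seed=0):
    """Rebuild the test data with a seeded RNG (the seed is an assumption)."""
    rng = Random(seed)
    datal = list(map(float, range(1_000)))
    rng.shuffle(datal)
    # Pair each value with its successor; the last value wraps around to the first,
    # so that pair is also the last one inserted into the dict.
    successors = datal[1:] + datal[:1]
    data = {
        a * 10 ** (rng.randint(0, 2) * 3): b
        for a, b in zip(datal, successors)
    }
    return datal, data


def scales_present(value, data):
    """Return the scale factors under which `value` is stored as a key."""
    return [s for s in (1, 1_000, 1_000_000) if value * s in data]


datal, data = build_data()

# Every non-zero value is found under exactly one of the three scales.
# 0.0 is the one exception: scaling leaves it unchanged, so all three lookups hit.
ambiguous = [v for v in datal if v != 0.0 and len(scales_present(v, data)) != 1]
```

Because `popitem()` removes the most recently inserted pair, each test function starts at `datal[0]` and should walk the remaining 999 entries in order.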
Three variations in Cython (language level 3):
- Mostly dynamic typing, `pop` and `None` as a default return value:
    def test_none(dict raw):
        _, last_value = raw.popitem()
        cdef list out = [last_value]
        while len(raw) > 0:
            next_value = raw.pop(last_value, None)
            if next_value is not None:
                out.append(next_value)
                last_value = next_value
                continue
            next_value = raw.pop(last_value * 1_000, None)
            if next_value is not None:
                last_value = next_value
                out.append(next_value)
                continue
            next_value = raw.pop(last_value * 1_000_000)
            out.append(next_value)
            last_value = next_value
        return out
- With `in` and `pop`:
    def test_in(dict raw):
        cdef double last_value, next_value, _
        _, last_value = raw.popitem()
        cdef list out = [last_value]
        while len(raw) > 0:
            if last_value in raw:
                next_value = raw.pop(last_value)
            elif last_value * 1_000 in raw:
                next_value = raw.pop(last_value * 1_000)
            else:
                next_value = raw.pop(last_value * 1_000_000)
            out.append(next_value)
            last_value = next_value
        return out
- With `pop` and `try`/`except KeyError`:
    def test_te(dict raw):
        cdef double last_value, next_value, _
        _, last_value = raw.popitem()
        cdef list out = [last_value]
        while len(raw) > 0:
            try:
                next_value = raw.pop(last_value)
            except KeyError:
                try:
                    next_value = raw.pop(last_value * 1_000)
                except KeyError:
                    next_value = raw.pop(last_value * 1_000_000)
            out.append(next_value)
            last_value = next_value
        return out
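One caveat on the first variant: using `None` as the `pop` default only works because `None` is never a stored value in this data. Where that cannot be guaranteed, a private sentinel object is a common alternative. The following is a plain-Python sketch of that idea, not part of the benchmark; `_MISSING` and `pop_first` are hypothetical names.

```python
_MISSING = object()  # hypothetical sentinel; compared by identity only


def pop_first(raw, keys):
    """Pop and return the value for the first key in `keys` that is in `raw`."""
    for key in keys:
        value = raw.pop(key, _MISSING)
        if value is not _MISSING:
            return value
    # None of the candidate keys was present.
    raise KeyError(keys)


d = {1.0: None, 2000.0: 3.0}
first = pop_first(d, (1.0, 1_000.0))    # the stored value is None, and it is still found
second = pop_first(d, (2.0, 2_000.0))   # falls through to the scaled key
```

With a `None` default, the first lookup here would be indistinguishable from a miss.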
With `%timeit`, I get the following results:
    %timeit test_te(data.copy())
    471 µs ± 1.57 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    %timeit test_in(data.copy())
    219 µs ± 9.04 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

    %timeit test_none(data.copy())
    175 µs ± 401 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
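For anyone who wants to reproduce the comparison outside IPython, or to get a pure-Python baseline, something along these lines may help. It is only a sketch: the `py_*` functions are hypothetical plain-Python ports of the three variants (not the Cython code), the seed is an assumption for repeatability, and the absolute timings will differ from the ones above. It also checks that all three variants return the same list, which the timings alone do not show.

```python
import timeit
from random import Random

rng = Random(0)  # seeded only for repeatability; not part of the original setup
datal = list(map(float, range(1_000)))
rng.shuffle(datal)
data = {
    a * 10 ** (rng.randint(0, 2) * 3): b
    for a, b in zip(datal, datal[1:] + datal[:1])
}


def py_none(raw):
    """Port of test_none: pop with None as the default."""
    _, last_value = raw.popitem()
    out = [last_value]
    while raw:
        next_value = raw.pop(last_value, None)
        if next_value is None:
            next_value = raw.pop(last_value * 1_000, None)
        if next_value is None:
            next_value = raw.pop(last_value * 1_000_000)
        out.append(next_value)
        last_value = next_value
    return out


def py_in(raw):
    """Port of test_in: membership test first, then pop."""
    _, last_value = raw.popitem()
    out = [last_value]
    while raw:
        if last_value in raw:
            next_value = raw.pop(last_value)
        elif last_value * 1_000 in raw:
            next_value = raw.pop(last_value * 1_000)
        else:
            next_value = raw.pop(last_value * 1_000_000)
        out.append(next_value)
        last_value = next_value
    return out


def py_te(raw):
    """Port of test_te: pop inside nested try/except KeyError."""
    _, last_value = raw.popitem()
    out = [last_value]
    while raw:
        try:
            next_value = raw.pop(last_value)
        except KeyError:
            try:
                next_value = raw.pop(last_value * 1_000)
            except KeyError:
                next_value = raw.pop(last_value * 1_000_000)
        out.append(next_value)
        last_value = next_value
    return out


variants = {"none": py_none, "in": py_in, "te": py_te}
# Each variant consumes its input, so every call gets a fresh copy.
results = {name: fn(data.copy()) for name, fn in variants.items()}

if __name__ == "__main__":
    runs = 100
    for name, fn in variants.items():
        best = min(timeit.repeat(lambda: fn(data.copy()), number=runs, repeat=5))
        print(f"{name}: {best / runs * 1e6:.1f} µs per call (includes the copy)")
```

As in the `%timeit` calls above, the cost of `data.copy()` is included in every measurement, so it inflates all three numbers by the same amount.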