I just noticed this question via your Meta SO question, "Is it appropriate to share the results of my research toward solving my own minor questions?". As you mention in that question (and as one of the tags on this question indicates), this sort of thing falls into the category of micro-optimization. Ideally, one shouldn't need to worry about such minor performance differences, and as Knuth says, premature optimization is the root of all evil. But I guess it is fun and instructive to investigate such matters, as it can give you a better feel for how Python works "under the hood".
Mephy's comment prompted me to see what the timing differences were for if-less versions of your functions. The results are interesting, IMHO. I also took the opportunity to streamline your testing procedure.
#!/usr/bin/env python

''' Do timeit tests on various implementations of NAND
NAND returns True if either argument is falsey, else False.
From https://stackoverflow.com/q/29551438/4014959
Written by PM 2Ring 2015.04.09
'''
from timeit import Timer
import dis


def and_chk(a, b):
    return not (a and b)

def and_chk_if(a, b):
    if not (a and b):
        return True
    else:
        return False

def not_or_chk(a, b):
    return not a or not b

def not_or_chk_if(a, b):
    if not a or not b:
        return True
    else:
        return False

#All the functions
funcs = (
    and_chk,
    and_chk_if,
    not_or_chk,
    not_or_chk_if,
)

#Argument tuples to test the functions with
bools = (0, 1)
bool_tups = [(u, v) for u in bools for v in bools]

def show_dis():
    ''' Show the disassembly for each function '''
    print 'Disassembly'
    for func in funcs:
        fname = func.func_name
        print '\n%s' % fname
        dis.dis(func)
    print

def verify():
    ''' Verify that the functions actually perform as intended '''
    print 'Verifying...'
    for func in funcs:
        fname = func.func_name
        print '\n%s' % fname
        for args in bool_tups:
            print args, func(*args)
    print

def time_test(loops, reps):
    ''' Print timing stats for all the functions '''
    print 'Timing tests\nLoops = %d, Repetitions = %d' % (loops, reps)
    for func in funcs:
        fname = func.func_name
        print '\n%s' % fname
        setup = 'from __main__ import %s' % fname
        for args in bool_tups:
            t = Timer('%s%s' % (fname, args), setup)
            r = t.repeat(reps, loops)
            r.sort()
            print args, r

show_dis()
verify()
time_test(loops=520000, reps=3)
output
Disassembly

and_chk
 13           0 LOAD_FAST                0 (a)
              3 JUMP_IF_FALSE            4 (to 10)
              6 POP_TOP
              7 LOAD_FAST                1 (b)
        >>   10 UNARY_NOT
             11 RETURN_VALUE

and_chk_if
 16           0 LOAD_FAST                0 (a)
              3 JUMP_IF_FALSE            4 (to 10)
              6 POP_TOP
              7 LOAD_FAST                1 (b)
        >>   10 JUMP_IF_TRUE             5 (to 18)
             13 POP_TOP

 17          14 LOAD_GLOBAL              0 (True)
             17 RETURN_VALUE
        >>   18 POP_TOP

 19          19 LOAD_GLOBAL              1 (False)
             22 RETURN_VALUE
             23 LOAD_CONST               0 (None)
             26 RETURN_VALUE

not_or_chk
 22           0 LOAD_FAST                0 (a)
              3 UNARY_NOT
              4 JUMP_IF_TRUE             5 (to 12)
              7 POP_TOP
              8 LOAD_FAST                1 (b)
             11 UNARY_NOT
        >>   12 RETURN_VALUE

not_or_chk_if
 25           0 LOAD_FAST                0 (a)
              3 UNARY_NOT
              4 JUMP_IF_TRUE             8 (to 15)
              7 POP_TOP
              8 LOAD_FAST                1 (b)
             11 UNARY_NOT
             12 JUMP_IF_FALSE            5 (to 20)
        >>   15 POP_TOP

 26          16 LOAD_GLOBAL              0 (True)
             19 RETURN_VALUE
        >>   20 POP_TOP

 28          21 LOAD_GLOBAL              1 (False)
             24 RETURN_VALUE
             25 LOAD_CONST               0 (None)
             28 RETURN_VALUE
Verifying...

and_chk
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False

and_chk_if
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False

not_or_chk
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False

not_or_chk_if
(0, 0) True
(0, 1) True
(1, 0) True
(1, 1) False

Timing tests
Loops = 520000, Repetitions = 3

and_chk
(0, 0) [0.36773014068603516, 0.37793493270874023, 0.38387489318847656]
(0, 1) [0.36292791366577148, 0.37119913101196289, 0.37146902084350586]
(1, 0) [0.38673520088195801, 0.39340090751647949, 0.39670205116271973]
(1, 1) [0.38938498497009277, 0.39505791664123535, 0.40034103393554688]

and_chk_if
(0, 0) [0.4021449089050293, 0.40345501899719238, 0.41558098793029785]
(0, 1) [0.40686416625976562, 0.41213202476501465, 0.44800615310668945]
(1, 0) [0.4281308650970459, 0.42916202545166016, 0.43681907653808594]
(1, 1) [0.46246123313903809, 0.46759700775146484, 0.47544980049133301]

not_or_chk
(0, 0) [0.35435700416564941, 0.36368083953857422, 0.36867713928222656]
(0, 1) [0.35602092742919922, 0.35732793807983398, 0.36237406730651855]
(1, 0) [0.39550518989562988, 0.40660715103149414, 0.40977287292480469]
(1, 1) [0.4060060977935791, 0.4112389087677002, 0.41296815872192383]

not_or_chk_if
(0, 0) [0.4308779239654541, 0.44109201431274414, 0.45827698707580566]
(0, 1) [0.43600606918334961, 0.4370419979095459, 0.47623395919799805]
(1, 0) [0.48452401161193848, 0.48769593238830566, 0.49147915840148926]
(1, 1) [0.53107500076293945, 0.54048299789428711, 0.55434417724609375]
These timings were performed using Python 2.6.6 on a 2 GHz Pentium 4 (single-core, 32-bit) running Mepis 11 (a Debian-family Linux distro).
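The disassembly helps explain the ranking: the if-based functions execute extra jump and POP_TOP steps, and they look up the global names True and False on every call (the LOAD_GLOBAL instructions in the listings above), while the expression-based versions fall straight through to RETURN_VALUE. If you just want a rough comparison without reading the full listings, something like this sketch works; it uses raw bytecode length as a crude proxy for the amount of work per call, and assumes the funcs tuple from the script above (Python 2 attribute names; in Python 3 it'd be func.__code__.co_code):

# Crude comparison: fewer bytecode bytes usually means fewer
# instructions for the interpreter to dispatch on each call.
for func in funcs:
    print func.func_name, len(func.func_code.co_code)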
You'll note that I've avoided using your next(tups) strategy to get the arguments for each function call; instead, I'm passing the args directly, as constants. The time taken to perform next(tups) should be fairly constant, but it's best to eliminate such overheads, when practical, so that the reported time measurements more accurately reflect the performance of the code we're really interested in.
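Here's a minimal sketch of the two approaches side by side. The first Timer is the strategy used in my script; the cycle-based setup is my reconstruction of what a next(tups) strategy looks like (an assumption, since I'm guessing at your iterator). It assumes and_chk is defined in __main__, as in the script above:

from timeit import Timer

# Strategy used above: the arguments are baked into the timed statement
# as constants, so we time only the call itself.
t_const = Timer('and_chk(0, 1)', 'from __main__ import and_chk')

# Reconstruction of the next(tups) strategy: the args are fetched from
# an iterator inside the timed statement. The next() call and the
# *-unpacking add a roughly constant overhead to every loop.
iter_setup = '''
from __main__ import and_chk
from itertools import cycle
tups = cycle([(0, 0), (0, 1), (1, 0), (1, 1)])
'''
t_iter = Timer('and_chk(*next(tups))', iter_setup)

print 'constant args:', min(t_const.repeat(3, 520000))
print 'iterator args:', min(t_iter.repeat(3, 520000))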
Also, it's usual to perform several repetitions of the timing loop and take the minimum value; FWIW, I generally do 3 to 5 reps (there's a short sketch of this after the quote below). From the timeit docs:
Note

It’s tempting to calculate mean and standard deviation from the result vector and report these. However, this is not very useful. In a typical case, the lowest value gives a lower bound for how fast your machine can run the given code snippet; higher values in the result vector are typically not caused by variability in Python’s speed, but by other processes interfering with your timing accuracy. So the min() of the result is probably the only number you should be interested in. After that, you should look at the entire vector and apply common sense rather than statistics.
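In code, that advice boils down to something like this minimal sketch (the loop and rep counts are just the values used in my tests above, and it assumes not_or_chk is defined in __main__):

from timeit import Timer

t = Timer('not_or_chk(0, 1)', 'from __main__ import not_or_chk')
# repeat() returns a list with one total time per repetition;
# per the docs above, the min() is the only number worth reporting.
best = min(t.repeat(repeat=3, number=520000))
print 'best of 3 reps: %.3f seconds for 520000 loops' % best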
Your Meta post says that you want to perform and report on other micro-optimization experiments, so you might be interested in taking a look at some code I posted a few months ago that does timing tests on various implementations of the factorial function.