In comparing math.sqrt(…) to the built-in binary operator, … ** 0.5, in a REPL using timeit.timeit(…), it looks as though the binary operator has a slight edge over the math module function:
… although I had to crank the number of iterations up to 1,000,000 to get visible differences (note that I did compensate for the module-function lookup overhead by directly importing sqrt from math in the relevant test).
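For anyone wanting to reproduce this kind of comparison, here is a minimal sketch of how it might be set up. The helper name compare_sqrt and the operand value are placeholders of mine, not the exact test I ran; only the iteration count and the direct import of sqrt follow the description above.

```python
import timeit


def compare_sqrt(value=1234.5678, number=1_000_000):
    """Return (seconds for sqrt(x), seconds for x ** 0.5) over `number` runs.

    Hypothetical helper for illustration; the operand value is arbitrary.
    """
    # sqrt is imported directly in the setup, so no attribute lookup on the
    # math module is included in the timed statement.
    # The operand is held in a variable: an expression made only of literals
    # (e.g. 2.0 ** 0.5) may be constant-folded by the compiler, which would
    # leave nothing to time.
    setup = f"from math import sqrt; x = {value!r}"
    t_func = timeit.timeit("sqrt(x)", setup=setup, number=number)
    t_pow = timeit.timeit("x ** 0.5", setup=setup, number=number)
    return t_func, t_pow


t_func, t_pow = compare_sqrt()
print(f"sqrt(x):  {t_func:.4f} s")
print(f"x ** 0.5: {t_pow:.4f} s")
```

The absolute numbers will vary by machine and Python version, so repeating the run a few times (or using timeit.repeat) is probably wise before reading much into a small gap.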
In this case, I haven’t yet delved into checking against more evidently off-the-shelf optimized codepaths – e.g. calling libm functions via an extension or a Cython module – as I am trying to stick to my New Year’s resolution not to prematurely optimize more than once a week; unlike other related-question-askers, I’m after integer-result correctness before performance.
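To make the integer-correctness concern concrete, here is a small sketch that checks each route against the definition of a floor square root. It assumes Python 3.8 or later for math.isqrt, which is a method I did not name above; the helper is_floor_sqrt and the test value are mine, chosen only for illustration.

```python
import math


def is_floor_sqrt(n, r):
    """True if r is the exact integer floor of sqrt(n) (hypothetical helper)."""
    return r * r <= n < (r + 1) * (r + 1)


# A large value just below a perfect square; both float-based routes have to
# round it to a double before (or while) taking the root.
n = 10**34 - 1

print(is_floor_sqrt(n, math.isqrt(n)))      # True: pure integer arithmetic
print(is_floor_sqrt(n, int(math.sqrt(n))))  # False: the exact root is not representable as a double
print(is_floor_sqrt(n, int(n ** 0.5)))      # False: same float limitation

# For small inputs (16-bit values here), the float route agrees with isqrt.
assert all(int(math.sqrt(k)) == math.isqrt(k) for k in range(65536))
```

The float routes only go wrong once the operand or the root exceeds what a double can hold exactly, so whether this matters depends on the magnitude of the values involved.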
This is specifically an operation in the hot loop of a currently synchronous image-processing function that uses PIL/Pillow to render an image as a monotone dot-screen halftone‡, which right now is about two orders of magnitude too slow; it can’t be readily Cython-ized or numpy-ized because it currently uses Pillow’s vector-drawing API (but FYI, I am not afraid to go the distance with those elements of the problem; for example, I already replaced calls into Pillow’s ImageStat.Stat class with faster, lower-overhead bespoke functions, which was a demonstrable and not at all premature act of optimization).
My question is: what are the important advantages and disadvantages of each of these methods (and/or any additional square-root methodologies I didn’t name) – both in general and specifically in the Python math environment? And, more importantly, are these issues eclipsed by other concerns, such as the aforementioned overhead from attribute lookup, function dispatch, and the like?
I have often found that there are straightforward performance answers to each of these elements taken individually, but when Python code needs to meet performance standards, all of these sub-concerns ride the line through a gray area of misunderstanding – I therefore especially hope to hear from anyone who has worked on a problem such as this and understands what to take seriously.
‡ The lineage of the halftone code itself stems from this excellent and time-tested SO answer.