3

Code where most of the execution time is spent performing two independent function evaluations is an obvious candidate to use two CPUs. I know how to do it in Python with multiprocessing, but only with the idiom if __name__ == '__main__': added to the entry point of the program. Is there a simpler way in modern Python (3.8.3 at time of writing)? Nothing there seems suitably simple.

Requirements: no change to the calling code, and no separate file. It's fine to import some helper, but I would prefer not to pip.

Example application and benchmark, in the field of cryptography:

def rsacrt(p,q,dp,dq,qi,x):
    # most of the time is spent in the following two lines
    u = pow(x,dp,p)
    v = pow(x,dq,q)
    return (u-v)*qi%p*q+v

# test and benchmark the above
import time
e,p,q = 3, 5**3528+12436, 7**2918+27562
n,dp,dq,qi = p*q, pow(e,-1,p-1), pow(e,-1,q-1), pow(q,-1,p)
x = 42
t = time.time()
y = rsacrt(p,q,dp,dq,qi,x)
t = time.time()-t
if pow(y,e,n)!=x: print("# wrongo, spasmoid!")
print("duration of rsacrt:",(int)(t*1000.),"ms")

The operation shown is the one bottleneck in RSA signature generation, and RSA decryption. Parameters are deliberately high (16384-bit RSA, rather than the usual 2048-bit), so the execution time is in the order of seconds, with >98% in the two first pow. This is meant to illustrate a real-life case where parallel execution matters, not as an example on how to do RSA: there are fast alternatives to pow, and this code lacks side-channel protection.

Note: This code requires a version of Python where pow can compute the modular inverse. That includes Python 3.8.x. Try it online!.

Addition: The code that works under Python 3 is sizably larger, see this other Try it online!.

fgrieu
  • 2,724
  • 1
  • 23
  • 53
  • What did you try? The `Pool` object with its `map` method looks exactly suited to your needs. – ojdo Jun 10 '20 at 07:19
  • @ojdo: I tried `multiprocessing`, and hate how it requires changes all over the code. I did not try Pool/map, but looking at this [answer](https://stackoverflow.com/a/56138825/903600) it is not good at CPU-bound tasks, and that's also the impression I get from [the docs](https://docs.python.org/3/library/concurrent.futures.html). I've heard of a "global interpreter lock". I'm out of my comfort zone, that's why I ask. – fgrieu Jun 10 '20 at 07:40
  • 1
    I tried your code and got: `ValueError: pow() 2nd argument cannot be negative when 3rd argument specified` in line `n,dp,dq...`. Can you fix your example values? Also, I fear only two executions of `pow` will make it hard to pay back the overhead of spawning additional threads/processes/... . Can you quantify "most"? On my machine, running any parallel processing adds (one-time) overhead in the ms range. So unless you show the loop that calls rsacrt *often*, there is nothing to be gained at this level. – ojdo Jun 10 '20 at 08:46
  • @odjo: The reason of the error you experience is now explained in note, with link to an environement where it does not occur, and another link to a version that works under earlier Python 3. – fgrieu Jun 10 '20 at 14:06
  • What does this have to do with avoiding ``if __name__ == '__main__':``? It is only needed at the top-level of a module. – MisterMiyagi Jun 10 '20 at 14:16
  • @MisterMiyagi: my code is intended for publication in a peer-reviewed journal. I'd like to show Python at its best. Having to add `if __name__ == '__main__'` (or similar with `current_process().name` ) is not. If I do not get something cleaner I'll drop either Python, or parallel evaluation. – fgrieu Jun 10 '20 at 14:33
  • You seem to misunderstand. There is no need for ``if __name__ == '__main__':`` unless the parallelisation happens at module scope. – MisterMiyagi Jun 10 '20 at 14:35

1 Answers1

2

When using multiprocessing, the if __name__ == '__main__' protection is only needed for the top-level scope of modules. multiprocessing and derived modules can be used directly in functions.

Use a multiprocessing.Pool or a concurrent.futures.ProcessPoolExecutor for a concise API:

def rsacrt(p,q,dp,dq,qi,x):
    with concurrent.futures.ProcessPoolExecutor() as executor:
        u, v = executor.map(pow, *zip((x, dp, p), (x, dq, q)))
    return (u-v)*qi%p*q+v

This roughly speeds up rsacrt by a factor of 2.

MisterMiyagi
  • 44,374
  • 10
  • 104
  • 119
  • This is exactly what I want. Sadly I get "concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.", under Windows, Python 3.8.3 run from Idle (there are more errors when run from the command line, but the outcome is the same). See this [pastebin](https://pastebin.com/wDzvCKHc). I guess the problem is that the main is not guarded by that pesky idiom... – fgrieu Jun 10 '20 at 14:51
  • Please do not use IDLE. It takes certain shortcuts (automatic imports, pipe redirections) that will break code in non-obvious ways. – MisterMiyagi Jun 10 '20 at 15:40
  • @fgrieu The theoretically achievable runtime via parallelisation is 50% of the original. 1039ms from 1726ms is already roughly 60%. You will not realistically get more, since multiprocessing itself requires time at the scale of dozen to hundred ms. – MisterMiyagi Jun 10 '20 at 15:45
  • @fgrieu What do you mean by "still need that pesky ``if __name__ == '__main__':``"? There is no such check required anywhere here. Do you *define* the function in the ``__main__`` module? If so, you may want to have a look at [How to deal with python3 multiprocessing in ``__main__.py``](https://stackoverflow.com/questions/62206156/how-to-deal-with-python3-multiprocessing-in-main-py). If you want to know how to multiprocess functions that *must* be defined in ``__main__``, that is something completely different to what the question currently asks. – MisterMiyagi Jun 10 '20 at 17:37
  • I mean that when I put the question's code in a file, it can be run independently. But if I plug the question's substitute (with the necessary `import concurrent.futures`), it doesn't, until I guard the test code into a `if __name__ == '__main__':`block or similar. – fgrieu Jun 11 '20 at 05:07
  • @fgrieu I'm very much aware of what you mean. See the linked question for details on that. If that is your goal, please edit your question to clearly state the code must be all in one file. I'll happily remove my answer as not applicable in that case. – MisterMiyagi Jun 11 '20 at 06:51
  • I have added "no separate file" in the question (some time ago, as soon as I understood that was important, thanks to your answer). **Please leave your answer, it was useful to me** both for that, and as a way to use `concurrent.futures` in my case. – fgrieu Jun 11 '20 at 06:55