The Python pint module implements physical quantities. I would like to use it together with multiprocessing, but I don't know how to handle creating a UnitRegistry in the new process. If I do the intuitive thing:

from multiprocessing import Process
from pint import UnitRegistry, set_application_registry

ureg = UnitRegistry()
set_application_registry(ureg)
Q = ureg.Quantity


def f(one, two):
    print(one / two)

if __name__ == '__main__':
    p = Process(target=f, args=(Q(50, 'ms'), Q(50, 'ns')))
    p.start()
    p.join()

Then I get the following exception:

Traceback (most recent call last):
File "C:\WinPython-64bit-3.4.4.2Qt5\python-3.4.4.amd64\lib\multiprocessing\process.py", line 254, in _bootstrap
    self.run()
File "C:\WinPython-64bit-3.4.4.2Qt5\python-3.4.4.amd64\lib\multiprocessing\process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
File "C:\Users\pmaunz\PyCharmProjects\IonControl34\tests\pintmultiprocessing.py", line 12, in f
    print(one / two)
File "C:\WinPython-64bit-3.4.4.2Qt5\python-3.4.4.amd64\lib\site-packages\pint\quantity.py", line 738, in __truediv__
    return self._mul_div(other, operator.truediv)
File "C:\WinPython-64bit-3.4.4.2Qt5\python-3.4.4.amd64\lib\site-packages\pint\quantity.py", line 675, in _mul_div
    offset_units_self = self._get_non_multiplicative_units()
File "C:\WinPython-64bit-3.4.4.2Qt5\python-3.4.4.amd64\lib\site-packages\pint\quantity.py", line 1312, in _get_non_multiplicative_units
    offset_units = [unit for unit in self._units.keys()
File "C:\WinPython-64bit-3.4.4.2Qt5\python-3.4.4.amd64\lib\site-packages\pint\quantity.py", line 1313, in <listcomp>
    if not self._REGISTRY._units[unit].is_multiplicative]
KeyError: 'millisecond'

I assume this originates from the UnitRegistry not being initialized in the child process before the arguments are unpickled. (Initializing the UnitRegistry inside the function f does not help, because the arguments have already been unpickled by then.)

How would I go about sending a pint Quantity to a child process?

Edit after Tim Peters's answer:

The problem is not tied to multiprocessing. Simply pickling quantities

from pint import UnitRegistry, set_application_registry
import pickle
ureg = UnitRegistry()
set_application_registry(ureg)
Q = ureg.Quantity
with open("pint.pkl", 'wb') as f:
    pickle.dump(Q(50, 'ms'), f)
    pickle.dump(Q(50, 'ns'), f)

and then unpickling in a new script leads to the same problem:

from pint import UnitRegistry, set_application_registry 
import pickle
ureg = UnitRegistry()
set_application_registry(ureg)
Q = ureg.Quantity
with open("pint.pkl", 'rb') as f:
    t1 = pickle.load(f)
    t2 = pickle.load(f)

print(t1 / t2)

As Tim points out, it is sufficient to add a line Q(50, 'ns'); Q(50, 'ms') before unpickling. Digging into the pint source code shows why: when a quantity with unit ms is created, that unit is added to an internal registry. Pickling stores the units as a UnitsContainer instance, and when a Quantity is created by unpickling, the unit is not added to the registry.

A simple fix (in the pint source code) is to change Quantity.__reduce__ so that it returns the units as a string:

diff --git a/pint/quantity.py b/pint/quantity.py
index 3f30a25..695866a 100644
--- a/pint/quantity.py
+++ b/pint/quantity.py
@@ -57,7 +57,7 @@ class _Quantity(SharedRegistryObject):

     def __reduce__(self):
         from . import _build_quantity
-        return _build_quantity, (self.magnitude, self._units)
+        return _build_quantity, (self.magnitude, str(self._units))

     def __new__(cls, value, units=None):
         if units is None:

I have opened an issue on pint's GitHub site.
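
The mechanism can be illustrated without pint at all. The following toy model is only a sketch: ToyRegistry, ToyQuantity, _rebuild_unparsed and round_trip are hypothetical names invented for this illustration, and the lazy registration of prefixed units is an assumption based on the behaviour described above, not pint's actual code. It shows a quantity that unpickles fine but fails on first use with a fresh registry, and how either parsing the unit once beforehand or reducing to a string avoids the failure.

```python
import pickle

# Toy stand-ins only: hypothetical classes, not pint's implementation.
PREFIXES = {"milli": 1e-3, "nano": 1e-9}
ALIASES = {"ms": "millisecond", "ns": "nanosecond", "s": "second"}


class ToyRegistry:
    """Knows 'second' up front; prefixed units are registered lazily on parsing."""

    def __init__(self):
        self.units = {"second": 1.0}  # unit name -> factor relative to seconds

    def parse(self, text):
        name = ALIASES.get(text, text)
        if name not in self.units:
            for prefix, factor in PREFIXES.items():
                base = name[len(prefix):]
                if name.startswith(prefix) and base in self.units:
                    self.units[name] = factor * self.units[base]
                    break
            else:
                raise KeyError(name)
        return name


REGISTRY = ToyRegistry()  # plays the role of the application registry


def _rebuild_unparsed(magnitude, unit):
    """Models the original behaviour: restore the stored unit name without parsing."""
    quantity = ToyQuantity.__new__(ToyQuantity)
    quantity.magnitude, quantity.unit = magnitude, unit
    return quantity


class ToyQuantity:
    reduce_to_text = False  # True models the patched __reduce__

    def __init__(self, magnitude, unit):
        self.magnitude = magnitude
        self.unit = REGISTRY.parse(unit)  # parsing is what registers the unit

    def __reduce__(self):
        if self.reduce_to_text:
            # Unpickling calls ToyQuantity(magnitude, text), which parses again.
            return ToyQuantity, (self.magnitude, self.unit)
        return _rebuild_unparsed, (self.magnitude, self.unit)

    def in_seconds(self):
        return self.magnitude * REGISTRY.units[self.unit]


def round_trip(use_text, warm_up=False):
    """Pickle with one registry, unpickle with a fresh one, then use the value."""
    global REGISTRY
    ToyQuantity.reduce_to_text = use_text
    REGISTRY = ToyRegistry()
    data = pickle.dumps(ToyQuantity(50, "ms"))
    REGISTRY = ToyRegistry()  # simulates the registry of a new process
    if warm_up:
        ToyQuantity(1, "ms")  # the Q(1, 'ms') trick: parse once to register the unit
    try:
        return pickle.loads(data).in_seconds()
    except KeyError as exc:
        return "KeyError: %s" % exc


print(round_trip(use_text=False))                # KeyError: 'millisecond'
print(round_trip(use_text=False, warm_up=True))  # succeeds after the warm-up
print(round_trip(use_text=True))                 # succeeds with the string-based reduce
```

In the toy, the string-based variant works because rebuilding goes through the normal constructor, which parses the unit text and so registers it as a side effect.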

Peter
  • Bravo! That all makes great sense. I'm only surprised that nobody bumped into this before - pickling in one process and unpickling in another is the _usual_ way pickles are used (even in the absence of `multiprocessing`)! – Tim Peters Jul 17 '16 at 16:31

1 Answer

I never used pint before, but this looked interesting ;-) The first thing I noticed is that I have no problem if I stick to units explicitly listed by this line:

print(dir(ureg.sys.mks))

For example, "hour" and "second" are both in the output of that, and your program runs fine if the Process line is changed to:

p = Process(target=f, args=(Q(50, 'hour'), Q(50, 'second')))

You're on Windows, so multiprocessing is using the "spawn" method: the entire program is imported fresh by the worker process, so in particular the:

ureg = UnitRegistry()
set_application_registry(ureg)
Q = ureg.Quantity

lines are executed in the worker process too. So the unit registry is initialized in the worker, but it's not the same (identical) registry object used in the main program: no memory is shared between processes.

To get much deeper we really need an expert in how pint is implemented. My guess is that for units "made up" (not in the output produced by the dir() line above) by parsing strings, new stuff is added to the registry at some level, which is needed later to reconstruct the values. "ns" and "ms" are of this nature: they are not in the dir() output.

Your program works fine as-is if I add a line like this immediately after your Q=ureg.Quantity line:

Q(1, 'ms'); Q(1, 'ns')

That was a shot in the dark (an "educated guess") that worked: it just forced the worker process to parse the same "made up" units used in the main process, to try to force its unit registry into a similar state.

I hope there's a cleaner way to get it to work, but can't help more. I'd ask the pint authors about it.
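
One alternative that avoids depending on the worker's registry state is to send only plain data across the process boundary and rebuild the quantity on the other side. This is a sketch, not something taken from pint's documentation: to_wire and from_wire are hypothetical helper names, and the code only assumes that a quantity exposes .magnitude and .units and that the quantity class accepts a magnitude and a unit string, as Q(50, 'ms') does in the question.

```python
def to_wire(quantity):
    """Reduce a quantity to plain, always-picklable data: (magnitude, unit text)."""
    return quantity.magnitude, str(quantity.units)


def from_wire(make_quantity, payload):
    """Rebuild a quantity in the receiving process by re-parsing the unit text."""
    magnitude, unit_text = payload
    return make_quantity(magnitude, unit_text)


# Sketch of use with the script from the question (Q = ureg.Quantity at module level):
#
#   def f(one, two):
#       print(from_wire(Q, one) / from_wire(Q, two))
#
#   p = Process(target=f, args=(to_wire(Q(50, 'ms')), to_wire(Q(50, 'ns'))))
```

Because the worker builds each quantity from a string with its own registry, the units get parsed there, which is the same effect the Q(1, 'ms'); Q(1, 'ns') line achieves by hand.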

Tim Peters