For more setup, see this question. I want to create lots of instances of the class Toy in parallel, and then write them to an XML tree.
import itertools
import pandas as pd
import lxml.etree as et
import numpy as np
import sys
import multiprocessing as mp

def make_toys(df):
    l = []
    for index, row in df.iterrows():
        toys = [Toy(row) for _ in range(row['number'])]
        l += [x for x in toys if x is not None]
    return l

class Toy(object):
    def __new__(cls, *args, **kwargs):
        if np.random.uniform() <= 1:
            return super(Toy, cls).__new__(cls, *args, **kwargs)

    def __init__(self, row):
        self.id = None
        self.type = row['type']

    def set_id(self, x):
        self.id = x

    def write(self, tree):
        et.SubElement(tree, "toy", attrib={'id': str(self.id), 'type': self.type})

if __name__ == "__main__":
    table = pd.DataFrame({
        'type': ['a', 'b', 'c', 'd'],
        'number': [5, 4, 3, 10]})
    n_cores = 2
    split_df = np.array_split(table, n_cores)
    p = mp.Pool(n_cores)
    pool_results = p.map(make_toys, split_df)
    p.close()
    p.join()
    l = [a for L in pool_results for a in L]
    box = et.Element("box")
    box_file = et.ElementTree(box)
    for i, toy in itertools.izip(range(len(l)), l):
        Toy.set_id(toy, i)
    [Toy.write(x, box) for x in l]
    box_file.write(sys.stdout, pretty_print=True)
This code runs beautifully as written. However, I overrode the __new__ method so that it only has a random chance of instantiating the class. If I change the condition to np.random.uniform() < 0.5, I expect to get roughly half as many instances as I asked for, randomly determined. Instead, doing this produces the following error:
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 763, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 380, in _handle_results
    task = get()
AttributeError: 'NoneType' object has no attribute '__dict__'
I don't know what this even means, or how to avoid it. If I run the same work in a single process, as in l = make_toys(table), it runs well for any probability.
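A possible way to isolate the error without multiprocessing, as a minimal sketch. It assumes the failure happens when the parent process unpickles the Toy objects returned by the workers, since unpickling with protocol 2 calls __new__ again and then tries to restore the instance's __dict__. Flaky and its allow switch are hypothetical stand-ins for Toy and the np.random.uniform() check, used so the outcome is deterministic.

```python
import pickle


class Flaky(object):
    """Hypothetical stand-in for Toy: __new__ may decline to build an instance."""

    # Deterministic switch used here instead of np.random.uniform().
    allow = True

    def __new__(cls, *args, **kwargs):
        if cls.allow:
            # Extra arguments are not forwarded to object.__new__.
            return super(Flaky, cls).__new__(cls)
        return None

    def __init__(self, kind):
        self.kind = kind


# Creating and pickling the object succeeds while __new__ cooperates.
toy = Flaky('a')
payload = pickle.dumps(toy, protocol=2)

# Simulate an unlucky random draw at unpickling time: __new__ now returns
# None, and pickle then fails while restoring the saved attributes.
Flaky.allow = False
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

If this assumption holds, any __new__ that can return None would make instances of that class unsafe to send between processes, which would also be consistent with the single-process call to make_toys(table) never failing.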
Another solution
By the way, I know that this can be solved by leaving the __new__ method alone and instead rewriting make_toys() as
def make_toys(df):
    l = []
    for index, row in df.iterrows():
        prob = np.random.binomial(row['number'], 0.1)
        toys = [Toy(row) for _ in range(prob)]
        l += [x for x in toys if x is not None]
    return l
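A related variant, sketched under the same idea of keeping the randomness out of __new__: the draw moves into a factory classmethod, so the class itself always constructs normally. The names maybe_create, chance, and rng are hypothetical, and the standard random module stands in for np.random to keep the sketch self-contained.

```python
import pickle
import random


class Toy(object):
    def __init__(self, kind):
        self.id = None
        self.type = kind

    @classmethod
    def maybe_create(cls, kind, chance, rng=random):
        """Hypothetical helper: return a Toy with probability `chance`, else None."""
        if rng.random() < chance:
            return cls(kind)
        return None


# Seeded generator so repeated runs draw the same sequence.
rng = random.Random(0)
toys = [Toy.maybe_create('a', 0.5, rng) for _ in range(10)]
toys = [t for t in toys if t is not None]

# The surviving instances round-trip through pickle, because __new__ is untouched.
restored = pickle.loads(pickle.dumps(toys, protocol=2))
print(len(toys), len(restored))
```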
But I'm trying to learn about the error.