I have written a Frequent Pattern Growth (FP-Growth) algorithm and figured that, once the full tree (which is quite large) has been built, the counting part, which is the most resource-heavy step, could be parallelized.
An FP tree is a non-binary tree structure in which the same name can appear at different nodes. Each node links to the next node of the same name, so the tree can be traversed not only from the root but also from the side, following a link path that connects all nodes with the same name. From each such node, a traversal back to the root is done while building and counting the possible combinations along that path.
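To make the structure described above concrete, here is a minimal sketch of such a tree. It is only an illustration under assumptions: the names FPNode, next_same, insert, and prefix_paths are hypothetical, and the real implementation may store its links and counts differently.

```python
class FPNode(object):
    """Hypothetical FP-tree node: a name, a count, a parent, and a side link."""

    def __init__(self, name, parent=None):
        self.name = name
        self.count = 0
        self.parent = parent    # link toward the root
        self.children = {}      # name -> child FPNode
        self.next_same = None   # next node in the tree carrying the same name


def insert(root, header, transaction):
    """Insert one ordered transaction and maintain the same-name link chains."""
    node = root
    for name in transaction:
        child = node.children.get(name)
        if child is None:
            child = FPNode(name, node)
            node.children[name] = child
            # Prepend the new node to the chain of nodes with this name.
            child.next_same = header.get(name)
            header[name] = child
        child.count += 1
        node = child


def prefix_paths(header, name):
    """Follow the side links for `name`; from each node walk up to the root."""
    paths = []
    node = header.get(name)
    while node is not None:
        path = []
        parent = node.parent
        while parent is not None and parent.name is not None:
            path.append(parent.name)
            parent = parent.parent
        paths.append((path[::-1], node.count))
        node = node.next_same
    return paths


root = FPNode(None)   # the root carries no name
header = {}           # name -> first node of that name's side-link chain
for t in (["a", "b", "c"], ["a", "c"], ["b", "c"]):
    insert(root, header, t)
```

Calling prefix_paths(header, "c") walks the side links for "c" and returns one (path, count) pair per node of that name. Note that the traversal only reads the tree, which is what makes the counting step look like a candidate for parallel work.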
By splitting up the names to follow, I figured I could use multiprocessing for the counting part, since the algorithm just follows a certain path in the tree structure and does the counting and combining without altering the tree.
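The splitting step could look roughly like the sketch below. The helper split_names is hypothetical, and round-robin dealing is just one assumed way to divide the names; each resulting chunk would then be handed to one worker process.

```python
def split_names(names, nproc):
    """Deal the names round-robin into nproc chunks (hypothetical helper)."""
    return [names[i::nproc] for i in range(nproc)]


# Example: five names divided between two workers.
chunks = split_names(["a", "b", "c", "d", "e"], 2)
```

Round-robin dealing keeps the chunk sizes within one name of each other, although it does not balance the actual counting work if some names have much longer link chains than others.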
But now I'm hitting a roadblock: an unexpected error from multiprocessing that I don't understand. Take the following sample code. It has a class called link, which basically just holds a reference to another instance of itself. I build two chains of such instances with a given depth; that is, each link.link attribute contains the next link instance, while the first one's is None.
import multiprocessing as mpr

class link:
    def __init__(self,link=None):
        self.link = link

def main(maxRange):
    nproc = 2
    L = []
    for i in range(nproc):
        L.append(link())
        for j in range(maxRange):
            L[i] = link(L[i])
    LL = [L[:1],L[1:]]
    pool = mpr.Pool(processes=nproc)
    pool.map(test,LL)
    pool.close()
    pool.join()
    return

def test(l):
    pass

if __name__ == "__main__":
    maxRange = 328
    main(maxRange)
Once the value of maxRange reaches 328, I get this error:
  File "E:/PythonDir/Diverses/temp.py", line 896, in main
    pool.map(test,dfl)
  File "C:\Users\...\AppData\Local\Continuum\Anaconda2\lib\multiprocessing\pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Users\...\AppData\Local\Continuum\Anaconda2\lib\multiprocessing\pool.py", line 567, in get
    raise self._value
RuntimeError: maximum recursion depth exceeded while getting the str of an object
Why does multiprocessing have a problem with objects that contain references to other objects?
Is there a way to get around this?
Why 328?