I've successfully used multiprocessing in Python to apply a function over a pandas Series. For example:
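from multiprocessing import Pool

p = Pool(4)  # pool of 4 worker processes
col2 = p.map(detect, df['text'])  # apply detect to every value in the column
p.close()
p.join()
print(col2)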
Here detect is the function and df is a DataFrame with a column called 'text'.
I've run it as a script under both the python and ipython interpreters, by calling python script.py or ipython script.py. However, the same code seems to just freeze up in IPython Notebook.
I've seen this thread, which I think might explain why it freezes up in IPython Notebook, so I have three follow-up questions:
- Can anyone confirm that this is indeed the case (since __main__ is not an importable module when running IPython interactively)?
- Is there a way to use multiprocessing or threading in interactive Python/IPython or in IPython Notebook?
- Does anyone have a faster alternative to pd.Series.apply() that works within an IPython Notebook? The function itself is not overly complicated and is already pretty well optimized; it is just called a very large number of times.
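
For the second question, one option that might be worth trying is the thread-backed pool in multiprocessing.dummy, which exposes the same Pool.map interface but runs workers as threads in the current process, so nothing has to be pickled or re-imported from __main__. The sketch below is only an illustration under assumptions: detect here is a hypothetical placeholder rather than my real function, threaded_map is a helper name made up for this example, and a plain list stands in for df['text']. Because threads share the GIL, a CPU-bound detect may see little or no speedup this way.

```python
from multiprocessing.dummy import Pool  # thread-backed Pool with the same map() API


def detect(text):
    # Hypothetical stand-in for the real detect(); returns a crude label.
    return "question" if text.strip().endswith("?") else "statement"


def threaded_map(func, values, workers=4):
    # Hypothetical helper: Pool.map preserves input order,
    # so the results line up with the inputs.
    pool = Pool(workers)
    try:
        return pool.map(func, values)
    finally:
        pool.close()
        pool.join()


# A plain list stands in for df['text'] (assumption, to keep the sketch
# free of a pandas dependency).
texts = ["Does this freeze?", "It runs as a script.", "Why?"]
col2 = threaded_map(detect, texts)
print(col2)
```

If this pattern works in the notebook, the result could presumably be assigned back with something like df['col2'] = col2, since the output order matches the input order.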