
I want to parallelize a for loop across the four cores of my computer's CPU. I am using pp (Parallel Python) rather than joblib.Parallel, for reasons considered here.

But I am getting an error:

Traceback (most recent call last):
  File "batching.py", line 60, in cleave_out_bad_data
    job1 = job_server.submit(cleave_out, (data_dir,dirlist,), (endswithdat,))
  File "/homes/ad6813/.local/lib/python2.7/site-packages/pp.py", line 459, in submit
    sfunc = self.__dumpsfunc((func, ) + depfuncs, modules)
  File "/homes/ad6813/.local/lib/python2.7/site-packages/pp.py", line 637, in __dumpsfunc
    sources = [self.__get_source(func) for func in funcs]
  File "/homes/ad6813/.local/lib/python2.7/site-packages/pp.py", line 704, in __get_source
    sourcelines = inspect.getsourcelines(func)[0]
  File "/usr/lib/python2.7/inspect.py", line 690, in getsourcelines
    lines, lnum = findsource(object)
  File "/usr/lib/python2.7/inspect.py", line 529, in findsource
    raise IOError('source code not available')
IOError: source code not available

It looks like the cause is a Python 2.7 bug.

Has anyone come across this and solved it?


Here is my code:

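import pp

def clean_dir(data_dir, dirlist):
    # Submit clean() to a pp job server; endswith is passed as a dependent function.
    job_server = pp.Server()
    job1 = job_server.submit(clean, (data_dir, dirlist,), (endswith,))

def clean(data_dir, dirlist):
    # good_or_bad and endswith are defined elsewhere (not shown).
    for file in dirlist:
        if endswith(file):
            good_or_bad(file, data_dir)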
Alexandre Holden Daly

1 Answer


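Inspired by here, I fixed a similar problem by saving the code and the function in a single file, such as "test.py", and running that file with python instead of typing the function line by line into the Python shell. The traceback in the question shows pp fetching each submitted function's source through inspect.getsourcelines, so the function has to come from a source file that inspect can read. It works for me.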
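
To see why the file matters, here is a minimal sketch that calls inspect.getsourcelines (the same call that fails in the traceback) on one function loaded from a real file and on the same function compiled from an in-memory string. It does not use pp itself, and it is written for Python 3 (importlib.util), whereas the question uses Python 2.7. The names has_source, load_from_file, load_from_string and test_module.py, and the body of endswith, are hypothetical and chosen only for illustration.

```python
import importlib.util
import inspect
import os
import tempfile

# Hypothetical stand-in for the asker's endswith helper (its real body is not shown).
SOURCE = "def endswith(name):\n    return name.endswith('.dat')\n"


def has_source(func):
    """Return True if inspect can read the source lines of func."""
    try:
        inspect.getsourcelines(func)
        return True
    except (IOError, OSError, TypeError):
        return False


def load_from_file(source):
    """Write source to a temporary .py file and load the function from it."""
    tmp_dir = tempfile.mkdtemp()
    path = os.path.join(tmp_dir, "test_module.py")
    with open(path, "w") as handle:
        handle.write(source)
    spec = importlib.util.spec_from_file_location("test_module", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.endswith


def load_from_string(source):
    """Compile source in memory, so no file on disk backs the function."""
    namespace = {}
    exec(compile(source, "<no-source-demo>", "exec"), namespace)
    return namespace["endswith"]


from_file = load_from_file(SOURCE)
from_string = load_from_string(SOURCE)

print("defined in a file:", has_source(from_file))
print("defined in memory:", has_source(from_string))
```

If the in-memory variant reports that no source is available, a function in the same situation is likely to trigger the error above when handed to job_server.submit, which is why moving everything into a file and running that file can help.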