I'm using Python 2.7.10. I read lots of files, store their contents in one big list, and then try to use multiprocessing, passing the big list to the worker processes so that each one can read it and do some calculation.
I'm using Pool like this:
    import itertools
    import multiprocessing

    def read_match_wrapper(args):
        # flatten (shared_args, worker_id) into a single argument tuple
        args2 = args[0] + (args[1],)
        read_match(*args2)

    pool = multiprocessing.Pool(processes=10)
    result = pool.map(read_match_wrapper,
                      itertools.izip(itertools.repeat((ped_list, chr_map, combined_id_to_id, chr)),
                                     range(10)))
    pool.close()
    pool.join()
Basically, I'm passing multiple variables to the read_match function. Since pool.map only hands each worker a single argument, I wrote the read_match_wrapper function to unpack them. I don't need any results back from the processes; I just want them to run and finish.
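For reference, here is a stripped-down, runnable version of the same pattern with a stub read_match (the stub body and the tiny stand-in data are just for illustration, not my real code); it behaves as expected when the list is small:

    import itertools
    import multiprocessing

    def read_match(ped_list, chr_map, combined_id_to_id, chr, worker_id):
        # stub standing in for the real calculation
        print worker_id, len(ped_list)

    def read_match_wrapper(args):
        args2 = args[0] + (args[1],)
        read_match(*args2)

    if __name__ == '__main__':
        ped_list = range(100)  # tiny stand-in for the real data
        chr_map, combined_id_to_id, chr = {}, {}, 1
        pool = multiprocessing.Pool(processes=10)
        pool.map(read_match_wrapper,
                 itertools.izip(itertools.repeat((ped_list, chr_map, combined_id_to_id, chr)),
                                range(10)))
        pool.close()
        pool.join()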
I can get the whole thing to work when my data list ped_list is quite small. When I load all the data, around 10 GB, every worker process the pool spawns shows state 'S' (sleeping) and seems to do no work at all.
I don't know if there is a limit on how much data you can pass through a Pool. I really need help on this. Thanks!
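For comparison, a minimal sketch of an alternative I'm wondering about, which avoids pickling ped_list for every task by making it a module-level global that the forked workers inherit (this relies on fork, so Unix only; worker and the stand-in data are hypothetical names):

    import multiprocessing

    ped_list = []  # module-level; filled in before the Pool is created

    def worker(worker_id):
        # each forked worker inherits ped_list (copy-on-write),
        # so nothing large is pickled per task
        return worker_id, len(ped_list)

    if __name__ == '__main__':
        ped_list.extend(range(1000))  # stand-in for the real 10 GB of data
        pool = multiprocessing.Pool(processes=10)
        print pool.map(worker, range(10))
        pool.close()
        pool.join()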