In brief, I have a 1000 x 500000 matrix that I need to sort row by row. Parallel processing seems like the ideal fit, but Python's multiprocessing module appears to copy the entire matrix into each spawned process, which overflows RAM. How do I tackle this?
from multiprocessing import Pool
# y is the (1000 x 500000) matrix: a module-level list of rows

def sort_parallel(n):
    # sort row n in place, descending by the second field of each element
    y[n].sort(key=lambda item: -item[1])

if __name__ == '__main__':
    pool = Pool(processes=2)
    # each worker process gets its own copy of y, which is what floods RAM
    pool.map(sort_parallel, range(len(y)))
    pool.close()
    pool.join()
Following similar questions, I have tried map, map_async, and apply_async with no progress. The fundamental problem seems to be that each process receives its own copy of the lists, which floods the RAM. A read-only shared view could possibly prevent the copying, but since I am sorting in place that doesn't help me. I also tried sorted() instead of sort(), still with no solution in sight.
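For concreteness, here is a minimal sketch of the kind of shared-memory approach I am hoping for. It assumes the matrix could be held as a flat numpy float array (my real rows hold pairs sorted by their second field, so this is only illustrative), and the function name sort_row and the demo fill are placeholders:

import numpy as np
from multiprocessing import Pool, shared_memory

ROWS, COLS = 1000, 500000   # ~4 GB as float64; shrink for a quick test

def sort_row(args):
    shm_name, n = args
    # attach to the existing shared block: no per-process copy of the matrix
    shm = shared_memory.SharedMemory(name=shm_name)
    mat = np.ndarray((ROWS, COLS), dtype=np.float64, buffer=shm.buf)
    mat[n][::-1].sort()      # in-place descending sort of row n, written straight into shared memory
    shm.close()

if __name__ == '__main__':
    shm = shared_memory.SharedMemory(create=True, size=ROWS * COLS * 8)
    mat = np.ndarray((ROWS, COLS), dtype=np.float64, buffer=shm.buf)
    for i in range(ROWS):            # fill row by row to avoid a huge temporary
        mat[i] = np.random.rand(COLS)

    with Pool(processes=2) as pool:
        pool.map(sort_row, [(shm.name, n) for n in range(ROWS)])

    # rows of mat are now sorted, and the matrix was never copied per process
    shm.close()
    shm.unlink()

If the (item, score) structure of my rows has to stay, I imagine a structured dtype or a separate key array in the same shared block could serve the same purpose, but I haven't gotten that working.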