
I would like several processes running in parallel to read from and write to the same numpy array. To avoid problems where two processes try to read or write the same memory at once, I need to protect the file I am writing to. How do I do that?

I assume that np.savetxt does not protect the file. I have tried the portalocker library, but once the file is opened and locked, np.savetxt is no longer able to write to it.
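
Roughly what I tried, as a minimal sketch (results.txt is a placeholder file name; the lock uses portalocker's exclusive LOCK_EX flag):

```python
import numpy as np
import portalocker

arr = np.zeros((10, 10))  # the shared array

# 'results.txt' is a placeholder file name.
with open('results.txt', 'w') as f:
    portalocker.lock(f, portalocker.LOCK_EX)  # take an exclusive lock
    np.savetxt('results.txt', arr)            # this write fails while the lock is held
    portalocker.unlock(f)
```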

MunHo
  • Reading from the same file is not a problem. Why do they need to write to the same file? – ErikR Dec 12 '14 at 07:10
  • @user5402 to coordinate their work. – MunHo Dec 12 '14 at 08:30
  • You can organize the parallelism so each process writes its results to a different file. For example, see [Fork–join parallelism](http://en.wikipedia.org/wiki/Fork%E2%80%93join_model). – ErikR Dec 12 '14 at 09:06
  • Maybe something like that would be easier. I want each process to see which values in the array have not been computed yet, so it can start a computation that has not already been run. – MunHo Dec 12 '14 at 10:28

1 Answer


See this question "Downloading over 1000 files in python" (link) for examples of using a worker thread pool.

Basically you split up all of the work beforehand, put it into a queue, and let a pool of worker threads process each piece. The workers put their results onto another queue, which a single thread then processes to put all of the pieces together.
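
A minimal sketch of that pattern, using threading and queue from the standard library (the squaring work and the names compute, work_q, and result_q are illustrative, not from the linked question):

```python
import threading
import queue

def compute(x):
    # Stand-in for the real per-element computation.
    return x * x

def worker(work_q, result_q):
    while True:
        item = work_q.get()
        if item is None:        # sentinel: no more work
            break
        index, value = item
        result_q.put((index, compute(value)))

work_q = queue.Queue()
result_q = queue.Queue()

# Split up all of the work beforehand and put it on the queue.
data = list(range(100))
for i, v in enumerate(data):
    work_q.put((i, v))

n_workers = 4
threads = [threading.Thread(target=worker, args=(work_q, result_q))
           for _ in range(n_workers)]
for t in threads:
    t.start()

# One sentinel per worker so every thread shuts down cleanly.
for _ in range(n_workers):
    work_q.put(None)
for t in threads:
    t.join()

# A single collector puts all of the pieces together, so no two
# writers ever touch the same output slot at the same time.
results = [None] * len(data)
while not result_q.empty():
    index, value = result_q.get()
    results[index] = value
```

Because only the collecting step writes into results, no file or memory locking is needed for the output at all.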

ErikR