
I have the following scheme:

  • one data-processing process: The_process
  • a few data-producing processes: workers

What I need to do is share a list, foo_list, containing two other large objects, list1 and dict2,

foo_list = [list1, dict2]

between those processes. Workers should only read from foo_list, but I need them to see a list1 and dict2 that correspond to each other, i.e. a consistent pair. The_process should be able to modify the data.

Edit: I need workers to have the updated version of foo_list. The_process updates foo_list once in a while, and I need workers to start using the updated version as soon as possible.

I have used manager.list from the multiprocessing library, but profiling showed that about 25% of the program's time is spent just on workers getting data from the list.

So the question is: is there any other way to do this, or am I doing it wrong?

Jendas
  • Without more context nobody can give you an answer. Communication between processes is always a big overhead and you should reduce it to the minimum possible. – Bakuriu Jul 02 '13 at 09:10
    Related: [Python: Possible to share in-memory data between 2 separate processes](http://stackoverflow.com/q/1268252/4279) – jfs Jul 02 '13 at 11:05

1 Answer


Shared memory is in most cases not the way to go. If you have the memory for it, you are better off making copies of this list and passing a copy of foo_list to each process, so that no time is wasted managing the list between processes.
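A minimal sketch of the copy-per-worker idea, with one addition that is an assumption rather than part of the suggestion above: each worker gets its own `multiprocessing.Queue`, and The_process puts a whole new `(list1, dict2)` snapshot on it whenever the data changes. The names `latest_snapshot`, `worker`, `channels` and `results` are hypothetical. Because list1 and dict2 travel together in one message, a worker never sees a mismatched pair, and between updates it reads only its local copy.

```python
import multiprocessing as mp
import queue


def latest_snapshot(updates, current):
    """Drain `updates` without blocking; return the newest item, or `current` if none."""
    while True:
        try:
            current = updates.get_nowait()
        except queue.Empty:
            return current


def worker(updates, results, n_items):
    # Hypothetical worker: block once for the initial snapshot, then only
    # do a cheap non-blocking check for a newer one before each unit of work.
    foo_list = updates.get()
    for _ in range(n_items):
        foo_list = latest_snapshot(updates, foo_list)
        list1, dict2 = foo_list          # local copies, no proxy round trip
        results.put(len(list1) + len(dict2))


if __name__ == "__main__":
    n_workers, n_items = 2, 3
    results = mp.Queue()
    channels = [mp.Queue() for _ in range(n_workers)]
    procs = [mp.Process(target=worker, args=(ch, results, n_items))
             for ch in channels]
    for p in procs:
        p.start()

    # The_process publishes list1 and dict2 together as one snapshot.
    foo_list = ([1, 2, 3], {"a": 1})
    for ch in channels:
        ch.put(foo_list)

    collected = [results.get() for _ in range(n_workers * n_items)]
    for p in procs:
        p.join()
    print(collected)
```

The trade-off is that every update pickles and sends the full snapshot to every worker, so this only pays off when updates are rare compared to reads.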

I had a similar issue myself when I tried using shared memory; see "Python multiprocessing performance".

Bas Jansen
  • I see... But what do you mean by making copies? I need the workers to have an "updated" version of that list. The_process changes it or removes something from it once in a while, and I need workers to start using that version as soon as possible. I will add this to the question – Jendas Jul 02 '13 at 09:17
  • I mean that if you have 4 workers, you can split the original list into 4 parts (i.e. worker 1 gets `copy1 = foo_list[0:len(foo_list) // 4]`) and have each worker use its own part of the list. I am assuming here that your worker processes don't need access to all elements of the list, because if they do.. that just ruins it for you. – Bas Jansen Jul 02 '13 at 09:23
  • Yes, unfortunately, they do :-( But thank you anyway – Jendas Jul 02 '13 at 09:27
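For readers whose workers can each work on part of the data, the split suggested in the comments above might look roughly like this. The helper name `split_evenly` is hypothetical; it uses integer arithmetic so that the slice bounds are valid indices and no element is dropped when the length is not a multiple of the worker count.

```python
def split_evenly(items, n_parts):
    """Split `items` into `n_parts` contiguous slices whose sizes differ by at most one."""
    size, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional element each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


if __name__ == "__main__":
    foo_list = list(range(10))
    for worker_id, chunk in enumerate(split_evenly(foo_list, 4)):
        print(worker_id, chunk)
```

Each slice is an independent copy, so a worker can read it without any inter-process communication. This does not help when every worker needs the whole list, as in the question.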