
I am trying to run some code in parallel in Julia. I'm using the @everywhere macro to make all processes fetch data from a RemoteRef.

Is it possible to use a variable that's only defined on the first process inside the @everywhere expression, and somehow specify that I want the value of that variable, not the variable name, sent to all processes?

Example:

r = RemoteRef()
put(r, data)
@everywhere data = fetch(r)

This returns an error because r is not defined on all processes.

How should I move data to all processes?

Also, can I tell Julia to put the value, rather than the variable name, into the expression? Something akin to how `name = "John"; println("Hello, $name")` prints "Hello, John".
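(One way to express the kind of value-splicing asked about here is `@eval`'s `$` interpolation, which substitutes the local value into the expression before `@everywhere` ships it to the workers. A minimal sketch, assuming a small array on process 1 and Julia 0.3/0.4-era syntax:)

data = rand(5)
@eval @everywhere data = $data    # $data is spliced in as a value, so the workers never see the name
@everywhere println(size(data))   # every process now has its own copy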

  • Still have an issue with this? I'm sure you can do what you want, but there are multiple questions and I'm not sure which/how to address them. – rickhg12hs May 14 '14 at 23:02
  • I think I've found a solution by making the data an argument of a function that I can @spawnat various processes, but it's not a great solution. Any help would be much appreciated. What do I need to tell you? – stords May 15 '14 at 03:46
  • That's great you see how the data can be made an argument for a `@spawnat`. If it's okay to replicate your data everywhere, there are other options ... and if you're on Linux, you can use `SharedArray` too. Need to keep data movement to a minimum to be efficient. Maybe if you wrote a bit about your "big picture", the parallelization options might be a bit clearer. – rickhg12hs May 15 '14 at 08:23
  • I'm trying to make a parallel restricted Boltzmann machine. Each process owns a portion of the training examples. I want to send every process the matrix of weights, allow them to determine an update to the weights based on their training examples, and send that update back to the main process. The main process then calculates the new weights matrix and sends it back to all the other processes so they can train again. – stords May 15 '14 at 20:06
  • I should add that the training step takes a lot of time (an hour or so), so I'm not too concerned about how much time it takes to move the weights matrix around. – stords May 15 '14 at 20:07
  • So, every worker receives the same weight matrix, does some training with its unique training set, and then returns an updated weights matrix (same size/type). Is the training set partitioned arbitrarily amongst the workers, or is there some criteria? – rickhg12hs May 16 '14 at 04:08
  • That's right. The training set is arbitrarily partitioned. The only criterion is that each process has roughly the same number of training examples. But the partitioning happens at the very beginning, while reading all of the data from the hard disk. Each process only ever sees its own training set. The weight matrix is the only thing that moves around between processes. – stords May 17 '14 at 04:31
  • Given what I think you want to do, and my current Julia understanding, I think I'd probably use a `DArray` for the training data, and then some explicit `remotecall_fetch`'s to task each proc with a new weight matrix. If you wanted to use some meta-programming, you could even `remotecall` to the workers so that they `remotecall_fetch` from the main proc all the data they need (might be confusing to read though). – rickhg12hs May 17 '14 at 07:20
  • What's the proper way of sending data to another process in Julia? – bdeonovic Dec 28 '14 at 05:24
  • I just answered a related question [here](http://stackoverflow.com/questions/27677399/julia-how-to-copy-data-to-another-processor-in-julia/27724240#27724240). I think the functions I posted should help you do what you need to. – spencerlyon2 Dec 31 '14 at 17:25
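The loop rickhg12hs sketches in the last few comments (ship the current weights to every worker as a function argument, fetch back each worker's update, combine them on the main process) might look roughly like this. `train_step` and `local_data` are made-up names and the arithmetic is a placeholder, not an actual RBM update; the call order shown is the Julia 0.3/0.4 one (0.5+ uses remotecall_fetch(f, p, args...)):

addprocs(3)                               # skip if workers are already running

# in the real code each worker reads its own slice of examples from disk;
# here every worker just fakes some local data
@everywhere local_data = randn(1000, 50)

# the update a worker computes from the shared weights and its local examples
@everywhere train_step(W) = local_data' * local_data * W * 1e-6

W = randn(50, 20)                         # the weight matrix lives on process 1
for epoch in 1:5
    # W travels to each worker as a function argument, so no global is
    # needed there; each call returns that worker's proposed update
    updates = [remotecall_fetch(p, train_step, W) for p in workers()]
    W += sum(updates) / length(updates)   # combine the updates on process 1
end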

1 Answer


To find the functions (and macros) Spencer pointed to collected in a nice little package, check out ParallelDataTransfer.jl. The tests are good usage examples (and the CI shows that they pass on v0.5 on all platforms).

For your problem, you can use the sendto function:

z = randn(10, 10); sendto(workers(), z=z)   # z is now defined (as a global in Main) on every worker
@everywhere println(z)
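If you'd rather not add a package dependency, a hand-rolled helper along the lines of the functions in Spencer's linked answer is only a few lines. This is a sketch in the 0.4/0.5-era syntax this answer targets (newer versions need `using Distributed` and `Core.eval`), and the package's actual implementation may differ:

# assign each keyword argument as a global variable in Main on the given process(es)
function sendto(p::Int; args...)
    for (nm, val) in args
        @spawnat(p, eval(Main, Expr(:(=), nm, val)))   # define nm = val on process p
    end
end

function sendto(ps::Vector{Int}; args...)
    for p in ps
        sendto(p; args...)
    end
end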
Chris Rackauckas