
I am using GLPK with Julia, together with the methods written by spencerlyon:

sendto(2, lp = lp) #lp is type GLPK.Prob

However, I can't seem to send a GLPK.Prob between workers. Whenever I try, it appears to get sent, and calling

remotecall_fetch(2, whos)

confirms that the GLPK.Prob got sent

The problem appears when I try to solve it by calling

simplex(lp)

the error

GLPK.GLPKError("invalid GLPK.Prob")

appears. I know that the GLPK.Prob isn't invalid to begin with, and if I construct the GLPK.Prob explicitly on another worker, e.g. worker 2, calling simplex runs just fine.

This is a problem, as the GLPK.Prob is generated from a custom type of mine that is a bit on the heavy side.

tl;dr: Are there some types that cannot be sent between workers properly?

Update

I see now that calling

remotecall_fetch(2, simplex, lp)

will return the above GLPK error

Furthermore I've just noticed that the GLPK module has got a method called

GLPK.copy_prob(GLPK.Prob, GLPK.Prob, Int)

but neither deepcopy nor copy works for copying a GLPK.Prob.

Example

function create_lp()
    lp = GLPK.Prob()

    GLPK.set_prob_name(lp, "sample")
    GLPK.term_out(GLPK.OFF)

    GLPK.set_obj_dir(lp, GLPK.MAX)

    GLPK.add_rows(lp, 3)
    GLPK.set_row_bnds(lp,1,GLPK.UP,0,100)
    GLPK.set_row_bnds(lp,2,GLPK.UP,0,600)
    GLPK.set_row_bnds(lp,3,GLPK.UP,0,300)

    GLPK.add_cols(lp, 3)

    GLPK.set_col_bnds(lp,1,GLPK.LO,0,0)
    GLPK.set_obj_coef(lp,1,10)
    GLPK.set_col_bnds(lp,2,GLPK.LO,0,0)
    GLPK.set_obj_coef(lp,2,6)
    GLPK.set_col_bnds(lp,3,GLPK.LO,0,0)
    GLPK.set_obj_coef(lp,3,4)

    s = spzeros(3,3)
    s[1,1] =  1
    s[1,2] =  1
    s[1,3] =  1
    s[2,1] =  10
    s[3,1] =  2
    s[2,2] =  4
    s[3,2] =  2
    s[2,3] =  5
    s[3,3] =  6

    GLPK.load_matrix(lp, s)

    return lp
end 

This returns an lp::GLPK.Prob whose objective value is 733.33 after running

simplex(lp)
result = get_obj_val(lp)#returns 733.33

However, doing

addprocs(1)
remotecall_fetch(2, simplex, lp)

will result in the error above

isebarn

1 Answer

It looks like the problem is that your lp object contains a pointer.

julia> lp = create_lp()
GLPK.Prob(Ptr{Void} @0x00007fa73b1eb330)

Unfortunately, pointers and parallel processing don't mix well: each process has its own memory space, so an address that is valid on one process does not refer to the same memory on another. These issues can be overcome, but apparently that requires individual work for each data type that involves such pointers; see this GitHub discussion for more.
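To make this concrete, here is a small sketch of what a pointer actually is. It assumes Julia 1.x (for GC.@preserve) and uses a plain array rather than GLPK; buf, addr and second are just illustrative names.

```julia
# A small array whose memory lives in this process.
buf = [1.0, 2.0, 3.0]

# Keep `buf` alive while we work with its raw address.
second = GC.@preserve buf begin
    p = pointer(buf)           # Ptr{Float64} to the first element
    addr = UInt(p)             # the pointer is nothing more than this number
    q = Ptr{Float64}(addr)     # rebuilding it is only meaningful in this process
    unsafe_load(q, 2)          # read the second element through the pointer
end

println(second)
```

Another process that received only the number in addr would have no way to reach the array behind it, which is the situation a GLPK.Prob ends up in after being sent to a worker.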

Thus, my thought would be that if you want to access the pointer on the worker, you could just create it on that worker. E.g.

using GLPK
addprocs(2)

@everywhere begin
    using GLPK
    function create_lp()
        lp = GLPK.Prob()

        GLPK.set_prob_name(lp, "sample")
        GLPK.term_out(GLPK.OFF)

        GLPK.set_obj_dir(lp, GLPK.MAX)

        GLPK.add_rows(lp, 3)
        GLPK.set_row_bnds(lp,1,GLPK.UP,0,100)
        GLPK.set_row_bnds(lp,2,GLPK.UP,0,600)
        GLPK.set_row_bnds(lp,3,GLPK.UP,0,300)

        GLPK.add_cols(lp, 3)

        GLPK.set_col_bnds(lp,1,GLPK.LO,0,0)
        GLPK.set_obj_coef(lp,1,10)
        GLPK.set_col_bnds(lp,2,GLPK.LO,0,0)
        GLPK.set_obj_coef(lp,2,6)
        GLPK.set_col_bnds(lp,3,GLPK.LO,0,0)
        GLPK.set_obj_coef(lp,3,4)

        s = spzeros(3,3)
        s[1,1] =  1
        s[1,2] =  1
        s[1,3] =  1
        s[2,1] =  10
        s[3,1] =  2
        s[2,2] =  4
        s[3,2] =  2
        s[2,3] =  5
        s[3,3] =  6

        GLPK.load_matrix(lp, s)

        return lp
    end 
end

a = @spawnat 2 eval(:(lp = create_lp()))
b = @spawnat 2 eval(:(result = simplex(lp)))
fetch(b)
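If the problem data only exist on the master process, one option is to send plain arrays instead of the GLPK.Prob and rebuild the problem on the worker, since plain Julia data serialize without trouble. The sketch below assumes Julia 1.x, where Serialization and SparseArrays are standard libraries. LPData and roundtrip are hypothetical names, and the round trip through an IOBuffer only mimics what happens when an object is sent to another process.

```julia
using Serialization, SparseArrays

# Hypothetical container for everything needed to rebuild the LP.
# It holds only plain Julia data, no pointers into C memory.
struct LPData
    row_ub::Vector{Float64}          # upper bounds of the constraint rows
    obj::Vector{Float64}             # objective coefficients
    A::SparseMatrixCSC{Float64,Int}  # constraint matrix
end

# Mimic a transfer between processes: serialize to bytes, then read back.
function roundtrip(x)
    io = IOBuffer()
    serialize(io, x)
    seekstart(io)
    return deserialize(io)
end

data = LPData([100.0, 600.0, 300.0],
              [10.0, 6.0, 4.0],
              sparse([1.0 1.0 1.0; 10.0 4.0 5.0; 2.0 2.0 6.0]))

received = roundtrip(data)

# The copy carries the same numbers, so the receiver can rebuild the problem.
same = received.row_ub == data.row_ub &&
       received.obj == data.obj &&
       received.A == data.A
println(same)
```

A create_lp method defined on every worker could then take such an object as its argument and make the same GLPK calls as above.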

See the documentation below on @spawn for more info on using it, as it can take a bit of getting used to.



The macros @spawn and @spawnat are two of the tools that Julia makes available to assign tasks to workers. Here is an example:

julia> @spawnat 2 println("hello world")
RemoteRef{Channel{Any}}(2,1,3)

julia>  From worker 2:  hello world

Both of these macros will evaluate an expression on a worker process. The only difference between the two is that @spawnat allows you to choose which worker will evaluate the expression (in the example above worker 2 is specified) whereas with @spawn a worker will be automatically chosen, based on availability.

In the above example, we simply had worker 2 execute the println function. There was nothing of interest to return or retrieve from this. Often, however, the expression we send to the worker will yield something we wish to retrieve. Notice that in the example above, when we called @spawnat, we saw the following before the printout from worker 2:

RemoteRef{Channel{Any}}(2,1,3)

This indicates that the @spawnat macro returns a RemoteRef object, which in turn holds the return value of the expression that was sent to the worker. To retrieve that value, we first assign the RemoteRef that @spawnat returns to a variable and then call fetch() on it, which retrieves the result of the evaluation performed on the worker.

julia> result = @spawnat 2 2 + 5
RemoteRef{Channel{Any}}(2,1,26)

julia> fetch(result)
7

The key to being able to use @spawn effectively is understanding the nature behind the expressions that it operates on. Using @spawn to send commands to workers is slightly more complicated than just typing directly what you would type if you were running an "interpreter" on one of the workers or executing code natively on them. For instance, suppose we wished to use @spawnat to assign a value to a variable on a worker. We might try:

@spawnat 2 a = 5
RemoteRef{Channel{Any}}(2,1,2)

Did it work? Well, let's see by having worker 2 try to print a.

julia> @spawnat 2 println(a)
RemoteRef{Channel{Any}}(2,1,4)

julia> 

Nothing happened. Why? We can investigate this by using fetch() as above. fetch() can be very handy because it retrieves not just successful results but also error messages. Without it, we might not even know that something has gone wrong.

julia> result = @spawnat 2 println(a)
RemoteRef{Channel{Any}}(2,1,5)

julia> fetch(result)
ERROR: On worker 2:
UndefVarError: a not defined

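The error message says that a is not defined on worker 2. But why is this? @spawnat wraps the expression in an anonymous function, so a = 5 only creates a variable local to that function, and it disappears once the function returns. To define a global on the worker, we need to wrap the assignment in an expression and have the worker evaluate it with eval. Below is an example, with explanation following: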

julia> @spawnat 2 eval(:(a = 2))
RemoteRef{Channel{Any}}(2,1,7)

julia> @spawnat 2 println(a)
RemoteRef{Channel{Any}}(2,1,8)

julia>  From worker 2:  2

The :() syntax is what Julia uses to designate expressions. We then use the eval() function in Julia, which evaluates an expression, and we use the @spawnat macro to instruct that the expression be evaluated on worker 2.

We could also achieve the same result with:

julia> @spawnat(2, eval(parse("c = 5")))
RemoteRef{Channel{Any}}(2,1,9)

julia> @spawnat 2 println(c)
RemoteRef{Channel{Any}}(2,1,10)

julia>  From worker 2:  5

This example demonstrates two additional notions. First, we see that we can also create an expression using the parse() function called on a string. Secondly, we see that we can use parentheses when calling @spawnat, in situations where this might make our syntax more clear and manageable.
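The same distinction between the caller's scope and the worker's scope applies to arguments passed to remotecall_fetch: they are evaluated on the calling process and then sent over. The sketch below illustrates this under the assumption of Julia 1.x, where these functions live in the Distributed standard library and the function comes before the worker id (in the version used above the worker id came first). The names x, from_caller and from_worker are only for illustration.

```julia
using Distributed

addprocs(1)
w = first(workers())

# Give every process its own global `x`, set to that process's id.
@everywhere x = myid()

# The argument `x` is evaluated here, on the calling process, and then sent.
from_caller = remotecall_fetch(identity, w, x)

# Evaluating the symbol on the worker reads the worker's own global instead.
from_worker = remotecall_fetch(Core.eval, w, Main, :x)

println((from_caller, from_worker))
rmprocs(w)
```

The first call returns the caller's value of x and the second returns the worker's, which is why passing a locally built lp as an argument ships the local object rather than using one that already lives on the worker.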

Graham
Michael Ohlrogge
  • Thanks, that's close to what I wound up doing. I fetch every relevant array from the lp and transfer it over and have a create_lp() method defined on every worker which creates a lp. I'd still like a way to use the copy_prob method and be able to 'move' the pointer (if that is something you can do) over to another worker, as there is always that search for elegance – isebarn Aug 03 '16 at 13:59
  • @isebarn Sure thing. Also, note that when you use `remotecall_fetch()` and then supply arguments to the function, the objects that you specify will by default come from the scope of the process calling `remotecall_fetch()` and then those objects will get sent to the process being activated, rather than `remotecall` natively using arguments in the scope of the worker process. This can also then lead to these kinds of serialization errors. – Michael Ohlrogge Aug 03 '16 at 14:33