4

I have recently been transferring my code to Julia. I'm wondering how to execute Julia code from the command line.

I know that Julia code gets compiled the first time it is run.

But the thing is that I need to do a parameter sweep for my simulation models on the cluster, where I can only use the command line, not the REPL.

What is the best practice to run simulation replications on the cluster?

Frames Catherine White
Yifei
  • what do you mean by "only use the command line"? Do you not have the `julia` executable on the cluster machines? – Frames Catherine White Feb 04 '17 at 00:04
  • No, I can run julia executable on the cluster. I can also use the head node to run julia code in the REPL, but I need to write PBS job script to run massive simulations on the cluster. That is what I meant by 'only use the command line'. Hope this clarifies. – Yifei Feb 06 '17 at 14:16

5 Answers

6

Just call your script using the command line:

julia myscript.jl

But the thing is that I need to do a parameter sweep for my simulation models on the cluster, where I can only use the command line.

I think it's easiest to use Julia's built-in parallelism. pmap usually does the trick. If you're solving differential equations, DifferentialEquations.jl has a function which will parallelize your problem across a cluster, and its internal implementation uses pmap. That can serve as a good reference for how to handle other problems as well.
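For illustration, here is a minimal hedged sketch of such a pmap-based sweep; run_sim and the parameter grid are placeholders, not your actual model:

using Distributed      # needed on Julia 0.7+; on earlier versions pmap and addprocs are in Base
addprocs(4)            # or rely on the workers started via the machine file

@everywhere function run_sim(p)
    sum(randn(1000)) * p   # placeholder computation standing in for one replication
end

params = 0.1:0.1:1.0       # hypothetical sweep values
results = pmap(run_sim, params)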

Then all you have to do is launch Julia so that it has access to all of the cores. You can do this by passing in the machine file; note that Julia's options must come before the script name:

julia --machinefile the_machine_file myscript.jl

The machine file is generated whenever you create a batch job (for some clusters, sometimes you need to enable MPI for the machine file to show up). For more information, see this blog post.
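For example, here is a hedged sketch of a PBS job script along those lines; the resource requests are illustrative, and on PBS the scheduler writes the machine file to $PBS_NODEFILE:

#!/bin/bash
#PBS -l nodes=4:ppn=8
#PBS -l walltime=02:00:00

cd $PBS_O_WORKDIR
julia --machinefile $PBS_NODEFILE myscript.jl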

Chris Rackauckas
  • Thanks. We are using the PBS job scheduler. I guess the machine file will be stored in $PBS_NODEFILE, so I just run `julia --machinefile $PBS_NODEFILE myscript.jl`? – Yifei Feb 06 '17 at 15:56
3

I forgot to mention that I've since managed to run Julia from the command line on the cluster.

In the PBS job script, you can add the line julia run_mytest.jl $parameter. In run_mytest.jl, you can put:

include("mytest.jl")
arg = parse(Float64, ARGS[1])
mytest(arg)
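For completeness, a hedged sketch of such a PBS job script, assuming the sweep value is passed in via qsub (e.g. qsub -v parameter=0.5 job.pbs):

#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00

cd $PBS_O_WORKDIR
julia run_mytest.jl $parameter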
Yifei
1

Julia uses JIT compilation regardless of whether you execute your code from the command line, in the REPL, or on a compute cluster.

Is it really problematic to run your code once for compilation and once more for performance? You can always compile your code using a tiny model or dataset, and then run the compiled code on your complete dataset.

If you run on one node, then you can write a function (e.g. my_sim()) containing all of your execution code, and then run your replications in serial as one scheduled job. The first call to my_sim() compiles all of your code, and the subsequent calls run faster.
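A hedged sketch of that warm-up pattern, with a placeholder body standing in for the real simulation:

function my_sim(n)
    sum(randn(n))   # placeholder for the actual simulation code
end

my_sim(10)                                 # tiny first call: pays the JIT compilation cost
results = [my_sim(10^6) for _ in 1:100]    # subsequent replications run at compiled speed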

If you run on multiple nodes, then carefully consider how to distribute jobs; perhaps you can test your parameter settings in groups, and assign each group to its own node, and then do my_sim() on each node.

Kevin L. Keys
  • Thanks, I got your points. Another thing: suppose I am running parallel Julia code, which will use multiple nodes for each replication. If I run `my_sim()` once, will all the nodes run the compiled `my_sim()` in the subsequent replications? – Yifei Feb 06 '17 at 16:08
  • Also, if `my_sim()` includes several sub-functions, will all sub-functions be compiled if I just run `my_sim()`, or should I run each sub-function to get it compiled? I guess not. – Yifei Feb 06 '17 at 16:23
  • for your first question: I am not sure, but I think that the current Julia parallel framework means that each worker needs to compile the code on the first run. if each node is running one process -- or several processes with one master and several workers -- then each process will need to compile the code once. for your second question: assuming that `my_sim()` includes all of your execution code, then running it once compiles all subroutines that `my_sim()` calls. you do not need to run the subroutines beforehand. – Kevin L. Keys Feb 06 '17 at 17:47
0

Assuming what you are trying to achieve is the following:

  • Have one .jl file containing the code and a shebang (#!/usr/bin/env julia, or similar) at the top.
  • Have another program (bash, etc.) call this code (e.g. in bash by running ./mycode.jl).
  • Avoid going through the compilation step each time the code is called, because it creates significant overhead.

Answer:

As others have pointed out, I would think the most julia-nique way of doing this would actually be to do the looping over parameters, distribution of workloads, etc. all within Julia. But if you want to do it as described above, you can use the following little trick:

  • Extract all code that has to be compiled into a module.
  • The actual file to be called thus reduces to

#!/usr/bin/env julia

using mymodule

mymainfunction(ARGS)

  • Make the module precompilable by adding __precompile__() at the top of the module file (see the Julia manual for more on this); a minimal sketch is given below.
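A minimal sketch of such a module file, with hypothetical names (note that on Julia 1.0+ modules are precompiled by default and the __precompile__() call is unnecessary):

__precompile__()

module mymodule

export mymainfunction

function mymainfunction(args)
    x = parse(Float64, args[1])   # e.g. read one sweep parameter
    println(x^2)                  # placeholder for the real work
end

end # module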

This way, after the code has been called once per machine, precompiled objects are available, reducing the aforementioned overhead effectively to zero.

Community
  • `__precompile__()` only precompiles the AST if I understand correctly. If you want to take this approach and actually get "overhead effectively to zero", you want to add the module to the userimg.jl. Even then, you will still pay JIT/Julia startup time for each process, so this is still not good if you have many small runs. – Chris Rackauckas Feb 04 '17 at 18:22
  • @Chris Rackauckas Thanks for bringing up the startup time. A very valid point. I have just done a bit of timing on my desktop machine with some of my codes: For most of them (about 500 lines of julia each) total call time after precompilation is in the millisecond range. But indeed each startup takes a bit over a hundred milliseconds, which is rather significant and probably undermines my suggestion. – vonDonnerstein Feb 04 '17 at 19:11
  • It all depends on the problem. – Chris Rackauckas Feb 04 '17 at 19:12
  • @Chris Rackauckas Can you be more specific on how to add the module to userimg.jl? Should the command be something like `julia --sysimage myimg.jl --precompiled=yes mytest.jl`? – Yifei Feb 06 '17 at 16:42
  • @ChrisRackauckas It is not true that `__precompile__()` only precompiles the AST. I am not sure where this misconception comes from. `__precompile__()` executes all the code and stores the result. – Fengyang Wang Feb 06 '17 at 19:42
  • Then how come there's a measurable compilation time on the first call even if you precompile? I thought that was the explanation. – Chris Rackauckas Feb 06 '17 at 19:43
  • As far as I understand, precompile cannot always work out exactly which types a given function will be compiled for. The first time a function is used with a given type at run time, it will still have to be compiled. – David P. Sanders Apr 21 '17 at 12:08
0

Please find below best practices for running a parameter sweep on a Julia HPC cluster. I discuss three issues: (1) simulation architecture, (2) cluster setup, and (3) precompilation.

  1. Planning the simulation architecture: as a first step, consider the variance of computation time across sweep values.

    • If the variance of computation time is low, you are OK with the suggested pmap. Another good alternative is the @parallel loop.
    • However, if the variance of computation time is high, those options can be inefficient: @parallel divides the iterations equally across the workers up front, so the total execution time is dictated by the worker that happens to receive the longest-running jobs.

    Hence, for heterogeneous computation times you need to:

    • store the job numbers (or parameter sweep values) on the master process,
    • launch loops on the worker processes using @spawnat (simply iterate over workers()), and
    • have the workers poll the master for the next parameter sweep value, using ParallelDataTransfer.jl (of course, some external database could be used for this purpose instead); a sketch of this pattern is given below.
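    As an illustrative sketch of that master-driven pattern, the following uses a RemoteChannel as the shared job queue in place of ParallelDataTransfer.jl; run_sim is a placeholder, and using Distributed is only needed on Julia 0.7+:

    using Distributed
    addprocs(4)

    @everywhere function run_sim(p)
        sleep(rand())   # stand-in for work of varying duration
        return p^2
    end

    # the master holds the queue of sweep values; idle workers pull the next one
    jobs    = RemoteChannel(() -> Channel{Float64}(32))
    results = RemoteChannel(() -> Channel{Tuple{Float64,Float64}}(32))

    @everywhere function work(jobs, results)
        while true
            p = try take!(jobs) catch; break end   # queue closed and drained
            put!(results, (p, run_sim(p)))
        end
    end

    params = 0.1:0.1:1.0
    for w in workers()
        @spawnat w work(jobs, results)   # one polling loop per worker
    end
    for p in params
        put!(jobs, p)
    end
    close(jobs)
    sweep = [take!(results) for _ in params]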
  2. In HPC environments the best choice for cluster setup is ClusterManagers.jl; it works like a charm, and the PBS scheduler that you mentioned is supported. This library executes the appropriate PBS commands to add nodes to your Julia cluster. It is simple, efficient, and easy to use (see the sketch after this list). The --machinefile option suggested by others is very convenient, but it requires passwordless SSH, which is usually not available (or not easily configured) on HPC clusters; for a cluster in a public cloud such as AWS or Azure, I would definitely recommend --machinefile.

    Please note that on some HPC clusters (e.g. Cray) you might need to build Julia separately for the access and worker nodes due to different hardware architectures. Fortunately, Julia's parallelization works without any problems in heterogeneous environments.

    Last but not least, you can always use your cluster manager to run separate Julia processes (a grid/array computing job). This, however, becomes complicated if computation times are heterogeneous (see the previous point).

  3. I would not recommend precompiling. In most numerical simulation scenarios a single process will run for anywhere between 10 minutes and a few days, so shaving 10-20 seconds of compilation time off is not worth the effort. However, the instructions are below:

    The steps include:

    1. Create a yourimage.jl file with content such as:

       Base.require(:MyModule1)
       Base.require(:MyModule2)

    2. Run:

       $ julia /path/to/julia/share/julia/build_sysimg.jl /target/image/folder native yourimage.jl

    3. Wait for a message similar to this one:

       INFO: System image successfully built at /target/image/folder/yourimage.so
       INFO: To run Julia with this image loaded, run: `julia -J /target/image/folder/yourimage.so`.
    4. Follow the instructions and run Julia with the -J option

    You need to repeat the above four steps every time something in your own or external packages changes.
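As referenced in point 2, here is a hedged sketch of the ClusterManagers.jl route; the worker count, queue name, and included file are illustrative:

using Distributed        # only needed on Julia 0.7+
using ClusterManagers

addprocs_pbs(16, queue="batch")    # submits the PBS requests for 16 worker processes

@everywhere include("mytest.jl")   # make the simulation code available on every worker
results = pmap(mytest, 0.1:0.1:1.0)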

Przemyslaw Szufel