16

I'm working on a mathematical model that uses data generated from XFOIL, a popular aerospace tool used to find the lift and drag coefficients on airfoils.

I have a Perl script that calls XFOIL repeatedly with different input parameters to generate the data I need. XFOIL needs to run 5,600 times at around 100 seconds per run, so it will take about 6.5 days to complete.

I have a quad-core machine, but my experience as a programmer is limited, and I really only know how to use basic Perl.

I would like to run four instances of XFOIL at a time, all on their own core. Something like this:

while ( 1 ) {

    for my $i ( 1 .. 4 ) {

        if ( ! xfoil_instance_running($i) ) {

            start_new_XFOIL_instance($i, $input_parameter_list);
        }
    }
}

So the program checks (or, preferably, sleeps) until an XFOIL instance is free, at which point we start a new instance with the next input parameter list.
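In stock Perl, without any modules, that idea can be approximated with `fork` and a blocking `waitpid`: the parent keeps up to four children alive and sleeps until one finishes. This is only a sketch; `run_case` is a hypothetical stand-in for whatever `system()` call launches XFOIL.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @param_sets   = map { "case_$_" } 1 .. 10;   # stand-in parameter list
my $max_children = 4;
my %running;                                    # pid => parameter set

# hypothetical stand-in for a system() call that runs XFOIL once
sub run_case {
    my $params = shift;
    print "running $params in pid $$\n";
}

while ( @param_sets or %running ) {
    # top up to four children
    while ( @param_sets and keys %running < $max_children ) {
        my $params = shift @param_sets;
        my $pid    = fork();
        die "fork failed: $!" unless defined $pid;
        if ( $pid == 0 ) {                      # child: do one run, then exit
            run_case($params);
            exit 0;
        }
        $running{$pid} = $params;               # parent: remember the slot
    }
    # block until any child finishes, freeing its slot
    my $done = waitpid( -1, 0 );
    delete $running{$done} if $done > 0;
}
print "all runs finished\n";
```

The blocking `waitpid(-1, 0)` is what gives you the "sleeping until an instance is free" behavior; the OS wakes the parent only when a child exits.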

zdim
Dang Khoa
    I'm afraid I'm not going to provide a full answer, but the short version is that you can definitely fork off four instances of the current perl script, then have each constantly shell out to run an XFOIL script. However, setting the processor affinity for the resulting processes -- that would require knowing what operating system you're using. – Conspicuous Compiler Dec 25 '09 at 18:53
  • Are you sure XFOIL doesn't thread or otherwise use multiple processors to get its run time to about 100 seconds in the first place? – dlamblin Dec 25 '09 at 19:18
  • Would it be hard to implement XFOIL into C/Fortran? If no, then I would suggest you go for it. Perl is not exactly the Speedy Gonzalez of programming languages... – Zaid Dec 25 '09 at 19:43
  • Thanks for the comments so far guys. @Conspicuous Compiler: I'm running Ubuntu 9.10. @dlamblin: Checking the System Monitor shows that only 1 core is being used for XFOIL. @Zaid: XFOIL is written in FORTRAN. The Perl script just makes a system() call to it. @Idigas: See above comments. Also note that it is very fast for a typical range of AOA (+/-10), but my project has a typical AOA swing of +/-40. – Dang Khoa Dec 25 '09 at 20:14
  • If you spawn a couple of child processes, the OS itself will schedule them among CPUs for you for free. – el.pescado - нет войне Dec 27 '09 at 22:58

5 Answers

17

Try Parallel::ForkManager. It's a module that provides a simple interface for forking off processes like this.

Here's some example code:

#!/usr/bin/perl

use strict;
use warnings;
use Parallel::ForkManager;

my @input_parameter_list = 
    map { join '_', ('param', $_) }
    ( 1 .. 15 );

my $n_processes = 4;
my $pm = Parallel::ForkManager->new( $n_processes );
for my $i ( 1 .. $n_processes ) {
    $pm->start and next;

    my $count = 0;
    foreach my $param_set (@input_parameter_list) {
        $count++;
        # stripe the work: child $i takes every $n_processes-th item,
        # so no two children run the same parameter set
        if ( ( $count % $n_processes ) == ( $i - 1 ) ) {
            if ( !output_exists($param_set) ) {
                start_new_XFOIL_instance($param_set);
            }
        }
    }

    $pm->finish;
}
$pm->wait_all_children;

sub output_exists {
    my $param_set = shift;
    return ( -f "$param_set.out" );
}

sub start_new_XFOIL_instance {
    my $param_set = shift;
    print "starting XFOIL instance with parameters $param_set!\n";
    sleep( 5 );
    touch( "$param_set.out" );
    print "finished run with parameters $param_set!\n";
}

sub touch {
    my $fn = shift;
    open my $fh, '>', $fn or die "can't create $fn: $!";
    close $fh            or die "can't close $fn: $!";
}

You'll need to supply your own implementations for the start_new_XFOIL_instance and the output_exists functions, and you'll also want to define your own sets of parameters to pass to XFOIL.
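As one possible shape for the real `start_new_XFOIL_instance`: XFOIL can be driven by redirecting a prepared command file into it on standard input. The file names and command-file layout below are hypothetical; adapt them to however your current script invokes XFOIL.

```perl
use strict;
use warnings;

# Hypothetical: drive one XFOIL run by piping a prepared command file
# ($param_set.in, containing the LOAD/OPER/ALFA/.../QUIT commands for
# that run) into the xfoil binary and capturing its console output.
sub start_new_XFOIL_instance {
    my $param_set = shift;
    my $status = system("xfoil < $param_set.in > $param_set.log");
    die "xfoil failed for $param_set" if $status != 0;
}
```

Checking `system()`'s return value matters here: a silently failed run would otherwise be mistaken for a completed one on the next pass of `output_exists`.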

James Thompson
    This looks to be what I need. I will read up on Parallel::ForkManager and let you know how it goes. Thanks for the help! Of course, any other input from anyone else is appreciated. – Dang Khoa Dec 25 '09 at 20:10
  • If you didn't already know, you can install the Parallel::ForkManager module in your home directory. Look here for how to do so: http://stackoverflow.com/questions/540640/how-can-i-install-a-cpan-module-into-a-local-directory – James Thompson Dec 26 '09 at 06:30
    James, thanks very much for your help. I installed Parallel::ForkManager via command line a little bit ago - I think I'm up and running now. I'm still trying to figure out the intricacies of the module as well as how I want it to behave in error conditions, but a preliminary run on my dual-core laptop leads me to think I've figured this out - at least the basic idea, anyway. Thanks a bunch again! – Dang Khoa Dec 26 '09 at 07:55
    Does Parallel::ForkManager use multicore on Windows? I tried it on my Windows machine and it does use lots of threads. Forgive my ignorance, but do threads of the same process run across multiple cores on Windows? – Matthew Lock Jul 25 '12 at 06:51
5

Perl threads will take advantage of multiple cores and processors. The main pro of threads is that it's fairly easy to share data between the threads and coordinate their activities. Forked processes cannot easily return data to the parent nor coordinate amongst themselves.

The main cons of Perl threads are that they are relatively expensive to create compared to a fork (each thread copies the entire program and all its data), threading support must be compiled into your Perl, and they can be buggy: the older the Perl, the buggier the threads. If your work is expensive, the creation time should not matter.

Here's an example of how you might do it with threads. There are many ways to do it; this one uses Thread::Queue to create a big list of work your worker threads can share. When the queue is empty, the threads exit. The main advantages are that it's easier to control how many threads are active, and you don't have to create a new, expensive thread for each bit of work.

This example shoves all the work into the queue at once, but there's no reason you can't add to the queue as you go. If you were to do that, you'd use dequeue instead of dequeue_nb, which will wait for more input instead of returning immediately when the queue is empty.

use strict;
use warnings;

use threads;
use Thread::Queue;

# Dummy work routine
sub start_XFOIL_instance {
    my $arg = shift;
    print "$arg\n";
    sleep 1;
}

# Read in dummy data
my @xfoil_args = <DATA>;
chomp @xfoil_args;

# Create a queue to push work onto and the threads to pull work from
# Populate it with all the data up front so threads can finish when
# the queue is exhausted.  Makes things simpler.
# See https://rt.cpan.org/Ticket/Display.html?id=79733
my $queue = Thread::Queue->new(@xfoil_args);

# Create a bunch of threads to do the work
my @threads;
for(1..4) {
    push @threads, threads->create( sub {
        # Pull work from the queue; don't wait if it's empty.
        # defined() guards against a false-but-valid item like "0".
        while( defined( my $xfoil_args = $queue->dequeue_nb ) ) {
            # Do the work
            start_XFOIL_instance($xfoil_args);
        }

        # Yell when the thread is done
        print "Queue empty\n";
    });
}

# Wait for threads to finish
$_->join for @threads;

__DATA__
blah
foo
bar
baz
biff
whatever
up
down
left
right
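To illustrate the add-as-you-go variant mentioned above: a blocking dequeue paired with end() (available in newer Thread::Queue versions) lets workers wait for more input and still exit cleanly once the producer says it's finished. A minimal sketch:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use threads;
use Thread::Queue;

my $queue = Thread::Queue->new;

# Workers block in dequeue until an item arrives or end() is called,
# at which point dequeue returns undef and the loop exits.
my @threads = map {
    threads->create( sub {
        while ( defined( my $args = $queue->dequeue ) ) {
            print "thread ", threads->tid, " got $args\n";
        }
    } );
} 1 .. 4;

# Producer: feed work in as it becomes known
$queue->enqueue("run_$_") for 1 .. 10;
$queue->end;            # no more work; blocked dequeues now return undef

$_->join for @threads;
```

The end() call is what distinguishes "the queue is momentarily empty" from "there will never be more work", which dequeue_nb alone cannot express.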
Schwern
  • I see my previous comment (or your previous answer) has been deleted, anyway thanks for updating your answer. I am curious about, if you verify that threads can take advantage of multiple cores and processors, if so, how did you verify it? Thanks =) – user454322 Sep 19 '12 at 02:31
  • @user454322 After seeing your comment, I wrote a little script to do an infinite loop in a bunch of threads and used Activity Monitor on OS X to see that all four cores were being used. You're right about the threading model being a new Perl interpreter per real thread. Previously I'd got it in my head that it was all emulated in a single process. – Schwern Sep 20 '12 at 02:13
  • I have posted http://stackoverflow.com/questions/12536064/how-does-perls-threading-system-work, if you get a chance please take a look. – user454322 Sep 21 '12 at 18:23
4

It looks like you could use Gearman for this project.

www.gearman.org

Gearman is a job queue. You can split your workflow into many small parts and distribute them across workers.
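As a rough sketch of what the worker side might look like with the CPAN Gearman modules (this assumes a gearmand server on localhost; the xfoil_run function name and input-file layout are hypothetical):

```perl
#!/usr/bin/perl
# Worker: register an "xfoil_run" function, then process jobs forever.
# Start one worker process per core. Assumes the CPAN Gearman
# distribution is installed and gearmand is running locally; the
# xfoil_run name and file layout here are hypothetical.
use strict;
use warnings;
use Gearman::Worker;

my $worker = Gearman::Worker->new;
$worker->job_servers('127.0.0.1');

$worker->register_function( xfoil_run => sub {
    my $job       = shift;
    my $param_set = $job->arg;          # one parameter set per job
    system("xfoil < $param_set.in > $param_set.log");
    return $param_set;                  # report which run finished
});

$worker->work while 1;
```

A matching client would submit one job per parameter set (for example via Gearman::Client); the queue then load-balances the 5,600 runs across however many workers you start, locally or on rented machines.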

I would recommend using Amazon EC2, or even their auctionable (spot) servers, to complete this project.

Spending 10 cents or less per compute hour can significantly speed up your project.

I would use Gearman locally and make sure you have a "perfect" run for 5-10 of your subjobs before handing the work off to an Amazon compute farm.

Daniel
0

Did you consider GNU parallel? It will allow you to run several instances of your program with different inputs, filling your CPU cores as they become available. It's often a very simple and efficient way to parallelize simple tasks.
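For example, assuming a hypothetical wrapper script run_one_case.sh that performs a single XFOIL run for one input file:

```shell
# Keep 4 jobs running at once, one parameter file per job.
# run_one_case.sh and the params/ directory layout are hypothetical;
# {} is replaced by each input file in turn.
parallel --jobs 4 ./run_one_case.sh {} ::: params/*.in
```

As each job finishes, parallel immediately starts the next one, so all four cores stay busy for the whole 5,600-run sweep.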

Aif
0

This is quite old, but if someone is still looking for a suitable answer to this question, you might want to consider the Perl Many-Core Engine (MCE).
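A minimal sketch with MCE::Loop (from the MCE distribution on CPAN); run_case here is a hypothetical stand-in for one XFOIL run:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MCE::Loop;

# four workers, each pulling one parameter set at a time
MCE::Loop->init( max_workers => 4, chunk_size => 1 );

# hypothetical stand-in for a system() call that runs XFOIL once
sub run_case {
    my $param_set = shift;
    print "worker ", MCE->wid, " handling $param_set\n";
}

mce_loop {
    my ( $mce, $chunk_ref, $chunk_id ) = @_;
    run_case( $chunk_ref->[0] );
} map { "case_$_" } 1 .. 12;
```

With chunk_size of 1, workers grab new parameter sets as they free up, which suits jobs like these 100-second XFOIL runs where per-item dispatch overhead is negligible.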

ashraf