I am running a Monte Carlo simulation on multiple processors, but it hangs a lot. So I put together this Perl code to kill the iteration that hangs and move on to the next one. But I get some errors that I have not figured out yet. I think it sleeps too long and deletes the out.mt0 file before it looks for it. This is the code:
my $pid = fork();
die "Could not fork\n" if not defined $pid;
if ($pid == 0) {
    print "In child\n";
    system("hspice -i mont_read.sp -o out -mt 4"); wait;
    sleep(.8); wait;
    exit(0);
}
print "In parent \n";
$i = 0;
$mont_number = $j - 1;
out: while (1) {
    $res = waitpid($pid, WNOHANG);
    if ($res == -1) {
        print "Successful Exit Process Detected\n";
        system("mv out.mt0 mont_read.mt0"); wait;
        sleep(1); wait;
        system("perl monte_stat.pl > rel_out.txt"); wait;
        system("cat stat_result.txt rel_out.txt > stat_result.tmp"); wait;
        system("mv stat_result.tmp stat_result.txt"); wait;
        print "\nSim #$mont_number complete\n"; wait;
        last out;
    }
    if ($res != -1) {
        if ($i >= $timeout) {
            $hang_count = $hang_count+1;
            system("killall hspice"); wait;
            sleep(1);
            print("time_out complete\n"); wait;
            print "\nSim #$mont_number complete\n"; wait;
            last out;
        }
        if ($i < $timeout) {
            sleep $slept; wait;
        }
        $i = $i+1;
    }
}
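
For comparison, here is a minimal sketch of the same poll-and-kill idea, not a drop-in fix. It assumes `WNOHANG` is imported from POSIX (the code above does not show that import), in which case `waitpid` returns 0 while the child is still running and the child's PID once it has exited. The helper name `run_with_timeout` and its parameters are hypothetical. It uses `exec` so that `$pid` is the simulator process itself and can be killed directly instead of via `killall`, and it avoids bare `wait` calls, since `system` already waits for its command and a bare `wait` in the parent blocks until some child exits. Note also that the built-in `sleep` only handles whole seconds, so `sleep(.8)` does not pause.

```perl
use strict;
use warnings;
use POSIX ":sys_wait_h";    # exports WNOHANG

# Hypothetical helper: run a command in a child process, poll it, and
# kill it if it runs longer than $timeout seconds.
# Returns the exit status ($?) on completion, or undef on timeout.
sub run_with_timeout {
    my ($timeout, $poll, @cmd) = @_;    # seconds, seconds, command list

    my $pid = fork();
    die "Could not fork: $!\n" unless defined $pid;

    if ($pid == 0) {
        # Child: become the command, so $pid refers to the command itself.
        exec(@cmd) or POSIX::_exit(127);
    }

    my $waited = 0;
    while (1) {
        my $res = waitpid($pid, WNOHANG);
        return $? if $res == $pid;              # child exited and was reaped
        die "waitpid failed: $!\n" if $res == -1;

        if ($waited >= $timeout) {
            kill 'KILL', $pid;                  # kill only this child
            waitpid($pid, 0);                   # reap it, leaving no zombie
            return;                             # undef signals a timeout
        }
        sleep $poll;                            # whole seconds only
        $waited += $poll;
    }
}
```

With the values from the post, a call might look like `run_with_timeout($timeout * $slept, $slept, qw(hspice -i mont_read.sp -o out -mt 4))`; an undefined return value would mean the run was killed. Whether killing that single PID is enough for a multithreaded hspice run is an assumption to verify.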
This is the error:
Illegal division by zero at monte_stat.pl line 73, line 2.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, line 1.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, line 1.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73.
mv: cannot stat `out.mt0': No such file or directory
mv: cannot stat `out.mt0': No such file or directory
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, line 3.
mv: cannot stat `out.mt0': No such file or directory
Illegal division by zero at monte_stat.pl line 73, line 1.
mv: cannot stat `out.mt0': No such file or directory
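
Both messages are consistent with the post-processing step running when no result file was produced: `mv` cannot find `out.mt0`, and `monte_stat.pl` may then be dividing by a sample count of zero (a guess, since that script is not shown). A small sketch of a guard, using a hypothetical helper named `collect_result`, that skips the statistics step when the file is missing or empty:

```perl
use strict;
use warnings;
use File::Copy qw(move);

# Hypothetical helper: move the result file into place only if it exists
# and is non-empty. Returns 1 on success, 0 if the iteration should be skipped.
sub collect_result {
    my ($src, $dst) = @_;

    unless (-s $src) {                  # -s is true only for a non-empty file
        warn "$src is missing or empty; skipping this iteration\n";
        return 0;
    }
    move($src, $dst) or die "Could not move $src to $dst: $!\n";
    return 1;
}
```

`monte_stat.pl` would then be run only when `collect_result('out.mt0', 'mont_read.mt0')` returns true.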
Could anyone give me an idea of where to look to debug it? Thanks.