This is my first question so I apologise in advance if I format/ask it all wrong.
I am using Perl to extract a string from a file, submit a web form, and download a new file created by the web page. The aim is to have it run for 30,000 files in a loop, which I estimate will take ~8 days. I am using WWW::Selenium and WWW::Mechanize to perform the web automation. The issue I have is that if a page doesn't load properly, or the internet connection drops for a period of time, the script exits with an error message like the following (depending on which stage it failed at):
Error requesting http://localhost:4444/selenium-server/driver/:
ERROR: Could not find element attribute: link=Download PDB File@href
I would like the script to continue running and move on to the next iteration of the loop, so that I don't have to worry if a single iteration throws an error. My research suggests that Try::Tiny may be the best solution. Currently I have the script below using only try{...}, which seems to suppress any error and allow the script to continue through the files. However, I'm concerned that this is a very blunt solution and gives me no insight into which files failed or why.
Ideally I would like to print the filename and error message for each occurrence to another file that I can review once the script is complete, but I am struggling to understand how to use catch{...} to do this, or whether that is even the correct solution.
use strict;
use warnings;
use WWW::Selenium;
use WWW::Mechanize;
use Try::Tiny;

my @fastas = <*.fasta>;
foreach my $file (@fastas) {
    try {
        open(my $fh, "<", $file) or die "Cannot open $file: $!";
        my $sequence;
        ## strip the ".fasta" extension to get the ID
        my $id = substr($file, 0, -6);
        while (my $line = <$fh>) {
            ## discard fasta header line
            if ($line =~ /^>/) {
                next;
            ## keep line, add to sequence string
            } else {
                $sequence .= $line;
            }
        }
        close($fh);

        my $sel = WWW::Selenium->new( host => "localhost",
                                      port => 4444,
                                      browser => "*firefox",
                                      browser_url => "http://www.myurl.com",
                                    );
        $sel->start;
        $sel->open("http://www.myurl.com");
        $sel->type("chain1", $sequence);
        $sel->type("chain2", "EVQLVESGPGLVQPGKSLRLSCVASGFTFSGYGMHWVRQAPGKGLEWIALIIYDESNKYYADSVKGRFTISRDNSKNTLYLQMSSLRAEDTAVFYCAKVKFYDPTAPNDYWGQGTLVTVSS");
        $sel->click("css=input.btn.btn-success");
        $sel->wait_for_page_to_load("30000");
        ## Wait through the holding page - will timeout after 5 mins
        $sel->wait_for_element_present("link=Download PDB File", "300000");
        ## Get the filename part of link
        $sel->wait_for_page_to_load("30000");
        my $pdbName = $sel->get_attribute("link=Download PDB File\@href");
        ## Concatenate it with the main domain
        my $link = "http://www.myurl.com/" . $pdbName;
        $sel->stop;

        my $mech = WWW::Mechanize->new( autocheck => 1 );
        $mech->get($link);
        #print $mech->content();
        $mech->save_content($id . ".pdb");
    };
}
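To make the goal concrete, here is a minimal sketch of the kind of logging I have in mind, assuming that a failing step dies with a message (as the error output above suggests). The log file name `failed_files.log`, the `process_file` helper and the example file names are all hypothetical stand-ins for the real per-file work, not part of my script. My understanding is that Try::Tiny puts the error in `$_` inside the catch block.

```perl
use strict;
use warnings;
use Try::Tiny;

# Hypothetical log file name, opened in append mode.
my $log_file = "failed_files.log";
open(my $log, ">>", $log_file) or die "Cannot open $log_file: $!";

# Hypothetical stand-in for the per-file work (read FASTA, drive Selenium,
# download the PDB). It dies on failure to simulate a failed web step.
sub process_file {
    my ($file) = @_;
    die "Could not find element attribute\n" if $file =~ /bad/;
    return 1;
}

my @failed;
foreach my $file ("good1.fasta", "bad2.fasta", "good3.fasta") {
    try {
        process_file($file);
    }
    catch {
        # Inside catch, Try::Tiny makes the error available in $_.
        my $err = $_;
        chomp $err;
        # Record the filename and the error message, tab-separated.
        print {$log} "$file\t$err\n";
        push @failed, $file;
    };    # trailing semicolon is needed: try/catch is an ordinary statement
}

close($log);
print "Failed: @failed\n";
```

The loop simply falls through to the next file after the catch block runs, so no explicit `next` is needed inside try or catch (both are anonymous subs, so loop control inside them does not behave like it does in a plain block).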