I am writing a web scraper in Perl. I am having trouble extracting what I want from the data returned by the get("url") function. I want to find a particular line with one regex, then use a second regex on that line and store its matches in an array. If someone could give me an example, that would be very helpful.
#!/usr/bin/perl
use LWP::Simple;
# qr// compiles a pattern for later use; m//g would run a match against $_ right here
$regex = qr/Prerequisite:.[A-Z]{4}[0-9]{4}/;
$regex2 = qr/[A-Z]{4}[0-9]{4}/;
$content = $ARGV[0];
#print $content;
$urlundergrad = "http://www.handbook.unsw.edu.au/undergraduate/courses/2014/$content.html";
$urlpostgrad = "http://www.handbook.unsw.edu.au/postgraduate/courses/2014/$content.html";
if ( @ARGV == 1 ) {
    $pageU = get($urlundergrad) or die "unable to retrieve";
    #$pageP = get($urlpostgrad) or die "unable to retrieve";
    foreach $line ( split( "\n", $pageU ) ) {
        if ( $line =~ $regex ) {
            push( @courses, $line );
        }
    }
    print @courses;
    print "\n";
} else {
    print "usage: prereq.pl <UNSW course>\n";
}
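For reference, here is a rough sketch of the two-step matching I am after, run on a hard-coded string instead of a fetched page. The sub name extract_prereqs and the sample text are made up for illustration, and I am only assuming the real page puts "Prerequisite:" and the course codes on the same line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# qr// stores a compiled pattern without running it.
my $line_re = qr/Prerequisite:.[A-Z]{4}[0-9]{4}/;
my $code_re = qr/[A-Z]{4}[0-9]{4}/;

# Hypothetical helper: return every course code found on lines
# that match the "Prerequisite:" pattern.
sub extract_prereqs {
    my ($page) = @_;
    my @courses;
    foreach my $line ( split /\n/, $page ) {
        next unless $line =~ $line_re;          # step 1: pick the line
        push @courses, $line =~ /$code_re/g;    # step 2: collect all codes on it
    }
    return @courses;
}

# Made-up sample text standing in for the string returned by get($url).
my $sample = "<p>Prerequisite: COMP1917 or COMP1921</p>\n"
           . "<p>Excluded: COMP1911</p>\n";

print join( " ", extract_prereqs($sample) ), "\n";   # prints "COMP1917 COMP1921"
```

In list context a match with /g returns all of its matches, so the push adds every code on the line in one go. Note that step 2 picks up any code on a matching line, including ones that appear before "Prerequisite:".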