See HTML::LinkExtor. There is no point wasting your life energy (nor ours) trying to use regular expressions for these types of tasks.
You can read the documentation for a Perl module installed on your computer by using the perldoc utility. For example, perldoc HTML::LinkExtor. Usually, module documentation begins with an example of how to use the module.
Here is a slightly more modern adaptation of one of the examples in the documentation:
    #!/usr/bin/env perl

    use v5.20;
    use warnings;
    use feature 'signatures';
    no warnings 'experimental::signatures';

    use autouse Carp => qw( croak );

    use HTML::LinkExtor qw();
    use HTTP::Tiny qw();
    use URI qw();

    run( $ARGV[0] );

    sub run ( $url ) {
        my @images;

        # The callback is invoked once per link-bearing tag; keep only <img>.
        my $parser = HTML::LinkExtor->new(
            sub ( $tag, %attr ) {
                return unless $tag eq 'img';
                push @images, { %attr };
                return;
            }
        );

        # Feed each chunk of the response body to the parser as it arrives.
        my $response = HTTP::Tiny->new->get( $url, {
                data_callback => sub { $parser->parse($_[0]) }
            }
        );

        unless ( $response->{success} ) {
            croak sprintf('%d: %s', $response->{status}, $response->{reason});
        }

        # Signal end of input so the parser flushes anything still buffered.
        $parser->eof;

        # Resolve relative src values against the final URL (after redirects).
        my $base = $response->{url};

        for my $image ( @images ) {
            say URI->new_abs( $image->{src}, $base )->as_string;
        }
    }
Output:
$ perl t.pl https://www.perl.com/
https://www.perl.com/images/site/perl-onion_20.png
https://www.perl.com/images/site/twitter_20.png
https://www.perl.com/images/site/rss_20.png
https://www.perl.com/images/site/github_light_20.png
https://www.perl.com/images/site/perl-camel.png
https://www.perl.com/images/site/perl-onion_20.png
https://www.perl.com/images/site/twitter_20.png
https://www.perl.com/images/site/rss_20.png
https://www.perl.com/images/site/github_light_20.png
https://i.creativecommons.org/l/by-nc/3.0/88x31.png
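
If you want to try the parser without making a network request, here is a minimal sketch that runs it over an HTML string. The markup and the base URL (www.example.com) are made up for illustration. It relies on two documented features of HTML::LinkExtor: when no callback is given, the links are collected and returned by the links method, and when a base URL is passed to the constructor, every link comes back as an absolute URI object.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use HTML::LinkExtor ();

# Hypothetical sample markup and base URL, used only for illustration.
my $html = <<'HTML';
<html><body>
<a href="/about.html">About</a>
<img src="images/logo.png" alt="logo">
<img src="https://cdn.example.com/banner.jpg">
</body></html>
HTML
my $base = 'https://www.example.com/blog/';

# No callback (undef): the parser accumulates the links internally.
# With a base URL, each link is returned as an absolute URI object.
my $parser = HTML::LinkExtor->new( undef, $base );
$parser->parse($html);
$parser->eof;

my @image_urls;
for my $link ( $parser->links ) {
    # Each entry is an array ref: [ $tag, attr_name => url, ... ]
    my ( $tag, %attr ) = @$link;
    next unless $tag eq 'img' and defined $attr{src};
    push @image_urls, "$attr{src}";    # stringify the URI object
}

print "$_\n" for @image_urls;
```

The defined check skips any img element that carries other link attributes but no src. Letting the parser absolutize the links is convenient when the base URL is known up front; when fetching a page, the final URL is only known after redirects, which is probably why the script above resolves the links afterwards with URI->new_abs.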