Is it possible to write a script in Perl that opens different URLs and saves a screenshot of each of them?

Peter Mortensen
fixxxer

5 Answers

You could use WWW::Mechanize::Firefox to control a Firefox instance and dump the rendered page with $mech->content_as_png.

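Be aware that setting it up can be quite a challenge, though: the module drives Firefox through the MozRepl add-on, which must be installed and running in the browser.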

If everything works as expected, a script like the following dumps an image of the desired website. Start Firefox and resize it to the desired width manually first; the height doesn't matter, because WWW::Mechanize::Firefox always dumps the whole page.

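```perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;
use Path::Class qw/file/;

my $mech = WWW::Mechanize::Firefox->new(
  bufsize => 10_000_000, # PNGs might become huge
);
$mech->get('http://www.stackoverflow.com/');

# Open in raw (binary) mode so the PNG data is written unchanged
my $fh = file( 'test.png' )->open( '> :raw' )
  or die "Cannot open test.png: $!";
print $fh $mech->content_as_png();
close $fh;
```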
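The script above handles a single URL, while the question asks about several. A minimal sketch of a loop follows, assuming the same `get` and `content_as_png` calls shown above. The helpers `url_to_filename` and `save_screenshots` are hypothetical names invented for this example, and the file-naming scheme is just one possible choice.

```perl
use strict;
use warnings;

# Hypothetical helper: derive a file name from a URL, e.g.
# "http://www.stackoverflow.com/" becomes "www.stackoverflow.com.png".
sub url_to_filename {
    my ($url) = @_;
    (my $name = $url) =~ s{^[a-z][a-z0-9+.-]*://}{}i;  # drop the scheme
    $name =~ s/[^A-Za-z0-9.-]+/_/g;                     # replace unsafe characters
    $name =~ s/^_+//;                                   # trim leading underscores
    $name =~ s/_+$//;                                   # trim trailing underscores
    return "$name.png";
}

# Hypothetical helper: save one PNG per URL in the current directory.
# $mech is assumed to be a WWW::Mechanize::Firefox object; any object
# with get() and content_as_png() methods would do.
sub save_screenshots {
    my ($mech, @urls) = @_;
    my @saved;
    for my $url (@urls) {
        $mech->get($url);
        my $file = url_to_filename($url);
        open my $fh, '>:raw', $file or die "Cannot write $file: $!";
        print {$fh} $mech->content_as_png();
        close $fh or die "Cannot close $file: $!";
        push @saved, $file;
    }
    return @saved;
}

# URLs come from the command line; the browser module is loaded only
# when there is something to do.
if (@ARGV) {
    require WWW::Mechanize::Firefox;
    my $mech = WWW::Mechanize::Firefox->new( bufsize => 10_000_000 );
    print "$_\n" for save_screenshots( $mech, @ARGV );
}
```

Saved as, say, `screenshots.pl`, this could be run as `perl screenshots.pl http://www.stackoverflow.com/ http://www.perl.org/`. Two URLs that differ only in punctuation would map to the same file name, so a real script might add a counter or a hash of the URL.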
Connor Gurney
willert
  • I wrote about this module for PerlTricks: [Controlling Firefox from Perl](http://perltricks.com/article/138/2014/12/8/Controlling-Firefox-from-Perl) – brian d foy Oct 27 '15 at 18:07

Use the WWW::Selenium module, for which you'll need to have a Selenium Remote Control session up and running.

The capture_entire_page_screenshot() method does what you need.

From WWW::Selenium on CPAN:

$sel->capture_entire_page_screenshot($filename, $kwargs)

Saves the entire contents of the current window canvas to a PNG file...


A typical script:

```perl
use strict;
use warnings;
use WWW::Selenium;

my $sel = WWW::Selenium->new( host => "localhost", 
                              port => 4444, 
                              browser => "*iexplore", 
                              browser_url => "http://www.google.com",
                            );

$sel->start;
$sel->open("http://www.google.com");
$sel->capture_entire_page_screenshot("screenshot.png");
$sel->close;
$sel->stop;   # end the Selenium session
```
Zaid

Another approach, which doesn't require a browser, is to use html2ps and ImageMagick: html2ps renders the HTML to PostScript, and ImageMagick converts that to an image. Be warned, however, that this isn't trivial, and it was nearly impossible (last I tried) to get it working properly on Windows.

Once ImageMagick is installed, the simplest approach is to shell out to the convert program that ImageMagick installs. If you want a less hackish approach, you can use PerlMagick, the Perl API for ImageMagick.
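As a rough sketch of the system-call route, the following assumes that `html2ps` and ImageMagick's `convert` are both on the PATH, that `html2ps` accepts a URL and an `-o` output option, and that ImageMagick can read PostScript (which normally requires Ghostscript). The helper names `screenshot_commands` and `run_commands` are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: build both commands as argument lists, which
# avoids shell quoting problems with URLs containing "&" or "?".
sub screenshot_commands {
    my ($url, $png) = @_;
    my $ps = "$png.ps";                    # intermediate PostScript file
    return (
        [ 'html2ps', '-o', $ps, $url ],    # HTML -> PostScript
        [ 'convert', $ps, $png ],          # PostScript -> PNG
    );
}

# Hypothetical helper: run each command in turn and stop at the first failure.
sub run_commands {
    for my $cmd (@_) {
        system(@$cmd) == 0
            or die "Command '@$cmd' failed (status $?)\n";
    }
    return;
}

# Expected usage: perl html2png.pl URL OUTPUT.png
if (@ARGV == 2) {
    run_commands( screenshot_commands(@ARGV) );
}
```

Because PostScript is paginated, a long page may come out as several numbered PNG files rather than one tall image, and the rendering will not match what a real browser shows.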

There is an excellent discussion of this approach on PerlMonks.

Weegee

You could also use Win32::IE::Mechanize to render the web page in IE, and then Win32::Screenshot to capture it. You'll probably have to do a bit of work to figure out which region of the screen to capture, but that shouldn't be too hard.

This is a Windows-only solution, of course, but it may suffice.

Robert P
  • Looks like Win32::IE::Mechanize is no longer working with Activestate and/or Windows 7: https://rt.cpan.org/Public/Dist/Display.html?Name=Win32-IE-Mechanize but this looks like it still works http://search.cpan.org/dist/Win32-IEAutomation-0.5/lib/Win32/IEAutomation.pm – Matthew Lock Aug 01 '12 at 03:04

Use a third-party web service API like http://webshotspro.com/ (screenshots) or http://www.thumbalizr.com/ (thumbnails).

Anon Gordon