I normally use wget to download an image or two from a web page. From the command prompt I run something like: wget 'webpage-url' -P 'directory where I want to save it'
How do I automate this in Perl and Python? That is, which command lets me run wget as if I had typed it at the command prompt? Python has so many similar-looking modules (subprocess, os, etc.) that I am quite confused.
Viewed 650 times
1
3 Answers
8
In Perl, the easiest way is to use LWP::Simple:
use LWP::Simple qw(getstore);
# The URL needs its scheme (http://); getstore returns the HTTP status code.
getstore('http://www.example.com', '/path/to/saved/file.ext');

s0me0ne
4
import subprocess
subprocess.call(["wget", "www.example.com", "-P", "/dir/to/save"])
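A slightly more defensive variant of the call above, as a sketch only. It assumes Python 3 and that a `wget` executable is on the PATH; the helper names `build_wget_command` and `download` are made up for illustration.

```python
import subprocess


def build_wget_command(url, directory):
    """Return the argument list for: wget URL -P DIRECTORY."""
    # Passing a list (rather than one string) avoids shell quoting problems.
    return ["wget", url, "-P", directory]


def download(url, directory):
    """Run wget and return True if it exited with status 0."""
    # Assumption: wget is installed and reachable through the PATH.
    result = subprocess.run(build_wget_command(url, directory))
    return result.returncode == 0
```

Calling `download("http://www.example.com", "/dir/to/save")` would then run the same command as above and report whether wget succeeded.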
If you want to read the URL and process the response yourself:
import urllib2  # Python 2 only; in Python 3 use urllib.request instead
response = urllib2.urlopen('http://example.com/')
html = response.read()
How to extract images from the HTML is covered in other answers here on SO.
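For the image-extraction step, one possible approach (not necessarily the one the linked answer describes) is the standard library's `html.parser`. The sketch below assumes Python 3; `ImageCollector` and `find_images` are hypothetical names, and `base_url` is the address of the page the HTML came from.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                # Resolve relative paths against the page URL.
                self.images.append(urljoin(self.base_url, src))


def find_images(html, base_url):
    """Return absolute URLs of all images referenced in the HTML text."""
    parser = ImageCollector(base_url)
    parser.feed(html)
    return parser.images
```

Each URL returned by `find_images` can then be handed to wget or fetched directly.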
-
Thanks! However this process pops up the command prompt window for some time after which it disappears on its own. How to prevent that? There is a definite method, I can't remember! – SexyBeast Aug 04 '12 at 10:54
-
I've got little experience on Windows. Maybe adding `shell=True` to the call helps: `subprocess.call([....], shell=True)` – tzelleke Aug 04 '12 at 11:02
-
Yeah it worked perfectly, although the method I previously used is different. Can you help me out with Perl too? – SexyBeast Aug 04 '12 at 11:07
-
Sorry ... I probably can't help you ... No idea about Perl. My font size is changed and this made the "out with Perl" hide behind the ad to the right. I read "Can you help me [linebreak] too?" – tzelleke Aug 04 '12 at 11:18
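On the console-window question raised in the comments above: besides `shell=True`, another option on Windows is the `subprocess.CREATE_NO_WINDOW` creation flag. This is a different technique from the one suggested in the comment, it needs Python 3.7 or later, and the flag exists only on Windows. The helper names below are hypothetical.

```python
import subprocess


def no_window_kwargs():
    """Extra subprocess arguments that suppress the console window on Windows."""
    # CREATE_NO_WINDOW is defined only on Windows; elsewhere pass no flag.
    flags = getattr(subprocess, "CREATE_NO_WINDOW", 0)
    return {"creationflags": flags} if flags else {}


def run_quietly(command):
    """Run command (a list of arguments) and return its exit status."""
    return subprocess.call(command, **no_window_kwargs())
```

For example, `run_quietly(["wget", "www.example.com", "-P", "/dir/to/save"])` behaves like the original call on other platforms.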
2
In Perl you can also use qx(yourcommandhere), which runs an external program and returns its output.
So, in your example: qx(wget 'webpage-url' -P '/home/myWebPages/') is all you need.
But, as s0me0ne said, using LWP::Simple is better.
If you have a list of URLs in a file, you can use this code:
open my $fh, '<', 'fileWithUrls.txt' or die "can't open file with urls: $!";
chomp(my @urls = <$fh>); # read all URLs, one per line, dropping the newlines
close $fh;
my $wget = '/path/to/wget.exe';
for my $url (@urls) {
    # -P sets the download directory, as in the command above
    qx($wget $url -P '/home/myWebPages/');
}
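Since the question also asks about Python, the same loop might look roughly like this. It is a sketch that assumes Python 3 and a file with one URL per line; `read_urls` and `download_all` are illustrative names, and the `wget` parameter defaults to whatever `wget` is on the PATH.

```python
import subprocess


def read_urls(path):
    """Return the non-empty lines of the file at path, stripped of whitespace."""
    with open(path) as fh:
        return [line.strip() for line in fh if line.strip()]


def download_all(path, directory, wget="wget"):
    """Call wget once per URL; return the list of URLs that failed."""
    failed = []
    for url in read_urls(path):
        # A non-zero exit status from wget is treated as a failed download.
        if subprocess.call([wget, url, "-P", directory]) != 0:
            failed.append(url)
    return failed
```

Stripping each line plays the same role as chomp in the Perl version: a trailing newline would otherwise end up inside the command.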

gaussblurinc