
I normally use wget to download an image or two from some web page. I do something like this from the command prompt: wget 'webpage-url' -P 'directory where I want to save it'. Now how do I automate this in Perl and Python? That is, what command lets me simulate entering that command at the command prompt? In Python there are so many similar-looking modules (subprocess, os, etc.) that I am quite confused.

Tadeck
SexyBeast

3 Answers

8

In Perl, the easiest way is to use LWP::Simple:

use LWP::Simple qw(getstore);
# getstore() fetches the URL, writes the response body to the file,
# and returns the HTTP status code (200 on success)
getstore('http://www.example.com/', '/path/to/saved/file.ext');
s0me0ne
4
To call wget from Python, use the subprocess module:

import subprocess
# equivalent of running: wget www.example.com -P /dir/to/save
subprocess.call(["wget", "www.example.com", "-P", "/dir/to/save"])
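subprocess.call returns wget's exit status, so you can check whether the download actually succeeded. A minimal sketch (the URL and directory are placeholders):

import subprocess

# wget exits with 0 on success, non-zero on failure
ret = subprocess.call(["wget", "http://www.example.com/image.jpg", "-P", "/dir/to/save"])
if ret != 0:
    print("wget failed with exit status %d" % ret)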

If you want to fetch the URL and process the response yourself (without wget):

import urllib2  # Python 2; in Python 3 use urllib.request

response = urllib2.urlopen('http://example.com/')
html = response.read()
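If the goal is simply to save an image to disk rather than parse the page, the same response object can be written straight to a file. A minimal sketch, with a placeholder URL and path:

import urllib2

response = urllib2.urlopen('http://example.com/picture.jpg')  # placeholder image URL
with open('/dir/to/save/picture.jpg', 'wb') as f:             # placeholder target path
    f.write(response.read())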

How to extract images from the HTML is covered here on SO.
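As a rough sketch of that idea, the standard-library HTMLParser can collect the src attribute of every <img> tag (class and variable names below are just for illustration; in Python 3 the module is html.parser):

import urllib2
from HTMLParser import HTMLParser  # Python 2; use html.parser in Python 3

class ImgCollector(HTMLParser):
    # gathers the src attribute of every <img> tag it sees
    def __init__(self):
        HTMLParser.__init__(self)
        self.srcs = []

    def handle_starttag(self, tag, attrs):
        if tag == 'img':
            for name, value in attrs:
                if name == 'src':
                    self.srcs.append(value)

html = urllib2.urlopen('http://example.com/').read()
parser = ImgCollector()
parser.feed(html)
print(parser.srcs)  # image URLs, possibly relative to the page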

tzelleke
  • Thanks! However, this process pops up the command prompt window for some time, after which it disappears on its own. How do I prevent that? There is a definite method, I just can't remember it! – SexyBeast Aug 04 '12 at 10:54
  • I've got little experience on Windows – maybe adding `, shell=True` to the call helps -> `subprocess.call([....], shell=True)` – tzelleke Aug 04 '12 at 11:02
  • Yeah it worked perfectly, although the method I previously used is different. Can you help me out with Perl too? – SexyBeast Aug 04 '12 at 11:07
  • Sorry ... I probably can't help you there; I have no idea about Perl. My font size changed, which hid "out with Perl" behind the ad on the right, so I read "Can you help me too?" – tzelleke Aug 04 '12 at 11:18
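Regarding the console window mentioned in the comments above: one common way to hide it on Windows is to pass a STARTUPINFO object to subprocess. A minimal sketch, assuming Windows (the URL and directory are placeholders):

import subprocess

# STARTUPINFO / STARTF_USESHOWWINDOW / SW_HIDE are only available on Windows
startupinfo = subprocess.STARTUPINFO()
startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
startupinfo.wShowWindow = subprocess.SW_HIDE

subprocess.call(["wget", "http://www.example.com", "-P", "/dir/to/save"],
                startupinfo=startupinfo)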
2

In Perl you can also use qx(yourcommandhere), which calls an external program.

So, in your example, qx(wget 'webpage-url' -P '/home/myWebPages/') is enough for you.

But, as s0me0ne said, using LWP::Simple is better.

If you have a list of URLs in a file, you can use this code:

my $fh; # filehandle
open $fh, "<", "fileWithUrls.txt" or die "can't find file with urls!";

my @urls = <$fh>; # read all URLs, one per line of the file
chomp @urls;      # strip the trailing newlines

my $wget = '/path/to/wget.exe';

for my $url (@urls) {
    qx($wget $url -P '/home/myWebPages/');
}
gaussblurinc