Wget complete webpage from list of urls

Question

I'm looking for tips on adapting my single-URL wget script to work from a list of URLs in a text file instead. I'm not sure how to script it, though: in a loop, or by enumerating the list somehow? Here's the command I use to gather everything from a single page:

wget \
    --recursive \
    --no-clobber \
    --page-requisites \
    --html-extension \
    --convert-links \
    --restrict-file-names=windows \
    --domains example.com \
    --no-parent \
    http://www.example.com/folder1/folder/

It works remarkably well; I'm just lost on how to use a list.txt with URLs listed like this:

http://www.example.com/folder1/folder/
http://www.example.com/sports1/events/
http://www.example.com/milfs21/delete/
...

I would imagine it's fairly simple, but then again one never knows, thanks.

score 2 · Accepted Answer · answered Jul 05 '14 at 19:58


According to the wget manual (man wget):

   -i file
   --input-file=file
       Read URLs from a local or external file.  If - is specified as
       file, URLs are read from the standard input.  (Use ./- to read from
       a file literally named -.)

Another way is to read the URLs from the file into an array and loop over it (readarray requires Bash 4 or later):

readarray -t LIST < list.txt

for URL in "${LIST[@]}"; do
    wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --domains example.com \
        --no-parent \
        "$URL"
done

A while read loop would work similarly.
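
A rough sketch of that while read variant, for shells that lack readarray, might look like the following. The function name mirror_each and the variable url are hypothetical names chosen here; the wget flags are the ones from the question.

```shell
# Hypothetical helper: run the question's wget command once per line of $1.
# Uses only POSIX shell features, so it does not depend on readarray.
mirror_each() {
    # IFS= and -r keep each line verbatim; the || test also handles a
    # final line that lacks a trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip empty lines.
        [ -n "$url" ] || continue
        wget \
            --recursive \
            --no-clobber \
            --page-requisites \
            --html-extension \
            --convert-links \
            --restrict-file-names=windows \
            --domains example.com \
            --no-parent \
            "$url"
    done < "$1"
}

# Usage (not run here):
#   mirror_each list.txt
```

Unlike -i, this starts a separate wget process for each URL, which makes it easier to add per-URL handling (logging, a pause between sites) inside the loop.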

answered Jul 05 '14 at 19:58

konsolebox


wow, that was easy (for you). i'm on osx so i had to use a different way of looping since `readarray` is absent. i used something a bit unconventional perhaps [(similar to this answer)](http://stackoverflow.com/a/23843421/3257552), although it works, so thank you for setting me on the right path :) also the built-in option `-i file` works fine too, eliminating the need for the loop, but it's great to know about both. **one additional question**: how would i specify where the data is saved? – ctfd Jul 05 '14 at 21:49

@ctfd `-P` probably would help: `-P, --directory-prefix=PREFIX save files to PREFIX/...`. And welcome :) – konsolebox Jul 06 '14 at 06:06
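
For illustration only, -P can sit alongside -i in the same command. The wrapper name save_list_to and the ./mirror directory in the usage line are assumptions made up for this sketch; the other flags from the question would be added in the same way.

```shell
# Hypothetical wrapper: $1 is the URL list, $2 is the directory prefix
# under which wget should save what it downloads.
save_list_to() {
    wget --directory-prefix="$2" --input-file="$1"
}

# Usage (not run here):
#   save_list_to list.txt ./mirror
```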
