
I'm trying to back up a podcast website with wget, since the author is leaving. It isn't copying the mp3 files down to my hard drive. Any idea of the proper syntax I need to use? I've tried the options below and they don't copy all the files:

wget --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://www.voiceamerica.com/rss/show/2063

wget -A pdf,jpg -m -p -E -k -K -np http://www.voiceamerica.com/rss/show/2063

wget -A pdf,jpg,mp3 -m -p -E -k -K -np http://www.voiceamerica.com/rss/show/2063
Rick T

1 Answer


Is it acceptable to download the mp3s separately? You could run

wget http://www.voiceamerica.com/rss/show/2063 -O - 2> /dev/null | egrep "http://cdn.voiceamerica.com/7thwave/011136/waldrop\w+\.mp3" -o | sort | uniq | awk '{system("wget "$1)}'
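If the hard-coded waldrop pattern misses any episodes, a more general variant (a rough sketch, untested against this feed; it assumes every episode shows up in the RSS as a plain http URL ending in .mp3, with no query string after the extension) is to pull every .mp3 link out of the feed and hand the list to wget:

wget -qO- http://www.voiceamerica.com/rss/show/2063 | grep -oE 'http[^"< ]+\.mp3' | sort -u | wget -nc -i -

Here -qO- prints the feed to stdout, grep -oE extracts just the mp3 URLs, sort -u removes duplicates, and wget -nc -i - reads the resulting list from standard input, skipping any files that already exist.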
RogueBaneling
  • I'm looking to back up the site locally. If the files are stored separately, that takes away the description of each interview and we wouldn't know what's on the mp3 files. – Rick T Feb 17 '15 at 16:24
  • Hmm, okay. I'm not sure of the best way to do this, but just so you don't waste time trying different wget options, it seems from this that wget can't recursively follow links inside XML files: http://stackoverflow.com/questions/17334117/crawl-links-of-sitemap-xml-through-wget-command (a sketch of one workaround is below) – RogueBaneling Feb 17 '15 at 16:37
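One way to keep the episode descriptions alongside the audio, given that wget won't crawl the feed itself, is to save the feed XML (which carries each episode's title and description) and then download the mp3s it references into the same directory. This is a sketch only; the show-2063.xml filename is just an illustrative choice, and the grep pattern makes the same plain-URL assumption as above:

wget -O show-2063.xml http://www.voiceamerica.com/rss/show/2063
grep -oE 'http[^"< ]+\.mp3' show-2063.xml | sort -u | wget -nc -i -

Any HTML pages mirrored with the commands from the question can sit next to this; keeping the feed XML just guarantees the descriptions survive even if the site disappears.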