
I'm trying to back up a podcast website with wget, since the author is leaving. It isn't copying the mp3 files down to my hard drive. Any idea of the proper syntax I need to use? I've tried the options below and they don't copy all the files:

wget --no-clobber --convert-links --random-wait -r -p -E -e robots=off -U mozilla http://www.voiceamerica.com/rss/show/2063

wget -A pdf,jpg -m -p -E -k -K -np http://www.voiceamerica.com/rss/show/2063

wget -A pdf,jpg,mp3 -m -p -E -k -K -np http://www.voiceamerica.com/rss/show/2063
Rick T

1 Answer


Is it acceptable to download the mp3s separately? You could run

wget http://www.voiceamerica.com/rss/show/2063 -O - 2> /dev/null | egrep "http://cdn.voiceamerica.com/7thwave/011136/waldrop\w+\.mp3" -o | sort | uniq | awk '{system("wget "$1)}'
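If the hard-coded waldrop pattern misses any episodes, a more general variant (a rough sketch, untested against this feed; it assumes every episode shows up in the RSS as a plain http URL ending in .mp3, with no query string after the extension) is to pull every .mp3 link out of the feed and hand the list to wget:

wget -qO- http://www.voiceamerica.com/rss/show/2063 | grep -oE 'http[^"< ]+\.mp3' | sort -u | wget -nc -i -

Here -qO- prints the feed to stdout, grep -oE extracts just the mp3 URLs, sort -u removes duplicates, and wget -nc -i - reads the resulting list from standard input, skipping any files that already exist.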
RogueBaneling
  • I'm looking to back up the site locally. If the files are stored separately, that takes away the description of each interview and we wouldn't know what's on the mp3 files. – Rick T Feb 17 '15 at 16:24
  • Hmm, okay. I'm not sure of the best way to do this, but just so you don't waste time trying different wget options, it seems from this that wget can't recursively follow links inside XML files: http://stackoverflow.com/questions/17334117/crawl-links-of-sitemap-xml-through-wget-command (a sketch of one workaround is below) – RogueBaneling Feb 17 '15 at 16:37
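One way to keep the episode descriptions alongside the audio, given that wget won't crawl the feed itself, is to save the feed XML (which carries each episode's title and description) and then download the mp3s it references into the same directory. This is a sketch only; the show-2063.xml filename is just an illustrative choice, and the grep pattern makes the same plain-URL assumption as above:

wget -O show-2063.xml http://www.voiceamerica.com/rss/show/2063
grep -oE 'http[^"< ]+\.mp3' show-2063.xml | sort -u | wget -nc -i -

Any HTML pages mirrored with the commands from the question can sit next to this; keeping the feed XML just guarantees the descriptions survive even if the site disappears.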