
I've been attempting to use HTTrack to mirror a single page (downloading the HTML plus its prerequisites: style sheets, images, etc.), similar to the question [mirror single page with httrack][1]. However, the accepted answer there doesn't work for me, as I'm using Windows (where `wget` "exists" but is actually an alias for `Invoke-WebRequest` and doesn't function the same way at all).

HTTrack really wants to either (a) download the entire website I point it at, or (b) download only the page itself, leaving all the images still living on the web. Is there a way to make HTTrack download just enough to view a single page properly offline - the equivalent of `wget -p`?

Empiromancer
  • I had trouble with HTTrack wandering off course all over the internet and trying to download it all. The author of the program complains that this happens because the website being spidered is not RFC compliant. But honestly, it shouldn't be hard to program it to stay on the requested host, I would think. – slashdottir Mar 02 '18 at 22:46

3 Answers


This is an old post, so you might have figured it out by now. I just came across your post while looking for another answer about using Python and HTTrack. I was having the same issue you were having, and after I passed the argument `-r2` it downloaded the images.

My arguments basically look like this:

```python
cmd = [httrack, myURL, '-%v', '-r2', '-F', "Mozilla/5.0 (Windows NT 6.1; Win64; x64)", '-O', saveLocation]
```
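For anyone scripting this, a minimal runnable sketch of that call with `subprocess` might look like the following; the httrack path, URL, and save location are placeholders to substitute, and the flags are simply the ones from the list above (`-%v` shows downloaded filenames on screen, `-r2` sets the mirror depth to 2, `-F` sets the user-agent string, `-O` sets the output path):

```python
import subprocess

# Placeholder values; adjust for your own setup.
httrack = r"C:\Program Files\WinHTTrack\httrack.exe"
myURL = "https://example.com/page.html"
saveLocation = r"C:\mirrors\example"

cmd = [
    httrack, myURL,
    '-%v',                # show downloaded filenames on screen
    '-r2',                # mirror depth 2 (per this answer, enough to fetch the images)
    '-F', "Mozilla/5.0 (Windows NT 6.1; Win64; x64)",  # user-agent string
    '-O', saveLocation,   # output/mirror directory
]
subprocess.run(cmd, check=True)
```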

user41814

This answer worked for me.

It downloaded a single page's HTML with all its prerequisites. Just give the exact link of the page to be downloaded and, as given in the answer above, use the GUI: in "-Mirroring Mode-", go to "Set Options" -> "Limits" -> "Maximum External Depth = 0".
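If you are driving HTTrack from a script rather than the GUI, the "Maximum External Depth" limit appears to correspond to HTTrack's `%e` (`--ext-depth`) option; here is a rough sketch in the same Python style as the answer above, with placeholder URL and paths (the GUI-to-flag mapping is my reading of HTTrack's option list, not something stated in this answer):

```python
import subprocess

# Placeholder values; adjust for your own setup.
myURL = "https://example.com/page.html"
saveLocation = r"C:\mirrors\example"

cmd = [
    "httrack", myURL,
    '-%e0',               # assumed equivalent of the GUI's "Maximum External Depth = 0"
    '-O', saveLocation,   # output/mirror directory
]
subprocess.run(cmd, check=True)
```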

Soul

Saving the page with your browser should download the page and all its prerequisites.

Ari Fordsham