Download a website for offline browsing in Python

Question

I'm working on a project to download a website, two layers deep, for offline browsing.

However, I'm having trouble with the CSS, JS, and image files.

Right now my code saves the index HTML file and changes all the links to absolute URLs to avoid the href problem.

But that doesn't work for offline browsing.

My question is: how can I write a script that downloads only two layers of the website and stores all the CSS, JS, and images for full offline browsing?

PS. I know I can just use requests and write the files locally, but how do I put them in the correct folders? E.g. /far/boo/image.png or /far/boo/css.css

There are already many add-ons for that, like Page Archiver and ScrapBook — RITESH ARORA, Apr 08 '17 at 10:27
Do you need to build your own version, or would a Python library that does it for you be enough? I'm talking about `wget` — Andrew Che, Apr 08 '17 at 11:33
@RITESHARORA that's not what I'm looking for, but thanks. @AndrewCherevatkin I looked at `wget`, but it's not suitable for my use :( — th3burger91, Apr 08 '17 at 15:12

score 0 · Answer 1 · edited May 23 '17 at 11:46

Thanks to the comments above, which pointed me in the right direction to find my answer.

I ended up using `requests.get("http://somesites.com/far.boo", stream=True, headers=head)` inside a loop to do the job.

Define `head` first:

head = {"User-Agent": "Mozilla/5.0 ..."}

I found my User-Agent string at https://httpbin.org/headers

It's a bit ugly, but it works correctly.
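
The loop and the folder question from the PS could be sketched roughly as follows. This is an illustration under assumptions, not the code used above: it assumes the local layout simply mirrors the URL path (so /far/boo/image.png ends up under mirror/&lt;host&gt;/far/boo/image.png), that the start page counts as layer 1, and that only pages on the same host are followed. The names `local_path_for`, `LinkCollector`, `mirror`, and the `fetch` callable are hypothetical; `fetch` would typically wrap `requests.get` with the `head` headers.

```python
from collections import deque
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urldefrag, urljoin, urlsplit


def local_path_for(url, root="mirror"):
    """Map a URL to a file path under `root` that mirrors the URL path."""
    parts = urlsplit(url)
    path = parts.path
    if not path or path.endswith("/"):
        path += "index.html"  # directory-style URLs need a file name
    # Drop empty, "." and ".." segments so nothing escapes `root`.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    return Path(root, parts.netloc, *segments)


class LinkCollector(HTMLParser):
    """Collect page links (<a href>) and asset links (link/script/img)."""

    WANTED = {"a": "href", "link": "href", "script": "src", "img": "src"}

    def __init__(self):
        super().__init__()
        self.pages, self.assets = [], []

    def handle_starttag(self, tag, attrs):
        attr = self.WANTED.get(tag)
        value = dict(attrs).get(attr) if attr else None
        if value:
            (self.pages if tag == "a" else self.assets).append(value)


def mirror(start_url, fetch, root="mirror", max_depth=2):
    """Save pages up to `max_depth` layers deep, plus the assets they use.

    `fetch(url)` is an assumed helper that returns the body as bytes.
    Returns the list of files written.
    """
    host = urlsplit(start_url).netloc
    queue = deque([(start_url, 1, True)])  # (url, layer, is_page)
    seen = {start_url}
    saved = []
    while queue:
        url, layer, is_page = queue.popleft()
        body = fetch(url)
        target = local_path_for(url, root)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(body)
        saved.append(target)
        if not is_page:
            continue  # assets are stored but not parsed
        parser = LinkCollector()
        parser.feed(body.decode("utf-8", errors="replace"))
        candidates = [(link, False) for link in parser.assets]
        if layer < max_depth:  # pages on the last layer are not followed
            candidates += [(link, True) for link in parser.pages]
        for link, page in candidates:
            absolute = urldefrag(urljoin(url, link)).url
            parts = urlsplit(absolute)
            if parts.scheme not in ("http", "https") or absolute in seen:
                continue
            if page and parts.netloc != host:
                continue  # stay on the starting host for pages
            seen.add(absolute)
            queue.append((absolute, layer + 1 if page else layer, page))
    return saved


# Possible wiring with requests (needs network access):
#   import requests
#   head = {"User-Agent": "Mozilla/5.0 ..."}
#   mirror("http://somesites.com/",
#          lambda u: requests.get(u, headers=head, timeout=10).content)
```

With this layout, relative links in the saved HTML resolve against the mirrored folders; absolute or root-relative links would still need rewriting before the copy is fully browsable offline. The sketch also does not handle a URL such as /about being saved as a file and later needed as a folder for /about/team. For large files, the `fetch` helper could use `stream=True` with `iter_content` instead of `.content`.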

Reference: download image from url using python urllib but receiving HTTP Error 403: Forbidden
