
I'm working on a project to download a website with 2 layers for offline browsing.

However, I'm having trouble with the CSS, JS, and image files.

Right now my code saves the index HTML file and rewrites all the links as absolute URLs to avoid the href problem.

But the result doesn't work for offline browsing.

My question is: how can I write a script that downloads only 2 layers of the website and stores all the CSS, JS, and images for full offline browsing?

PS. I know I can just use requests and write the files locally, but how do I put each one in the correct folder, e.g. /far/boo/image.png or /far/boo/css.css?
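One way to approach the folder question is to derive the local path from the URL's own path, so /far/boo/image.png on the server lands in far/boo/image.png under a local root. Below is a minimal sketch using only the standard library. The names `local_path_for`, `relative_link`, and the `mirror` root folder are made up for illustration. It also shows how a link could be rewritten as a relative path between two saved files, since absolute URLs still point at the live site and so won't resolve offline.

```python
import os
import posixpath
from urllib.parse import urlparse, unquote


def local_path_for(url, root="mirror"):
    """Map a URL to a file path under `root` that mirrors the URL's folders.

    `root` is a hypothetical output folder name, not something from the question.
    """
    parsed = urlparse(url)
    path = unquote(parsed.path)
    # Directory-style URLs ("" or ending in "/") need a file name to save into.
    if not path or path.endswith("/"):
        path += "index.html"
    # Drop empty, "." and ".." segments so a path can never escape `root`.
    parts = [p for p in path.split("/") if p not in ("", ".", "..")]
    return os.path.join(root, parsed.netloc, *parts)


def relative_link(from_file, to_file):
    """Return the link to put in `from_file` so it points at `to_file` offline."""
    start = os.path.dirname(from_file) or "."
    rel = os.path.relpath(to_file, start=start)
    # HTML links always use forward slashes, even on Windows.
    return rel.replace(os.sep, "/")


if __name__ == "__main__":
    page = local_path_for("http://somesites.com/")
    image = local_path_for("http://somesites.com/far/boo/image.png")
    print(page, image, relative_link(page, image))
```

Before writing a file, `os.makedirs(os.path.dirname(path), exist_ok=True)` creates the missing folders. Query strings are ignored here, which is a simplification: two URLs that differ only in their query would map to the same file.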

1 Answer


Thanks for the comment above; it pointed me in the right direction to find my answer.

I ended up using requests.get("http://somesites.com/far.boo", stream=True, headers=head) inside a loop to do the job.

Define head first:

head = {"User-Agent": "Mozilla/5.0 ..."}

I found my User-Agent string at https://httpbin.org/headers
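For anyone wanting to see the streaming call spelled out, here is a rough sketch of how one file could be saved with it. The helper names `save_stream` and `download`, the chunk size, and the timeout are my own choices for illustration, and the User-Agent value is left elided as above; paste in your own.

```python
import os

import requests

# Placeholder copied from the answer; replace "..." with a real User-Agent string.
HEAD = {"User-Agent": "Mozilla/5.0 ..."}


def save_stream(response, dest_path, chunk_size=8192):
    """Write a streamed response body to dest_path, creating folders as needed."""
    os.makedirs(os.path.dirname(dest_path) or ".", exist_ok=True)
    with open(dest_path, "wb") as fh:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty keep-alive chunks
                fh.write(chunk)
    return dest_path


def download(url, dest_path, headers=HEAD):
    """Fetch url with a browser-like User-Agent and stream it to disk."""
    with requests.get(url, stream=True, headers=headers, timeout=30) as response:
        response.raise_for_status()  # surfaces errors such as 403 Forbidden
        return save_stream(response, dest_path)
```

Usage would look like `download("http://somesites.com/far/boo/image.png", "far/boo/image.png")`. Writing in binary mode means the same function works for images, CSS, and JS alike.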

It's a bit ugly, but it works correctly.
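As for the loop that limits the download to 2 layers, a breadth-first crawl with a depth counter is one possible shape for it. This is only a sketch under some assumptions: `crawl`, `LinkCollector`, and the `fetch` callable are hypothetical names, `fetch(url)` is assumed to return the page's HTML as text (or None for non-HTML files such as images), and `max_depth` counts link hops from the start page, so adjust it to whatever "2 layers" means for your site.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect every href/src value (pages, CSS, JS, images) from one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def crawl(start_url, fetch, max_depth=2):
    """Visit same-host URLs breadth-first, at most max_depth hops from the start.

    `fetch` is a caller-supplied function: url -> HTML text, or None if the
    resource is not HTML. Returns a list of (url, depth) in visiting order.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue:
        url, depth = queue.popleft()
        html = fetch(url)
        visited.append((url, depth))
        if html is None or depth >= max_depth:
            continue  # nothing to parse, or already at the last layer
        parser = LinkCollector()
        parser.feed(html)
        for link in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            absolute = urldefrag(urljoin(url, link)).url
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                queue.append((absolute, depth + 1))
    return visited
```

In practice `fetch` would wrap the requests.get call above and also save each response to disk. Note that assets referenced by pages on the last layer are not followed in this sketch, so those pages may need one extra pass that downloads only their CSS, JS, and images.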

Reference: download image from url using python urllib but receiving HTTP Error 403: Forbidden
