How to download only part of a page's HTML? The Range header does not work

Question

I am creating a web crawler using Python and the requests library. To make the crawler faster, I want to download only part of each HTML page. I tried sending a Range header in the HTTP request like this:

import requests

USER_AGENT = '...'  # the actual value was not shown in the question

def search(query='movie', size=10, start=0):
    session = requests.Session()
    # Google is addressed by IP, so the Host header is set explicitly
    google_url = 'https://216.58.208.36/search?q={}&num={}&start={}'.format(
        query, size, start)
    response = session.get(google_url, verify=False,
                           headers={'User-Agent': USER_AGENT,
                                    'host': 'www.google.com',
                                    'Range': 'bytes=0-100'})
    return response.text

But it did not work: the whole HTML page was downloaded anyway. Is there another way to do this?
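
For context, a server that honors a Range header answers with status 206 and a Content-Range header, while a server that ignores it answers 200 with the full body. The sketch below shows one way to check for that and to stop reading early with `stream=True` when the range is ignored. The helper names `range_was_honored`, `take_prefix`, and `fetch_prefix` are made up for illustration and are not part of requests; the `requests` import is done inside `fetch_prefix` only so the two small helpers can be tried without it.

```python
def range_was_honored(status_code, headers):
    """Return True if the response looks like a partial (206) response."""
    return status_code == 206 and 'Content-Range' in headers


def take_prefix(chunks, max_bytes):
    """Collect at most max_bytes from an iterable of byte chunks, then stop."""
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) >= max_bytes:
            break  # stop pulling further chunks from the source
    return bytes(buf[:max_bytes])


def fetch_prefix(url, max_bytes, session=None):
    """Hypothetical helper: fetch roughly the first max_bytes of a page.

    Returns (body_prefix, honored), where honored says whether the server
    answered the Range request with a partial response.
    """
    import requests  # third-party; imported here so the helpers above stand alone

    session = session or requests.Session()
    headers = {'Range': 'bytes=0-{}'.format(max_bytes - 1)}
    # stream=True defers the body download until iter_content() is consumed
    with session.get(url, headers=headers, stream=True) as response:
        body = take_prefix(response.iter_content(chunk_size=1024), max_bytes)
        return body, range_was_honored(response.status_code, response.headers)
```

Leaving the `with` block closes the response, so the rest of the body is not read by this code. Whether this actually saves time depends on the server and on how much data was already in flight, so it is worth measuring.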

Possible duplicate of [Only download a part of the document using python requests](https://stackoverflow.com/questions/23602412/only-download-a-part-of-the-document-using-python-requests) — Pitto, Oct 09 '19 at 11:27
@Pitto Thanks for the reply. I tried that, but it did not work. I am looking for another way. — hamid, Oct 09 '19 at 11:30
Did you also check the 2nd answer in the link I provided? About the byte-range, I read on the page I linked: "What if the byte-ranges are not supported by the server? This will fetch the entire content." and this seems to be your case. — Pitto, Oct 09 '19 at 12:25
I have tried urllib3 with the flag `preload_content=False`, which seems to work. See https://urllib3.readthedocs.io/en/latest/advanced-usage.html — hamid, Oct 09 '19 at 12:41
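
The urllib3 approach mentioned in the last comment could look roughly like the sketch below. It is only an illustration of `preload_content=False`, not the commenter's actual code: the function name `fetch_first_bytes` and its `http` parameter are assumptions introduced here, and `urllib3` is imported lazily so a stand-in pool can be passed for experiments.

```python
def fetch_first_bytes(url, max_bytes, http=None):
    """Hypothetical helper: read up to max_bytes of the body, then stop."""
    if http is None:
        import urllib3  # third-party; only needed when no pool is passed in
        http = urllib3.PoolManager()

    # preload_content=False keeps urllib3 from reading the whole body up front
    resp = http.request('GET', url, preload_content=False)
    try:
        return resp.read(max_bytes)
    finally:
        # The body is only partly read, so close the connection instead of
        # reusing it, then hand the slot back to the pool.
        resp.close()
        resp.release_conn()
```

A call such as `fetch_first_bytes(url, 100)` would return at most the first 100 bytes of the body. The server may still have sent more than that before the connection was closed.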

0 Answers