I am creating a web crawler using python
and requests
library. I want to make the crawler faster so i want to download only a part of html page. I have tried Range
header in http request like this:
import requests
query = 'movie'
size = 10
start = 0
session = requests.Session()
google_url = 'https://216.58.208.36/search?q={}&num={}&start={}'.\
format(query, size, offset)
response = self.session.get(google_url, verify=False, headers={'User-Agent': self.USER_AGENT,
'host': 'www.google.com',
'Range': 'bytes=0-100',
})
return response.text
But it did not work and downloaded the total html page. Is there any other way to do this?