I know the Content-Type can be obtained with urllib2:

import urllib2

response = urllib2.urlopen(url)
content_type = response.info().getheader('Content-Type')

Now I need to execute JavaScript on the page, so I chose Selenium with PhantomJS to fetch it:

from selenium import webdriver

driver = webdriver.PhantomJS()
driver.get(url)
source = driver.page_source

How can I get the Content-Type along with the page source without downloading the web page twice? I know I can save response.read() as an HTML file and then have the driver render the local file instead of downloading it again, but that is too slow. Any suggestions?

SimmerChan

1 Answer

Selenium does not expose the response headers, but you can send a separate HEAD request with requests, which fetches only the headers and not the page body:

import requests

print(requests.head(url).headers["Content-Type"])

You can also use httplib2, urllib2, etc. There are numerous answers here showing how to send a HEAD request with various libraries.
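For the standard-library route, here is a minimal sketch assuming Python 3, where urllib2 became urllib.request. The helper names head_content_type and split_content_type are hypothetical, made up for illustration. Some servers reject or mishandle HEAD requests, so treat the result as best effort.

```python
from email.message import Message
from urllib.request import Request, urlopen


def split_content_type(value):
    """Split a raw Content-Type header value into (mime_type, charset).

    charset is None when the header does not declare one.
    """
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_type(), msg.get_content_charset()


def head_content_type(url, timeout=10):
    """Send a HEAD request and return the raw Content-Type header, or None.

    Only the response headers are transferred, not the page body.
    """
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")
```

You could call head_content_type(url) before driver.get(url) and pass the result to split_content_type if you need the MIME type and charset separately.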

Community
Padraic Cunningham