I'm new to Python and I've been trying to get the HTML source of HTTPS pages. Thanks to a previous question, I can now extract part of the source, but not as much as I see when I open the page manually and view its source.
Is there a simple way, using Python, to fetch the entire source that I see when I view the source of an HTTPS page manually?
Here's the code I'm currently using:
import http.client
from urllib.parse import urlparse
url = "https://www.google.ca/?gfe_rd=cr&ei=u6d_VbzoMaei8wfE1oHgBw&gws_rd=ssl#q=test"
p = urlparse(url)
conn = http.client.HTTPConnection(p.netloc)
conn.request('GET', p.path)
resp = conn.getresponse()
text_file = open("google_test_python.txt", "wb")
for i in resp:
    text_file.write(i)
text_file.close()
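
One plausible reason for the shorter output is that the code above opens a plain `HTTPConnection` even though the URL is `https`, and requests only `p.path`, which drops the query string. Below is a minimal sketch of a variant that picks the connection class from the URL scheme and sends the path together with the query. The helper names `split_url` and `fetch_source`, the 10-second timeout, and the `User-Agent` value are assumptions made for illustration, not part of the original code. Note also that `http.client` does not follow redirects, and a page that builds its content with JavaScript may still look different from what a browser displays.

```python
import http.client
from urllib.parse import urlparse


def split_url(url):
    """Return (scheme, host, request_target) for an absolute URL.

    The fragment (the part after '#') is never sent to the server,
    so it is deliberately left out of the request target.
    """
    p = urlparse(url)
    target = p.path or "/"
    if p.query:
        target += "?" + p.query
    return p.scheme, p.netloc, target


def fetch_source(url):
    """Fetch the raw response body as bytes (performs a network request)."""
    scheme, host, target = split_url(url)
    # Use TLS when the URL says https, plain HTTP otherwise.
    if scheme == "https":
        conn = http.client.HTTPSConnection(host, timeout=10)
    else:
        conn = http.client.HTTPConnection(host, timeout=10)
    try:
        # The User-Agent value is an assumption; some sites serve
        # different markup depending on this header.
        conn.request("GET", target, headers={"User-Agent": "Mozilla/5.0"})
        resp = conn.getresponse()
        return resp.read()
    finally:
        conn.close()


# Example use (makes a real request, so it is left commented out):
# url = "https://www.google.ca/?gfe_rd=cr&ei=u6d_VbzoMaei8wfE1oHgBw&gws_rd=ssl#q=test"
# with open("google_test_python.txt", "wb") as text_file:
#     text_file.write(fetch_source(url))
```

Writing the result through a `with` block closes the file even if the request raises an exception, which the explicit `text_file.close()` call would not guarantee.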