PhantomJS returning blank page with HTTPS

Question

I'm using PhantomJS with Selenium and BeautifulSoup to print a page's source, but over HTTPS it only returns blank HTML. Over HTTP the page source comes back as expected. I've read a fair amount of material on this (such as this and this), but nothing has helped.

from selenium import webdriver
import urllib.request as urllib2
import requests
from bs4 import BeautifulSoup
import csv
import time

browser = webdriver.PhantomJS(service_args=['--ignore-ssl-errors=true', '--ssl-protocol=any'])
browser.get('https://google.com')
browser.set_window_size(2000, 1500)

soup = BeautifulSoup(browser.page_source, "html.parser")

print(soup)

browser.quit()

Result

<html><head></head><body></body></html>
Complete

You are aware that Google goes to great lengths to prevent their stuff from getting automated / scraped by bots who are not authorized to do so? — SiKing, Jul 13 '17 at 21:58
I used google as an example, it could be any https page. It has nothing to do with that. — Iorek, Jul 14 '17 at 00:32

score 1 · Accepted Answer · answered Jul 17 '17 at 11:58

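# Raw strings (r'...') keep the backslashes in the Windows paths literal;
# in a plain string, '\t' would be read as a tab character.
browser = webdriver.PhantomJS(service_args=[
    '--ignore-ssl-errors=true',
    r'--ssl-client-certificate-file=C:\tmp\clientcert.cer',
    r'--ssl-client-key-file=C:\tmp\clientcert.key',
    '--ssl-client-key-passphrase=1111',
])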

I had to point PhantomJS at local copies of the SSL client certificate and key files.
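
If the certificate paths change between machines, it may help to build the argument list in one place. The following is only a minimal sketch: the helper name phantomjs_ssl_args and its parameters are hypothetical and not part of Selenium or PhantomJS; only the command-line flags themselves come from the answer above. It assumes a Selenium version that still ships webdriver.PhantomJS.

```python
def phantomjs_ssl_args(cert_file, key_file, passphrase=None, ignore_errors=True):
    """Build PhantomJS command-line flags for HTTPS with a client certificate.

    Hypothetical helper for illustration; it only formats strings and does
    not check that the files exist.
    """
    args = []
    if ignore_errors:
        args.append("--ignore-ssl-errors=true")
    args.append("--ssl-client-certificate-file={}".format(cert_file))
    args.append("--ssl-client-key-file={}".format(key_file))
    if passphrase is not None:
        args.append("--ssl-client-key-passphrase={}".format(passphrase))
    return args


# Raw strings keep the backslashes in Windows paths literal.
service_args = phantomjs_ssl_args(
    r"C:\tmp\clientcert.cer",
    r"C:\tmp\clientcert.key",
    passphrase="1111",
)
print(service_args)
```

The resulting list can then be passed as webdriver.PhantomJS(service_args=service_args), the same way as in the one-liner above.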
