phantomjs + selenium in python proxy-auth not working

Question

I'm trying to set a proxy for webscraping using selenium + phantomjs. I'm using python.

I've seen in many places that there is a bug in phantomjs such that proxy-auth does not work.

from selenium.webdriver.common.proxy import *
from selenium import webdriver
from selenium.webdriver.common.by import By
service_args = [
'--proxy=http://fr.proxymesh.com:31280',
'--proxy-auth=USER:PWD',
'--proxy-type=http',
]

driver = webdriver.PhantomJS(service_args=service_args)
driver.get("https://www.google.com")
print driver.page_source

ProxyMesh suggests using the following instead:

page.customHeaders={'Proxy-Authorization': 'Basic '+btoa('USERNAME:PASSWORD')};

but I'm not sure how to translate that into python.

This is what I currently have:

from selenium import webdriver
import base64
from selenium.webdriver.common.proxy import *
from selenium import webdriver
from selenium.webdriver.common.by import By

service_args = [
'--proxy=http://fr.proxymesh.com:31280',
'--proxy-type=http',
]

headers = { 'Proxy-Authorization': 'Basic ' +   base64.b64encode('USERNAME:PASSWORD')}

for key, value in enumerate(headers):
    webdriver.DesiredCapabilities.PHANTOMJS['phantomjs.page.customHeaders.{}'.format(key)] = value

driver = webdriver.PhantomJS(service_args=service_args)
driver.get("https://www.google.com")
print driver.page_source

but it doesn't work.

Any suggestions for how I could get this to work?

Do you need to use Selenium and PhantomJS? For web scraping, there should be options that are more flexible. – Ulrich Stern, Sep 21 '16 at 21:24
I need to scrape a JavaScript website. Any suggestions for what else I could use? – chris, Sep 22 '16 at 00:22

score 5 · Accepted Answer · edited May 23 '17 at 12:06

This combines the answers to two other questions: "How to correctly pass basic auth (every click) using Selenium and phantomjs webdriver" and "base64.b64encode error".

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
import base64

service_args = [
    '--proxy=http://fr.proxymesh.com:31280',
    '--proxy-type=http',
]

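# b64encode returns bytes on Python 3, so decode before joining it to a str.
authentication_token = "Basic " + base64.b64encode(b'username:password').decode('ascii')

# Work on a copy so the shared DesiredCapabilities.PHANTOMJS dict is left untouched.
capa = DesiredCapabilities.PHANTOMJS.copy()
capa['phantomjs.page.customHeaders.Proxy-Authorization'] = authentication_token
driver = webdriver.PhantomJS(desired_capabilities=capa, service_args=service_args)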

driver.get("http://...")
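
The token and the capability key can also be built and checked without launching a browser. The following is only a sketch: `proxy_auth_header` and `header_capabilities` are hypothetical helper names, not part of Selenium or PhantomJS, and the `phantomjs.page.customHeaders.` prefix is the one used in the snippet above. It iterates with `.items()`; the loop in the question used `enumerate(headers)`, which yields (index, key) pairs rather than (name, value) pairs.

```python
import base64

CUSTOM_HEADER_PREFIX = 'phantomjs.page.customHeaders.'


def proxy_auth_header(username, password):
    """Return the value for a Proxy-Authorization header (HTTP Basic scheme)."""
    credentials = '{}:{}'.format(username, password).encode('utf-8')
    # b64encode returns bytes on Python 3, so decode it back to text.
    return 'Basic ' + base64.b64encode(credentials).decode('ascii')


def header_capabilities(headers):
    """Map header names to the capability keys used for PhantomJS custom headers."""
    return {CUSTOM_HEADER_PREFIX + name: value for name, value in headers.items()}


# Placeholder credentials, as in the answer above.
caps = header_capabilities({
    'Proxy-Authorization': proxy_auth_header('username', 'password'),
})
print(caps)
```

The resulting dict could then be merged into a copy of `DesiredCapabilities.PHANTOMJS` with `update()` before creating the driver.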

answered Sep 22 '16 at 12:49

Incredible. Been stuck on this for weeks. Thanks! – chris Sep 22 '16 at 22:18
Did you receive the bounty? I've never used it before, so I don't know if I have to do anything else to give you the bounty. – chris Sep 23 '16 at 13:26
I think there is a bounty award button for you somewhere (but it should award it when the bounty expires anyway because you accepted the answer): http://stackoverflow.com/help/bounty – Sep 23 '16 at 14:31

score 4 · Answer 2 · answered Jul 13 '17 at 09:09

The solution with DesiredCapabilities didn't work for me. I have ended up with the following solution:

from selenium import webdriver

# config.PHANTOMJS_PATH and the self.proxy* attributes come from the surrounding application code.
driver = webdriver.PhantomJS(
    executable_path=config.PHANTOMJS_PATH,
    service_args=[
        '--ignore-ssl-errors=true',
        '--ssl-protocol=any',
        '--proxy={}'.format(self.proxy),
        '--proxy-type=http',
        '--proxy-auth={}:{}'.format(self.proxy_username, self.proxy_password),
    ],
)
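
The same switches can be assembled by a small helper, which makes the list easy to inspect without starting PhantomJS. This is a minimal sketch: `build_service_args` is a hypothetical name, and the host and credentials are the placeholders from the question.

```python
def build_service_args(proxy, username=None, password=None, proxy_type='http'):
    """Assemble PhantomJS command-line switches for an optionally authenticated proxy."""
    args = [
        '--ignore-ssl-errors=true',
        '--ssl-protocol=any',
        '--proxy={}'.format(proxy),
        '--proxy-type={}'.format(proxy_type),
    ]
    # Only add --proxy-auth when both parts of the credentials are given.
    if username is not None and password is not None:
        args.append('--proxy-auth={}:{}'.format(username, password))
    return args


# Placeholder host and credentials taken from the question.
service_args = build_service_args('fr.proxymesh.com:31280', 'USER', 'PWD')
print(service_args)
```

The returned list would be passed as `service_args=` to `webdriver.PhantomJS(...)`, exactly as in the snippet above.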

score 0 · Answer 3 · answered Oct 15 '21 at 13:20

None of the above methods worked for me. I am using ProxyMesh proxies with Selenium and PhantomJS in Python, and the following parameters worked: they resolved the "proxy authentication failed" error.

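from selenium import webdriver

# Credentials are given both in the proxy URL and in --proxy-auth.
service_args = ['--proxy=http://username:password@host:port',
                '--proxy-type=http',
                '--proxy-auth=username:password']

driver = webdriver.PhantomJS(service_args=service_args)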
