
I am trying to use the Crawlera proxy from Scrapinghub with Selenium Chrome on Windows, using Python.

I checked the documentation, and it suggests using Polipo like this:

1) adding the following lines to /etc/polipo/config

parentProxy = "proxy.crawlera.com:8010"
parentAuthCredentials = "<CRAWLERA_APIKEY>:"

2) adding this to selenium driver

from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

# Polipo listens locally on 8123 and forwards to Crawlera
polipo_proxy = "127.0.0.1:8123"
proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': polipo_proxy,
    'ftpProxy' : polipo_proxy,
    'sslProxy' : polipo_proxy,
    'noProxy'  : ''
})

capabilities = dict(DesiredCapabilities.CHROME)
proxy.add_to_capabilities(capabilities)
driver = webdriver.Chrome(desired_capabilities=capabilities)

Now I'd like to drop Polipo and use the Crawlera proxy directly.

Is there a way to replace the polipo_proxy variable with the Crawlera one? Each time I try, the setting is not taken into account and the browser runs without a proxy.

The Crawlera proxy format is the following: [API KEY]:@[HOST]:[PORT]

I tried adding the proxy using the following line:

chrome_options.add_argument('--proxy-server=http://[API KEY]:@[HOST]:[PORT]')

but the problem is that I need to specify HTTP and HTTPS differently.
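For what it's worth, Chrome's --proxy-server flag accepts a semicolon-separated per-scheme mapping, so HTTP and HTTPS proxies can be expressed in a single argument. A minimal sketch (the host and port are the public Crawlera defaults from above, and note that Chrome ignores credentials embedded in this flag, so the API key still has to be supplied some other way):

```python
def scheme_proxy_arg(host, port):
    """Build a --proxy-server value that maps both HTTP and HTTPS
    traffic to the same upstream proxy, using Chrome's
    semicolon-separated scheme mapping syntax."""
    endpoint = "{}:{}".format(host, port)
    return "--proxy-server=http={0};https={0}".format(endpoint)

print(scheme_proxy_arg("proxy.crawlera.com", 8010))
# --proxy-server=http=proxy.crawlera.com:8010;https=proxy.crawlera.com:8010
```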

Thank you in advance!

Emilz

2 Answers


Polipo is no longer maintained, which makes it hard to use. Crawlera requires authentication, which ChromeDriver does not seem to support as of now. You can try the Firefox WebDriver instead: set the proxy authentication in a custom Firefox profile and use that profile, as shown in Running selenium behind a proxy server and http://toolsqa.com/selenium-webdriver/http-proxy-authentication/.

I ran into the same problem and this approach gave me some relief. Hope it helps you as well. To solve it, use the Firefox driver and put the proxy information into its profile like this:

from selenium import webdriver

profile = webdriver.FirefoxProfile()
profile.set_preference("network.proxy.type", 1)  # 1 = manual proxy configuration
profile.set_preference("network.proxy.http", "proxy.server.address")
profile.set_preference("network.proxy.http_port", port_number)  # must be an integer, not a string
profile.set_preference("network.proxy.ssl", "proxy.server.address")  # HTTPS traffic
profile.set_preference("network.proxy.ssl_port", port_number)
profile.update_preferences()
driver = webdriver.Firefox(firefox_profile=profile)

This worked for me. For reference, you can use the sites above.
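To connect this back to the question's Crawlera format ([API KEY]:@[HOST]:[PORT]), a small helper can split that string into its parts. This parser is only an illustrative sketch: the profile preferences above carry just the host and port, so the API key still has to be supplied through whatever authentication mechanism you use.

```python
def parse_crawlera_endpoint(endpoint):
    """Split a Crawlera-style '<APIKEY>:@<HOST>:<PORT>' string into
    (api_key, host, port). The port is returned as an int because
    Firefox's network.proxy.http_port preference expects an integer."""
    credentials, _, hostport = endpoint.partition("@")
    api_key = credentials.rstrip(":")       # Crawlera uses an empty password
    host, _, port = hostport.rpartition(":")
    return api_key, host, int(port)

print(parse_crawlera_endpoint("APIKEY:@proxy.crawlera.com:8010"))
# ('APIKEY', 'proxy.crawlera.com', 8010)
```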

Mobasshir Bhuiya
  • Could you be more specific as to how you incorporated the api key in your answer? Did you put it as part of `proxy.server.address`? Did you also split off the port number from `proxy.server.address`? – Windstorm1981 Apr 06 '20 at 02:03
  • I totally beg your pardon. It's been a long time since I did this, and I don't have access to the codebase right now, so I am unable to answer your question here. Hope you understand. Thank you. – Mobasshir Bhuiya Apr 07 '20 at 15:36

Scrapinghub provides a project for this. You set up a local forwarding agent with your API key, and then point the WebDriver at that agent. The project is: zyte-smartproxy-headless-proxy

You can have a look at it.

blazej