Using Selenium, I am trying to click on a website element (a div with some JS attached). The "button" navigates to another page.

How can I configure the browser to automatically route the requests through a proxy?

My proxy is set up as follows: http://api.myproxy.com?key=AAA111BBB6&url=http://awebsitetobrowse.com

I am trying to put the webdriver (Chrome) behind the proxy:

from selenium import webdriver   
options = webdriver.ChromeOptions()
driver = webdriver.Chrome(chrome_options=options)

where options, so far, is some basic configuration of the browser window size.

I have seen quite a few examples (ex1, ex2, ex3), but I cannot find one that suits my needs.


import os 
dir_path = os.path.dirname(os.path.realpath(__file__)) + "\\chromedriver.exe"
PROXY = "http://api.scraperapi.com?api_key=1234&render=true"

from selenium import webdriver   
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server=%s' % PROXY)
driver = webdriver.Chrome(executable_path = dir_path, chrome_options=chrome_options)

driver.get("https://stackoverflow.com/questions/11450158/how-do-i-set-proxy-for-chrome-in-python-webdriver")
NoIdeaHowToFixThis
  • Would it be possible to make it more reproducible? E.g. could you share the name of this proxy provider? Thanks. – alecxe Dec 09 '19 at 18:26
  • I am using https://www.scraperapi.com/ – NoIdeaHowToFixThis Dec 10 '19 at 14:17
  • Can you describe how it fails when you are using the examples of https://stackoverflow.com/questions/11450158/how-do-i-set-proxy-for-chrome-in-python-webdriver? – micharaze Dec 11 '19 at 02:42
  • It does not fail. I can't find an example where I can configure webdriver to build a request as the one above. All I can see is `chrome_options.add_argument('--proxy-server=http://%s' % PROXY)` where you have a fixed http address and a port. How do I put together, in my case, the `&url=something` where `something` is the page I want to see? – NoIdeaHowToFixThis Dec 11 '19 at 07:32

2 Answers

It seems the proxy address you are using is not an actual proxy: it is an API that returns the HTML content of the page itself after handling proxies, captchas, or any IP blocking. Still, different scenarios call for different solutions. Some of them follow.

Scenario 1

In my view, you are using this API in the wrong manner if it is meant to return the response of the visited page through the proxy.

In that case, pass it directly to 'driver.get()' with address = "http://api.scraperapi.com/?api_key=YOURAPIKEY&url=" + url_to_be_visited_via_api

Example code for this would look like:

import os
dir_path = os.path.dirname(os.path.realpath(__file__)) + "\\chromedriver.exe"
APIKEY = "1234"  # replace with your API key; it must be a string to be concatenated below
apiURL = "http://api.scraperapi.com/?api_key="+APIKEY+"&render=true&url="

visit_url = "https://stackoverflow.com/questions/11450158/how-do-i-set-proxy-for-chrome-in-python-webdriver"

from selenium import webdriver
driver = webdriver.Chrome(executable_path = dir_path)
driver.get(apiURL+visit_url)
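
The string concatenation above can also be done with the standard library, which takes care of escaping the target address. This is only a sketch: the helper name `build_api_url` is made up for illustration, and whether the service expects the `url` parameter percent-encoded is an assumption worth checking against its documentation.

```python
from urllib.parse import urlencode

# Endpoint as written in this answer (note the "/" before the query string).
API_ENDPOINT = "http://api.scraperapi.com/"


def build_api_url(api_key, target_url, render=True):
    """Hypothetical helper: return the API URL that fetches target_url."""
    params = {"api_key": api_key, "url": target_url}
    if render:
        params["render"] = "true"
    # urlencode percent-encodes the target, so its own "?" and "&"
    # are not mistaken for parameters of the API call.
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_api_url("1234", "https://example.com/a?b=1&c=2"))
```

The result can be passed to driver.get() in place of apiURL+visit_url.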

Scenario 2

If instead you have an API that returns a proxy address and login credentials in its response, that value can be passed to Chrome through the Chrome options.

This applies when the API response looks something like:

  • "PROTOCOL://user:password@proxyserver:proxyport" (with authentication)
  • "PROTOCOL://proxyserver:proxyport" (without authentication)

In both cases PROTOCOL can be HTTP, HTTPS, SOCKS4, SOCKS5, etc.

And that code should look like:

import os 
dir_path = os.path.dirname(os.path.realpath(__file__)) + "\\chromedriver.exe"
import requests
proxyapi = "http://api.scraperapi.com?api_key=1234&render=true" 
proxy=requests.get(proxyapi).text

visit_url = "https://stackoverflow.com/questions/11450158/how-do-i-set-proxy-for-chrome-in-python-webdriver"

from selenium import webdriver   
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server='+proxy)
driver = webdriver.Chrome(executable_path = dir_path, chrome_options=chrome_options)
driver.get(visit_url)
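
Before handing such a response to Chrome, it may help to check that it really has the PROTOCOL://host:port shape and to separate any credentials. To my knowledge, Chrome's --proxy-server switch does not use credentials embedded in the address, so they would need to be supplied some other way. The helper below is a sketch using only the standard library; the name `split_proxy` is hypothetical.

```python
from urllib.parse import urlsplit


def split_proxy(proxy_url):
    """Hypothetical helper: split 'PROTOCOL://user:password@host:port'.

    Returns (chrome_argument, username, password); the credentials are
    None when the proxy string carries no authentication.
    """
    parts = urlsplit(proxy_url.strip())
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError("expected PROTOCOL://[user:password@]host:port")
    argument = "--proxy-server=%s://%s:%d" % (
        parts.scheme, parts.hostname, parts.port)
    return argument, parts.username, parts.password


if __name__ == "__main__":
    print(split_proxy("http://user:secret@10.0.0.1:8080"))
    print(split_proxy("socks5://proxy.example.com:1080"))
```

The first element of the result can be given to chrome_options.add_argument(). An address such as "http://api.scraperapi.com?api_key=1234&render=true" has no port and is rejected with ValueError, which is one way to notice that it is an API endpoint and not a proxy address.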

Scenario 3

If the API is itself a proxy that needs no authentication, it can be set in the Chrome options directly.

And that code should look like:

import os 
dir_path = os.path.dirname(os.path.realpath(__file__)) + "\\chromedriver.exe"
proxyapi = "http://api.scraperapi.com?api_key=1234&render=true" 

visit_url = "https://stackoverflow.com/questions/11450158/how-do-i-set-proxy-for-chrome-in-python-webdriver"

from selenium import webdriver   
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--proxy-server='+proxyapi)
driver = webdriver.Chrome(executable_path = dir_path, chrome_options=chrome_options)
driver.get(visit_url)

Pick the solution that matches your scenario.

Avinash Karhana
  • This does not work and defies the whole purpose. When you click on certain elements on the page to trigger the navigation to another page (JS) Chrome needs to wire the request to the proxy under the hood. – NoIdeaHowToFixThis Dec 13 '19 at 12:00

Well, after countless experiments, I have figured out that the thing works with:

apiURL = "http://api.scraperapi.com/?api_key="+APIKEY+"&render=true&url="

while it fails miserably with:

apiURL = "http://api.scraperapi.com?api_key="+APIKEY+"&render=true&url="

I have to admit my ignorance here: I thought the two should be equivalent.
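
For what it is worth, the two strings are not identical once parsed. The sketch below only shows where they differ, using the standard library and a placeholder key; it does not explain why the service treats them differently.

```python
from urllib.parse import urlsplit

with_slash = urlsplit("http://api.scraperapi.com/?api_key=1234&render=true&url=")
without_slash = urlsplit("http://api.scraperapi.com?api_key=1234&render=true&url=")

# The host and the query string are the same in both cases...
print(with_slash.netloc == without_slash.netloc)  # True
print(with_slash.query == without_slash.query)    # True

# ...but the path component is "/" in one and empty in the other.
print(repr(with_slash.path))     # '/'
print(repr(without_slash.path))  # ''
```

Whether an empty path gets normalized to "/" depends on the client and server involved, so writing the slash explicitly looks like the safer choice.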

NoIdeaHowToFixThis
  • Isn't it the same as what I wrote in my answer? – Avinash Karhana Dec 19 '19 at 20:27
  • Okay, I missed that '/' in the example code I gave you. One more thing: have you managed to use this API as a proxy? – Avinash Karhana Dec 19 '19 at 20:29
  • Avinash, you should set the proxy with chrome or firefox options, instead of manually fudging the url. As explained, you won't route requests (ex: js routing) to the proxy if you do not set the proxy via browser options. Thanks for your help, though. – NoIdeaHowToFixThis Dec 29 '19 at 09:41