I found this code on GitHub that gathers proxy IPs from https://free-proxy-list.net/ and rotates through them, but I get an error message when I try to run it.

I've tried to debug it but couldn't find a solution. Could the new version of my Chrome WebDriver be causing the problem?

This is the code:

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from selenium.webdriver.common.proxy import Proxy, ProxyType

import time


co = webdriver.ChromeOptions()
co.add_argument("log-level=3")
co.add_argument("--headless")

def get_proxies(co=co):
    driver = webdriver.Chrome(chrome_options=co)
    driver.get("https://free-proxy-list.net/")

    PROXIES = []
    proxies = driver.find_elements_by_css_selector("tr[role='row']")
    for p in proxies:
        result = p.text.split(" ")

        if result[-1] == "yes":
            PROXIES.append(result[0]+":"+result[1])

    driver.close()
    return PROXIES


ALL_PROXIES = get_proxies()


def proxy_driver(PROXIES, co=co):
    prox = Proxy()

    # refill the list when it is empty so that pxy is always assigned
    if not PROXIES:
        print("--- Proxies used up (%s)" % len(PROXIES))
        PROXIES = get_proxies()
    pxy = PROXIES[-1]

    prox.proxy_type = ProxyType.MANUAL
    prox.http_proxy = pxy
    prox.socks_proxy = pxy
    prox.ssl_proxy = pxy

    # copy so the shared CHROME capabilities dict is not mutated between calls
    capabilities = webdriver.DesiredCapabilities.CHROME.copy()
    prox.add_to_capabilities(capabilities)

    driver = webdriver.Chrome(chrome_options=co, desired_capabilities=capabilities)

    return driver



# --- YOU ONLY NEED TO CARE FROM THIS LINE ---
# creating new driver to use proxy
pd = proxy_driver(ALL_PROXIES)

# code must be in a while loop with a try to keep trying with different proxies
running = True

while running:
    try:
        mycodehere()
        
        # if statement to terminate loop if code working properly
        something()
        
        # you 
    except:
        new = ALL_PROXIES.pop()
        
        # reassign driver if fail to switch proxy
        pd = proxy_driver(ALL_PROXIES)
        print("--- Switched proxy to: %s" % new)
        time.sleep(1)

This is the error I get:

Traceback (most recent call last):
  File "test_v1.py", line 53, in <module>
    pd = proxy_driver(ALL_PROXIES)
  File "test_v1.py", line 47, in proxy_driver
    driver = webdriver.Chrome('/home/djurovic/Desktop/Linux ChromeDriver/chromedriver', chrome_options=co, desired_capabilities=capabilities)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/chrome/webdriver.py", line 81, in __init__
    desired_capabilities=desired_capabilities)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/usr/local/lib/python3.6/dist-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: invalid argument: cannot parse capability: proxy
from invalid argument: Specifying 'socksProxy' requires an integer for 'socksVersion'
  (Driver info: chromedriver=2.45.615279 (12b89733300bd268cff3b78fc76cb8f3a7cc44e5),platform=Linux 4.15.0-43-generic x86_64)
Georgy
Stefan
  • From the error, it sounds like you have to pass in an integer for the proxy version for socks, which is missing from your proxy list. You may need to compare what's in your `pxy` list with an example of a valid socks proxy – G. Anderson Jan 25 '19 at 21:25
  • See also [this question](https://stackoverflow.com/questions/22481389/selenium-chrome-driver-socks-proxy-configuration) regarding using selenium with socks – G. Anderson Jan 25 '19 at 21:25
  • @G.Anderson I found that question earlier, but I couldn't get it to work... – Stefan Jan 25 '19 at 21:35
  • Can you spare a minute of your time and write a short example for me? It would help more than you think... – Stefan Jan 25 '19 at 21:35
  • Unfortunately I've never used it so I probably can't help, but to start I would try printing your proxy, and comparing it to a [valid proxy for chrome](https://www.chromium.org/developers/design-documents/network-stack/socks-proxy), e.g., `"socks5://myproxy:8080"` – G. Anderson Jan 25 '19 at 21:40
  • Found with a simple Google search: https://github.com/rootVIII/proxy_web_crawler It does exactly what you are trying to do but uses Firefox – Jan 26 '19 at 01:04
  • @sit_on_a_pan_otis I found it myself, too. But I'm having difficulties adapting it to Chrome... – Stefan Jan 26 '19 at 09:30

1 Answer

I was able to reproduce this error. It comes from the ChromeDriver version you're using, 2.45. This version appears to validate the proxy capability more strictly: as the message says, it rejects a `socksProxy` entry that has no integer `socksVersion`.

So all you need to do is download an earlier ChromeDriver release. The one I'm currently using is 2.41, which can be downloaded from here.
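
If downgrading is not an option, another route suggested by the error text is to send a proxy capability that either leaves `socksProxy` out or pairs it with an integer `socksVersion`. The following is only a minimal sketch of that idea, not a tested fix for ChromeDriver 2.45. The helper name `build_proxy_capability`, the example address, and the restriction to SOCKS versions 4 and 5 are assumptions made for illustration; the key names follow the WebDriver proxy configuration format.

```python
def build_proxy_capability(address, socks_version=None):
    """Build a manual 'proxy' capability dict (hypothetical helper).

    address: a "host:port" string such as the ones get_proxies() returns.
    socks_version: None to omit SOCKS entirely, or 4 / 5 to include it.
    """
    proxy = {
        "proxyType": "manual",
        "httpProxy": address,
        "sslProxy": address,
    }
    if socks_version is not None:
        # Assumption: only SOCKS4 and SOCKS5 are of interest here.
        if socks_version not in (4, 5):
            raise ValueError("socks_version must be 4 or 5")
        # socksProxy is only added together with an integer socksVersion,
        # which is what the error message asks for.
        proxy["socksProxy"] = address
        proxy["socksVersion"] = int(socks_version)
    return proxy


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only example address.
    print(build_proxy_capability("203.0.113.10:8080"))
    print(build_proxy_capability("203.0.113.10:1080", socks_version=5))
```

In `proxy_driver` from the question, the result could be assigned with `capabilities["proxy"] = build_proxy_capability(pxy)` in place of the `prox.add_to_capabilities(capabilities)` call. Whether your ChromeDriver release accepts it is something you would need to check yourself.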

Anwarvic