
I'm trying to use the selenium-stealth module to scrape an API URL on Ubuntu Server 18.04.4 LTS. I have two servers, one for staging and one for production. I installed google-chrome-stable and chromedriver on the staging server and installed selenium-stealth in my virtual environment. I'm running stealth with proxy servers and a custom user agent. On the staging server the script runs and fetches the data correctly, but on the live server the same script returns a blank response.

Details of packages being used:

ChromeDriver: 101.0.4951.41
Google Chrome stable: 101.0.4951.54
selenium==3.141.0
selenium-stealth==1.0.6

Below is the page source of the selenium script:

The Selenium configuration is as given below:

import json
from selenium import webdriver
from selenium_stealth import stealth
from selenium.webdriver.common.proxy import Proxy, ProxyType



options = webdriver.ChromeOptions()
options.add_argument("--no-sandbox")
options.add_argument("start-maximized")
options.add_argument("--headless")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
proxy_url = proxies.get('http') # Provided from method call
#options.add_argument("--proxy-server=%s" % proxy_url)
proxy = Proxy({
        'proxyType': ProxyType.MANUAL,
        'httpProxy': proxy_url,
        'sslProxy': proxy_url,
        'noProxy': ''
    })
# Copy the defaults so the shared DesiredCapabilities.CHROME dict is not mutated
capabilities = webdriver.DesiredCapabilities.CHROME.copy()
proxy.add_to_capabilities(capabilities)
driver = webdriver.Chrome(
            options=options,
            executable_path='/usr/bin/chromedriver',
            desired_capabilities=capabilities
)

stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
)

driver.get(browse_api_url) # Provided from method call
html = driver.find_element_by_xpath(".//html")

I'm running the same versions of Chrome, ChromeDriver, selenium and selenium-stealth on the staging server as well, and there the script runs correctly. Thanks in advance.
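
To tell a blank document apart from a real JSON payload, the page source can be checked directly instead of only locating the html element. The following is a minimal sketch, assuming Chrome wraps a raw JSON response in a `<pre>` element (which is what it normally does for JSON bodies, but worth confirming on the staging server). `extract_json` is a hypothetical helper, not part of Selenium or selenium-stealth.

```python
import html
import json
import re

# Hypothetical helper: matches the first <pre> element in the page source.
_PRE_RE = re.compile(r"<pre[^>]*>(.*?)</pre>", re.DOTALL | re.IGNORECASE)


def extract_json(page_source):
    """Return the parsed JSON found in a <pre> element, or None if absent."""
    match = _PRE_RE.search(page_source)
    if match is None:
        # No <pre> at all, e.g. <html><head></head><body></body></html>
        return None
    # page_source is serialized HTML, so entities such as &amp; must be undone.
    text = html.unescape(match.group(1)).strip()
    try:
        return json.loads(text)
    except ValueError:
        # The <pre> held something that is not valid JSON.
        return None
```

With the driver from the script above, `extract_json(driver.page_source)` returning None on the live server but a dict on staging would confirm that the body itself is empty, rather than the parsing step failing.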

Ajith

1 Answer


I ran your code on my machine and it's working fine, using the same version stack as you.

If your staging script works and the live one does not, these are the possible causes I can think of:

  1. Your live code is not the same as the staging code.
  2. An antivirus or firewall is blocking your script from running or from connecting to the webdriver.
  3. You are not running the correct version of chromedriver.
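
The third point can be checked on each server rather than assumed. Below is a rough sketch that reads the `--version` output of both binaries and compares their major versions. The binary name `google-chrome` is an assumption about how Chrome is exposed on the server, and the helper names are hypothetical; the chromedriver path is the one from the question.

```python
import re
import subprocess


def major_version(version_output):
    """Extract the major version from output like 'ChromeDriver 101.0.4951.41 (...)'."""
    match = re.search(r"(\d+)\.\d+\.\d+\.\d+", version_output)
    if match is None:
        raise ValueError("no version number found in %r" % version_output)
    return int(match.group(1))


def installed_major(binary):
    """Run `<binary> --version` and return its major version number."""
    output = subprocess.check_output([binary, "--version"])
    return major_version(output.decode("utf-8", "replace"))


def versions_match(chrome_bin="google-chrome", driver_bin="/usr/bin/chromedriver"):
    """True when Chrome and ChromeDriver report the same major version."""
    # Both default paths are assumptions; adjust them to the server's layout.
    return installed_major(chrome_bin) == installed_major(driver_bin)
```

Running `versions_match()` on both staging and live gives a quick yes/no before digging into network-level causes.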

Note: including your error message could be really helpful.

  • Thanks for your response. The code and chromedriver versions are the same on the staging and live servers. There is no error in the response. The HTML response is as given: – Ajith May 07 '22 at 12:02
  • That is a blank html document. Are you sure you are hitting the right endpoint? – Hrvoje Matosevic May 07 '22 at 13:37
  • Yes. I'm receiving JSON data from the same endpoint on my staging server. The endpoint is hosted at Cloudflare. Is it possible that they can tell which IP the Selenium requests come from, and are therefore blocking the live server's IP? – Ajith May 08 '22 at 07:03
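
Any server sees the address a request arrives from (the proxy's exit address when a proxy is in use), so an IP-based block is at least plausible. One way to get a hint is to request the endpoint from each server without Selenium and look at the status code and headers. The sketch below uses only the standard library; the classification rules (a `CF-RAY` or `Server: cloudflare` header combined with status 403, 429 or 503) are unofficial heuristics, and `describe_response` and `probe` are hypothetical names.

```python
import urllib.error
import urllib.request


def describe_response(status, headers, body):
    """Classify an HTTP response using rough, unofficial heuristics."""
    lowered = {key.lower(): value for key, value in headers.items()}
    via_cloudflare = (
        "cf-ray" in lowered or lowered.get("server", "").lower() == "cloudflare"
    )
    if via_cloudflare and status in (403, 429, 503):
        return "likely blocked or challenged by Cloudflare"
    if status == 200 and not body.strip():
        return "empty body"
    if status == 200:
        return "ok"
    return "other failure (status %d)" % status


def probe(url, proxy_url=None, user_agent="Mozilla/5.0"):
    """Fetch url (optionally through proxy_url) and classify the response."""
    handlers = []
    if proxy_url:
        handlers.append(
            urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        )
    opener = urllib.request.build_opener(*handlers)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener.open(request, timeout=30) as response:
            body = response.read().decode("utf-8", "replace")
            return describe_response(response.status, dict(response.headers), body)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; the error object still carries the response.
        body = error.read().decode("utf-8", "replace")
        return describe_response(error.code, dict(error.headers), body)
```

Calling `probe(browse_api_url, proxy_url)` on staging and on live, with and without the proxy, shows whether the result changes with the source address. A plain HTTP client can be challenged for reasons other than its IP, so treat the outcome as a hint only.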