
Hi all, I'm quite stuck.

My setup is:

  • Selenium 3.141.0
  • Firefox 63.0.3
  • geckodriver 0.21.0 (have tried geckodriver 0.23.0 too)
  • python 3.6.5
  • Ubuntu 16.04

My Selenium scripts work great on all the websites I've written them for, but 5 days ago I tried to scrape a specific website that came to mind (link), and it fails to load the actual web page right after the driver initialization.

I constantly get a blank page (screenshot omitted).

I made sure the web page exists, and it loads flawlessly when I browse to it with my Firefox browser.
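For reference, here is roughly the kind of script I'm running (the URL is a placeholder, and the blank-page check is just a quick heuristic, not my exact code):

```python
def looks_blank(page_source: str) -> bool:
    """Quick heuristic: true when the rendered document has an empty body."""
    stripped = "".join(page_source.split()).lower()
    return stripped == "" or "<body></body>" in stripped

if __name__ == "__main__":
    # Assumes selenium 3.x is installed and geckodriver is on PATH.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get("https://example.com")  # placeholder for the real site
        print("blank page?", looks_blank(driver.page_source))
    finally:
        driver.quit()
```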

Can somebody shed some light on this mystery? I have no clue; I dug into geckodriver.log but haven't found the root cause of this issue.

Any suggestions on how to investigate or resolve this?

undetected Selenium
JammingThebBits

1 Answer


I have performed your use case with the URL http://web.nli.org.il/sites/NLI/english/Pages/default.aspx.

It seems the website is protected by the Bot Management service provider Distil Networks, and navigation by a GeckoDriver-controlled Firefox gets detected and subsequently blocked.

Here is the relevant <tag>:

<link rel="stylesheet" href="/_layouts/15/Nli.PL.HomePage/js/lib/bootstrap/dist/css/bootstrap.min.css">

Note: Observe the presence of the keyword dist within the link tag.
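As a quick additional check (a sketch, not part of the original analysis): per the W3C WebDriver specification, a GeckoDriver-controlled Firefox exposes navigator.webdriver = true to the page, which is one signal bot-detection services can read. You can probe it yourself:

```python
# JavaScript that a page (or you, via execute_script) can use to see the
# automation flag; it evaluates to true under GeckoDriver-controlled Firefox.
WEBDRIVER_PROBE = "return navigator.webdriver === true"

def automation_visible(driver) -> bool:
    """True when the page can see the W3C navigator.webdriver flag."""
    return bool(driver.execute_script(WEBDRIVER_PROBE))

if __name__ == "__main__":
    # Assumes selenium 3.x is installed and geckodriver is on PATH.
    from selenium import webdriver

    driver = webdriver.Firefox()
    try:
        driver.get("about:blank")
        print("automation visible to the page?", automation_visible(driver))
    finally:
        driver.quit()
```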

You can find a detailed discussion in Chrome browser initiated through ChromeDriver gets detected.

undetected Selenium