I want to build a web scraper for the Built In website (builtin.com).

I ran the following code in a Jupyter notebook:

from selenium import webdriver

driver = webdriver.PhantomJS(executable_path="C:/Users/Downloads/phantomjs-2.1.1-windows/bin/phantomjs.exe")
driver.get("https://builtin.com/jobs")
html = driver.page_source

And I got:

<noscript>
      &lt;strong&gt;
        We’re sorry but our site doesn’t work properly without JavaScript enabled. Please enable it to continue.
      &lt;/strong&gt;
    </noscript>

instead of job openings.

I also tried webdriver.Chrome() and BeautifulSoup. I always get the same result with this website, although everything works with other sites.

I've run out of ideas on how to fix it. What could be the problem?

JavaScript is enabled in my web browser.

I tried the code that was suggested to me, but it didn't work for me.

rndm

1 Answer

This error message...

<body>
  <noscript>
    <strong>
      We’re sorry but our site doesn’t work properly without JavaScript enabled. Please enable it to continue.
    </strong>
  </noscript>

...implies that the cooperating user agent possibly informed the document that it is controlled by WebDriver, and that an alternate code path was subsequently triggered during automation.
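One way to see whether the page can tell that it is being automated is to read navigator.webdriver from inside the page. The sketch below is only an illustration: is_flagged_as_automated and StubDriver are hypothetical names, not part of Selenium. The helper relies only on the driver's execute_script method, and StubDriver stands in for a real driver so the example runs without a browser.

```python
def is_flagged_as_automated(driver):
    """Return True if the page sees navigator.webdriver set to true.

    `driver` is any object with Selenium's `execute_script` method.
    """
    # A JavaScript `true` comes back as Python True; `false` or `undefined`
    # come back as False or None, which are both treated as "not flagged".
    return driver.execute_script("return navigator.webdriver") is True


class StubDriver:
    """Hypothetical stand-in for a Selenium driver (no browser needed)."""

    def __init__(self, webdriver_flag):
        self._flag = webdriver_flag

    def execute_script(self, script):
        # A real driver would evaluate `script` in the current page.
        return self._flag


print(is_flagged_as_automated(StubDriver(True)))   # True
print(is_flagged_as_automated(StubDriver(None)))   # False
```

With a real session you would pass the Selenium driver after driver.get(...). A result of True suggests the site has at least one simple signal it can use to detect automation.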


Solution

To evade detection of the Selenium-driven, ChromeDriver-initiated browsing context, you can use the Chrome arguments and experimental options set in the following code block. It passes the driver path through a Service object, so it needs Selenium 4 or later:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

options = Options()
options.add_argument("start-maximized")
# Drop the switch and extension that mark the session as automated.
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
# Disable the Blink feature that exposes automation to the page.
options.add_argument("--disable-blink-features=AutomationControlled")
# Adjust the path to wherever chromedriver.exe lives on your machine.
s = Service("C:\\BrowserDrivers\\chromedriver.exe")
driver = webdriver.Chrome(service=s, options=options)
driver.get("https://builtin.com/jobs")
print(driver.page_source)
driver.quit()

Console Output:

...
<span data-v-96c85cba="" class="sr-only">Open Search</span></button> <div data-v-96c85cba="" class="menu-item"><div data-v-11f5f671="" data-v-96c85cba=""><a data-v-fa58c90a="" data-v-11f5f671="" href="/premium/membership" class="employers">
    For Employers
  </a></div></div> <div data-v-96c85cba=""><div data-v-66b1a49a="" data-v-96c85cba="" class="profile-auth"><button data-v-66b1a49a="" class="b-oauth signup">
    Join
  </button> <span data-v-66b1a49a="" class="spl national"></span> <button data-v-66b1a49a="" class="b-oauth login">
    Log In
  </button></div></div></div></div></div></div></div> <div data-v-04d4c7c3="" class="menu-wrapper"><div data-v-04d4c7c3="" class="menu menu-desktop navigation-secondary"><ul data-v-04d4c7c3="" class="menu-container grid d-flex"><li data-v-04d4c7c3="" tabindex="0" class="menu-item"><a data-v-fa58c90a="" data-v-04d4c7c3="" href="/jobs" aria-current="page" class="menu-link nuxt-link-exact-active nuxt-link-active">
      Jobs
      <svg data-v-04d4c7c3="" aria-hidden="true" focusable="false" data-prefix="fas" data-icon="chevron-down" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 448 512" class="fa-icon svg-inline--fa fa-chevron-down fa-w-14"><path data-v-04d4c7c3="" fill="currentColor" d="M207.029 381.476L12.686 187.132c-9.373-9.373-9.373-24.569 0-33.941l22.667-22.667c9.357-9.357 24.522-9.375 33.901-.04L224 284.505l154.745-154.021c9.379-9.335 24.544-9.317 33.901.04l22.667 22.667c9.373 9.373 9.373 24.569 0 33.941L240.971 381.476c-9.373 9.372-24.569 9.372-33.942 0z" class=""></path></svg></a></li> <li data-v-04d4c7c3="" class="menu-item"><a data-v-fa58c90a="" data-v-04d4c7c3="" href="/companies" class="menu-link">Tech Companies</a></li> <li data-v-04d4c7c3="" class="menu-item"><a data-v-fa58c90a="" data-v-04d4c7c3="" href="/remote" class="menu-link">
      Remote
    </a></li> <li data-v-04d4c7c3="" tabindex="0" class="menu-item"><a data-v-fa58c90a="" data-v-04d4c7c3="" href="/tech-topics" class="menu-link">Tech Topics
      <svg data-v-04d4c7c3="" data-v-fa58c90a="" aria-hidden="true" focusable="false" data-prefix="fas" data-icon="chevron-down" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 448 512" class="fa-icon svg-inline--fa fa-chevron-down fa-w-14"><path data-v-04d4c7c3="" data-v-fa58c90a="" fill="currentColor" d="M207.029 381.476L12.686 187.132c-9.373-9.373-9.373-24.569 0-33.941l22.667-22.667c9.357-9.357 24.522-9.375 33.901-.04L224 284.505l154.745-154.021c9.379-9.335 24.544-9.317 33.901.04l22.667 22.667c9.373 9.373 9.373 24.569 0 33.941L240.971 381.476c-9.373 9.372-24.569 9.372-33.942 0z" class=""></path></svg></a></li> <li data-v-04d4c7c3="" tabindex="0" class="menu-item"><a data-v-fa58c90a="" data-v-04d4c7c3="" href="/salaries" class="menu-link">
      Salaries
      <svg data-v-04d4c7c3="" data-v-fa58c90a="" aria-hidden="true" focusable="false" data-prefix="fas" data-icon="chevron-down" role="img" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 448 512" class="fa-icon svg-inline--fa fa-chevron-down fa-w-14"><path data-v-04d4c7c3="" data-v-fa58c90a="" fill="currentColor" d="M207.029 381.476L12.686 187.132c-9.373-9.373-9.373-24.569 0-33.941l22.667-22.667c9.357-9.357 24.522-9.375 33.901-.04L224 284.505l154.745-154.021c9.379-9.335 24.544-9.317 33.901.04l22.667 22.667c9.373 9.373 9.373 24.569 0 33.941L240.971 381.476c-9.373 9.372-24.569 9.372-33.942 0z" class=""></path></svg></a></li> <li data-v-04d4c7c3="" class="menu-item push-right">
...
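Because the noscript element can be present in the page source even when the page renders correctly, it may help to check for text outside it before parsing further. The following is a rough heuristic using only the standard library. has_rendered_content and the two sample strings are hypothetical and are not taken from the actual site.

```python
from html.parser import HTMLParser


class _VisibleTextCollector(HTMLParser):
    """Collect text found outside noscript, script, style and title."""

    SKIPPED = {"noscript", "script", "style", "title"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-blank text that is not inside a skipped element.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def has_rendered_content(html):
    """Return True if the HTML has any text besides the skipped elements."""
    parser = _VisibleTextCollector()
    parser.feed(html)
    parser.close()
    return bool(parser.chunks)


# Made-up samples: a bare fallback page and a page with rendered links.
fallback = '<body><noscript><strong>Please enable JavaScript.</strong></noscript><div id="app"></div></body>'
rendered = '<body><noscript>Please enable JavaScript.</noscript><a href="/jobs">Jobs</a></body>'

print(has_rendered_content(fallback))  # False
print(has_rendered_content(rendered))  # True
```

In a scraper this could be called on driver.page_source before handing the HTML to BeautifulSoup. A False result suggests the scripts did not render the page.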

undetected Selenium