I'm scraping a university's enrollment system website. Each page links to many other pages, and each link has an ID of the form SEC_SHORT_TITLE_x, where x is an integer from 1 to 20. Once on each of those pages, I'd like to scrape a few pieces of data. Right now I'm just trying to scrape the section name; I'll handle the logic for going back a page and clicking the next link after this.

[Image: DevTools showing the XPath]

for y in range(1):
    for j in range(1,2):
        if browser.find_elements_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "']"):
            #outputstring = ''
            browser.find_elements_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "']").click()
            time.sleep(10)
            section = browser.find_elements_by_xpath("//p[@id='VAR2']")
            print(section)

The script navigates to the proper page that contains all the links but isn't able to click on the first link as it should.

[7756:2296:0923/141749.015:ERROR:ssl_client_socket_impl.cc(941)] handshake failed; returned -1, SSL error code 1, net_error -100

Sartorialist

1 Answer

Based on the error message you provided (SyntaxError: Failed to execute 'evaluate' on 'Document': The string '//a[@id='SEC_SHORT_TITLE_1]' is not a valid XPath expression. (Session info: chrome=77.0.3865.90)), it looks like your XPath syntax is incorrect. The quoted ID value is never closed: you need to add a closing single quote (') just before the closing square bracket.

Change //a[@id='SEC_SHORT_TITLE_1]

To //a[@id='SEC_SHORT_TITLE_1']

Notice how I added a single ' mark after 'SEC_SHORT_TITLE_1'.

Based on your code sample, you'll need to update this line by changing:

browser.find_elements_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "]"):

to:

browser.find_elements_by_xpath("//a[@id='SEC_SHORT_TITLE_" + str(j) + "']"):

I've added a single ' mark before your closing square bracket to correct the XPath syntax.
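Separately from the quoting issue, the question's code calls .click() on the result of find_elements_by_xpath, which returns a list, and a Python list has no click method. A minimal sketch of one way around both problems is below. The helper names section_link_xpath and read_section_name are hypothetical, and the code assumes the older Selenium API used in the question (find_elements_by_xpath); newer Selenium releases use find_elements(By.XPATH, ...) instead.

```python
import time


def section_link_xpath(index):
    """Build the XPath for one section link, with balanced quotes."""
    return "//a[@id='SEC_SHORT_TITLE_{}']".format(index)


def read_section_name(browser, index, wait_seconds=10):
    """Click the section link at `index` and return the section name.

    `browser` is assumed to be a Selenium WebDriver exposing
    find_elements_by_xpath, as in the question. Returns None if the
    link or the name field is not found.
    """
    links = browser.find_elements_by_xpath(section_link_xpath(index))
    if not links:
        return None
    # find_elements_* returns a list, so click the first match
    links[0].click()
    # Fixed pause kept from the question; an explicit wait may be more robust
    time.sleep(wait_seconds)
    fields = browser.find_elements_by_xpath("//p[@id='VAR2']")
    # .text gives the visible string rather than the WebElement object
    return fields[0].text if fields else None
```

With a live driver this might be called as print(read_section_name(browser, 1)), which prints the section text instead of a list of WebElement objects.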

CEH
  • Apologies, I must have posted an older error message, as I had already fixed that issue. Will rerun and post new error messages. Thanks for response – Sartorialist Sep 23 '19 at 19:15
  • No problem, let me know what you find. Happy to help. – CEH Sep 23 '19 at 19:23
  • @Sartorialist This might have something to do with the website requiring SSL. You can try to bypass this by adding --ignore-certificate-errors and --ignore-ssl-errors to ChromeOptions() when you initialize the WebDriver. More information can be found here https://stackoverflow.com/questions/37883759/errorssl-client-socket-openssl-cc1158-handshake-failed-with-chromedriver-chr – CEH Sep 23 '19 at 19:33
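
The suggestion in the last comment could look roughly like the sketch below. The two flag strings are taken from the comment itself; whether they silence the handshake error on this particular site is an assumption, and the names SSL_FLAGS and make_browser are hypothetical. Selenium is imported inside the function so the module can be loaded without it installed.

```python
# Flags named in the comment above
SSL_FLAGS = ["--ignore-certificate-errors", "--ignore-ssl-errors"]


def make_browser():
    """Start Chrome with the certificate-related flags applied."""
    from selenium import webdriver  # requires the selenium package

    options = webdriver.ChromeOptions()
    for flag in SSL_FLAGS:
        options.add_argument(flag)
    # Assumes chromedriver is installed and discoverable
    return webdriver.Chrome(options=options)
```

Calling browser = make_browser() would then replace the plain webdriver.Chrome() call in the original script.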