
I am trying to crawl resumes on Indeed using the following repository: https://github.com/GowthamGottimukkala/Indeed-resume-scraper

After fixing a few issues, I'm now trying to scrape by running `python puf.py`, but I am running into this Selenium error:

raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="content"]/div/div[2]/div/div[1]/div[2]/div/form/div[3]/button"}

I have confirmed that my chromedriver is up to date and matches the version of the browser I'm using. Any suggestions on how to resolve this?

andre
  • Hi, in general this error means the element is hidden or can't be selected. Could you include the URL so others can have a look and help? – Phung Duy Phong Feb 21 '20 at 02:56
  • For sure, here is the URL I was searching on: https://resumes.indeed.com/ – djscott Feb 21 '20 at 03:07
  • Hi, could you change that line to `//button[contains(.,'Find resumes')][1]`? – Phung Duy Phong Feb 21 '20 at 03:16
  • That solved it, using `'//button[contains(.,"Find resumes")][1]'`, but now I'm having issues navigating the next section, where I previously had the following setup: `element = WebDriverWait(driver, 5).until( EC.presence_of_element_located((By.XPATH, '//*[@id="content"]/div/div[2]/div/div[2]/div[2]/div[1]/div/div[1]/span[1]/a')) )` – djscott Feb 21 '20 at 03:52
  • What is the error, then? You should read the code and understand what it is doing; it is just one file and relatively short. – Phung Duy Phong Feb 21 '20 at 04:44
  • The error received is: File "puf.py", line 38, in EC.presence_of_element_located((By.XPATH, '//*[@id="content"]/div/div[2]/div/div[2]/div[2]/div[1]/div/div[1]/span[1]/a')) and selenium/webdriver/support/wait.py", line 80, in until raise TimeoutException(message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: During handling of the above exception, another exception occurred: Traceback (most recent call last): File "puf.py", line 44, in print(len(pages)) TypeError: object of type 'NoneType' has no len() – djscott Feb 21 '20 at 05:36
  • Could you change it to `//*[@id="content"]/div/div[1]/div/div[2]/div[1]/div/div[2]/div[2]/div/div/div/span/a`? – Phung Duy Phong Feb 21 '20 at 06:41
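
The two failures discussed in the comments (a timeout while waiting for the links, then `len()` called on `None`) could be avoided with a wait helper that always returns a list. The following is only a minimal sketch, not the repository's code: `wait_for_all` is a hypothetical helper, and it uses a hand-rolled polling loop in place of `WebDriverWait` so that it depends only on the driver's `find_elements(by, value)` method. It assumes, without having seen `puf.py`, that `pages` was still `None` after the timeout, and the XPath at the end of the usage note is a guess that would need checking against the live page.

```python
import time

# Selenium's By.XPATH constant is the plain string "xpath",
# so this sketch does not need to import Selenium itself.
XPATH = "xpath"

# Text-based locator suggested in the comments; matching on the button
# label is less brittle than an absolute path through nested divs.
FIND_BUTTON = '//button[contains(.,"Find resumes")][1]'


def wait_for_all(driver, xpath, timeout=5.0, poll=0.5):
    """Poll until `xpath` matches at least one element or `timeout` expires.

    Hypothetical helper: always returns a list (possibly empty), never
    None, so callers can safely use len() on the result.
    """
    deadline = time.monotonic() + timeout
    while True:
        # find_elements returns an empty list when nothing matches,
        # rather than raising NoSuchElementException.
        found = driver.find_elements(XPATH, xpath)
        if found:
            return found
        if time.monotonic() >= deadline:
            return []
        time.sleep(poll)
```

With a real driver this might be used as `pages = wait_for_all(driver, '//*[@id="content"]//span/a')` followed by `print(len(pages))`; an empty result then prints `0` instead of raising `TypeError`, which makes it easier to tell a wrong locator from a slow page.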
