
I'm trying to scrape some data from LinkedIn, but I noticed that the element IDs change each time I load the page with Selenium. So I tried finding the elements by class name instead, but the class attribute contains newlines, and I can't get a match.

example of class with newlines here

Website link example

I tried doing the below:

job_test = "ember-view   jobs-search-results__list-item occludable-update p0 relative scaffold-layout__list-item\n              \n              \n              "

job_list = driver.find_elements(By.CLASS_NAME, job_test)

I even tried this:

job_test = '''ember-view   jobs-search-results__list-item occludable-update p0 relative scaffold-layout__list-item
              
              
              '''
job_list = driver.find_elements(By.CLASS_NAME, job_test)

But it does not show me any elements when I print job_list. What do I do here?

uhhfeef

1 Answer


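By.CLASS_NAME accepts only a single class name, so you can't pass several at once. The newlines are not part of any class name: the class attribute is a whitespace-separated list of names, and a line break is just another separator. See: Invalid selector: Compound class names not permitted error using Selenium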
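
If you really do want to match on every class from the copied attribute, one option is to split the raw string on whitespace and join the names into a compound CSS selector. This is only a minimal sketch: the helper name css_selector_from_class_attr is made up for illustration, and it assumes the class names need no CSS escaping (true for the ones in the question).

```python
def css_selector_from_class_attr(class_attr: str, tag: str = "") -> str:
    """Build a compound CSS selector from a raw class attribute value.

    Hypothetical helper, not part of Selenium. str.split() with no
    arguments splits on any run of whitespace (spaces, tabs, newlines),
    so the stray line breaks in the attribute simply disappear.
    """
    class_names = class_attr.split()
    if not class_names:
        raise ValueError("class attribute contains no class names")
    # "li" + ".a" + ".b" -> "li.a.b": matches <li> elements having all classes
    return tag + "".join("." + name for name in class_names)


# Raw attribute value copied from the question, newlines included
raw = (
    "ember-view   jobs-search-results__list-item occludable-update p0 "
    "relative scaffold-layout__list-item\n              \n              \n              "
)
selector = css_selector_from_class_attr(raw, tag="li")
print(selector)

# With a live driver this would be passed as a CSS selector, for example:
# job_list = driver.find_elements(By.CSS_SELECTOR, selector)
```

Matching on all six classes is brittle, since any one of them may change between page loads, so a selector built from one stable class is usually the safer choice.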


Solution

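To build the job list, wait for the elements with WebDriverWait and visibility_of_all_elements_located(), then use any one of the following locator strategies. The snippets assume By is imported from selenium.webdriver.common.by, WebDriverWait from selenium.webdriver.support.ui, and expected_conditions as EC from selenium.webdriver.support: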

  • Using CLASS_NAME:

    driver.get('https://www.linkedin.com/jobs/search/?currentJobId=3425809260&keywords=python')
    job_list = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CLASS_NAME, "jobs-search-results__list-item")))
    
  • Using CSS_SELECTOR:

    driver.get('https://www.linkedin.com/jobs/search/?currentJobId=3425809260&keywords=python')
    job_list = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "li.jobs-search-results__list-item")))
    
  • Using XPATH:

    driver.get('https://www.linkedin.com/jobs/search/?currentJobId=3425809260&keywords=python')
    job_list = WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//li[contains(@class, 'jobs-search-results__list-item')]")))
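
One caveat on the XPATH variant: contains(@class, ...) is a substring test, so it would also match a longer class name that merely starts with or includes jobs-search-results__list-item. If that matters, a common XPath 1.0 idiom is to pad the normalized attribute with spaces and look for the name as a whole token. The sketch below only builds the expression string; xpath_for_class_token is a hypothetical helper name, not a Selenium API.

```python
def xpath_for_class_token(class_name: str, tag: str = "*") -> str:
    """Return an XPath that matches class_name as a whole class token.

    Hypothetical helper. normalize-space() trims the attribute and
    collapses every whitespace run (newlines included) to one space, so
    padding both sides with a space lets us search for ' name '.
    """
    if not class_name or any(ch.isspace() for ch in class_name):
        raise ValueError("expected exactly one class name without whitespace")
    if "'" in class_name:
        raise ValueError("class name must not contain a single quote")
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


xpath = xpath_for_class_token("jobs-search-results__list-item", tag="li")
print(xpath)

# With a live driver, the expression drops into the same wait as above:
# job_list = WebDriverWait(driver, 20).until(
#     EC.visibility_of_all_elements_located((By.XPATH, xpath)))
```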
    
undetected Selenium