I currently have a Selenium script that is running through a list of part numbers on a website and capturing some information such as product name (pulled from the page title).

I have noticed that some of the products are marked as "DISCONTINUED" (through a span), and I would like to capture that information so that I can ignore those products.

On the website in general, they denote these products through:

<span data*="">DISCONTINUED</span>

Products that are still valid do not have this span, so in that case I want the script to capture a blank value instead of crashing.

I tried using:

driver.find_element(By.XPATH, '//html/body/div/div/section/div/section/div/div/section/div/div/div/div/div/span/span[text()="DISCONTINUED"]')

However I get this error:

selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//html/body/div/div/section/div/section/div/div/section/div/div/div/div/div/span/span[text()="DISCONTINUED"]"}

I made sure to test with a single part number that does have DISCONTINUED on its page.

I also loaded the page in dev tools and searched for that specific XPath, and it highlighted the correct element:

Searched:

//html/body/div/div/section/div/section/div/div/section/div/div/div/div/div/span/span

Highlighted:

<span data*="">DISCONTINUED</span>

What is the best way to do this? I also need a blank value by default so that the script does not crash when a current product is retrieved.

undetected Selenium
spitey
  • `(By.XPATH, '//html/body/div/div/section/div/section/div/div/section/div/div/div/div/div/span/span[text()="DISCONTINUED"]')` are you trying to locate the element with text _`DISCONTINUED`_ – undetected Selenium Feb 07 '23 at 21:08
  • I am outputting the dataframe to an excel file with the information that is pulled from Selenium. So, I export out the part number and title. I would like to add whether or not it is Discontinued. That way any that are discontinued I can just remove from the data. – spitey Feb 07 '23 at 21:51

2 Answers

Considering the HTML:

<span data*="">DISCONTINUED</span>

To locate the element with the text DISCONTINUED you can use either of the following locator strategies:

  • Using xpath and text():

    element = driver.find_element(By.XPATH, "//span[text()='DISCONTINUED']")
    
  • Using xpath and contains():

    element = driver.find_element(By.XPATH, "//span[contains(., 'DISCONTINUED')]")
    

Ideally, to locate the element you should use WebDriverWait with visibility_of_element_located(), and you can use either of the following locator strategies:

  • Using xpath and text():

    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[text()='DISCONTINUED']")))
    
  • Using xpath and contains():

    element = WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[contains(., 'DISCONTINUED')]")))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
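
For products that are not discontinued, the span is absent, so `find_element` raises NoSuchElementException and the waits above raise TimeoutException. One possible way to get a blank value instead is `find_elements` (plural), which returns an empty list when nothing matches. The following is only a minimal sketch: the function name `discontinued_flag` is hypothetical, and it assumes the span is already in the DOM by the time the check runs (it does not wait).

```python
DISCONTINUED_XPATH = "//span[text()='DISCONTINUED']"


def discontinued_flag(driver):
    """Return "DISCONTINUED" if the marker span is present, else ""."""
    # "xpath" is the value of Selenium's By.XPATH constant; the plain string
    # is used here so this sketch needs no imports of its own.
    matches = driver.find_elements("xpath", DISCONTINUED_XPATH)
    # find_elements returns an empty list (no exception) when nothing matches.
    return "DISCONTINUED" if matches else ""
```

With a real session this would be called as `discontinued_flag(driver)` after the product page has loaded. If the span is rendered late by JavaScript, a wait combined with exception handling is likely the safer choice.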
    
undetected Selenium
  • The first strategy was what I was trying and it didn't work. However, the second one using webdriverwait does yield an interesting result when there is a discontinued product such as However it will crash if the element doesn't exist... – spitey Feb 08 '23 at 00:38
  • @spitey _`I did make sure to use only a single part number that did indeed have a DISCONTINUED on it.`_: This answer addresses this concern, nothing more or less. I could suggest any number of improvements to the logic, and the answer would never end. – undetected Selenium Feb 08 '23 at 00:48

This is what I used to make it work with error avoidance.

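from selenium.common.exceptions import TimeoutException

try:
    # Wait up to 2 seconds for the marker span; it exists only on discontinued products.
    WebDriverWait(driver, 2).until(EC.visibility_of_element_located((By.XPATH, "//html/body/div/div/section/div/section/div/div/section/div/div/div/div/div/span/span[text()='DISCONTINUED']")))
    discontinued = "DISCONTINUED"
except TimeoutException:
    # The span never appeared, so treat the product as current.
    discontinued = "CURRENT"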

Thanks again for the guidance on using waits instead of the find_element approach!
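
Since the goal mentioned in the comments is to drop discontinued parts before exporting, the per-part status can be collected into rows and filtered at the end. This is a rough sketch under assumptions: `build_rows`, `drop_discontinued`, and the `fetch` callable are hypothetical names, and `fetch(part_number)` stands in for whatever Selenium code returns the page title and the status string for one part.

```python
def build_rows(part_numbers, fetch):
    """Collect one row per part; fetch(part) must return (title, status)."""
    rows = []
    for part in part_numbers:
        title, status = fetch(part)
        rows.append({"part_number": part, "title": title, "discontinued": status})
    return rows


def drop_discontinued(rows):
    """Keep only rows whose status is not "DISCONTINUED"."""
    return [row for row in rows if row["discontinued"] != "DISCONTINUED"]
```

The filtered list of dicts can then be passed to `pandas.DataFrame(...)` and written out with `to_excel` as before.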

spitey