-1

I am trying to scrape the product IDs from a website, where each ID is located in a product container. The following call raises a "no such element" error:

driver.find_element_by_class_name('na finger product-container')

HTML:

<div data-v-8cd3b522="" na-element="item" na-module="product" na-data='{"ids_value":["#300#"],"ids_key":"item_type_ids","item_type_ids":["#300#"],"tag_ids":["#85#"],"app_page_section_id":10103,"app_page_id":10011}' class="na finger product-container" na-id="9cd1a902-57aa-48d4-b79c-2bca03d21b83">

The value I want to retrieve is ids_value (inside the na-data attribute), which is #300# in this case.

The website: https://www.gratus.com.hk/product/


undetected Selenium
RWWW

4 Answers

0

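find_element_by_class_name() accepts only a single class name, so passing 'na finger product-container' (three space-separated classes) cannot match. If you want to use @class as the identifier, search by a single class name: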

driver.find_element_by_class_name('product-container')

or by CSS selector

driver.find_element_by_css_selector('.na.finger.product-container') 
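To illustrate how the space-separated class attribute maps onto the CSS selector above, here is a minimal sketch. The helper name class_attr_to_css is hypothetical (it is not part of Selenium), and it assumes plain class names that need no CSS escaping:

```python
def class_attr_to_css(class_attr: str) -> str:
    """Turn a space-separated class attribute into a chained CSS class selector.

    Hypothetical helper for illustration only; assumes the class names
    contain no characters that would need escaping in CSS.
    """
    # Each class becomes ".name"; chaining them with no spaces in between
    # means the element must carry all of the classes.
    return "".join("." + name for name in class_attr.split())


selector = class_attr_to_css("na finger product-container")
print(selector)  # .na.finger.product-container
```

The resulting string is the same selector that is passed to find_element_by_css_selector() above.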

You might also need an implicit or explicit wait, as the required data seems to be loaded dynamically.

JaSON
  • I can locate the individual product div using "driver.find_element_by_css_selector('.na.finger.product-container')", but how do I access {"ids_value":"300"} inside "na-data"? – RWWW Nov 16 '20 at 09:33
  • Use .get_attribute('na-data'), then parse the result. – Arundeep Chohan Nov 16 '20 at 09:37
0

To parse the value out of the na-data attribute:

import json
elem=driver.find_element_by_css_selector('.na.finger.product-container').get_attribute('na-data')
myJSON = json.loads(elem)
print(myJSON["ids_value"])

Output:

['#290#']       
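If only the bare numbers are wanted, the surrounding '#' characters can be stripped after parsing. The following is a minimal sketch that works on the attribute string alone (no browser needed); extract_ids is a hypothetical helper name, and the sample string is the na-data value reported elsewhere in this thread:

```python
import json
from typing import List


def extract_ids(na_data: str, key: str = "ids_value") -> List[str]:
    """Return the IDs stored under `key`, without the surrounding '#'.

    Hypothetical helper; assumes na_data is valid JSON and that the
    value under `key` is a list of strings such as "#290#".
    """
    payload = json.loads(na_data)
    return [raw.strip("#") for raw in payload.get(key, [])]


# Sample value as returned by get_attribute('na-data') in this thread.
sample = (
    '{"ids_value":["#290#"],"ids_key":"item_type_ids",'
    '"item_type_ids":["#290#"],"tag_ids":["#4#"],'
    '"app_page_section_id":10103,"app_page_id":10011}'
)

print(extract_ids(sample))             # ['290']
print(extract_ids(sample, "tag_ids"))  # ['4']
```

Parsing the JSON first keeps the lookup tied to the ids_value key rather than to the position of the '#' characters in the string.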
Arundeep Chohan
0

To extract the ids_value (e.g. 290, 291, etc.) from all of the <div> elements using Selenium, you have to induce WebDriverWait for visibility_of_all_elements_located(). You can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    print([my_elem.get_attribute("na-data").split("#")[1] for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.CSS_SELECTOR, "div[na-element='item'][na-module='product']")))])
    
  • Using XPATH:

    print([my_elem.get_attribute("na-data").split("#")[1] for my_elem in WebDriverWait(driver, 20).until(EC.visibility_of_all_elements_located((By.XPATH, "//div[@na-element='item' and @na-module='product']")))])
    
  • Console Output:

    ['290', '291', '292', '295', '296', '297', '298', '299', '300', '302', '303', '305', '306', '307', '308', '309']
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
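For readers wondering why .split("#")[1] yields the ID, here is a small sketch on a plain string, assuming the na-data value has the shape reported in this thread. The helper first_hash_token is hypothetical and only mirrors the expression used in the one-liners above:

```python
from typing import Optional


def first_hash_token(na_data: str) -> Optional[str]:
    """Return the text between the first pair of '#' characters, or None.

    Hypothetical helper mirroring na_data.split("#")[1]; it relies on
    ids_value being the first '#'-delimited entry in the string.
    """
    parts = na_data.split("#")
    return parts[1] if len(parts) > 1 else None


# Sample na-data value of the shape reported in this thread.
sample = (
    '{"ids_value":["#290#"],"ids_key":"item_type_ids",'
    '"item_type_ids":["#290#"],"tag_ids":["#4#"],'
    '"app_page_section_id":10103,"app_page_id":10011}'
)

print(first_hash_token(sample))    # 290
# Odd-indexed tokens are every '#'-wrapped value, not just ids_value:
print(sample.split("#")[1::2])     # ['290', '290', '4']
```

Because the split is purely positional, it returns only the first wrapped value and would pick up a different field if the key order in na-data changed.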
    
undetected Selenium
-1

driver.find_element_by_css_selector('.na.finger.product-container').get_attribute('na-data')

returns:

{"ids_value":["#290#"],"ids_key":"item_type_ids","item_type_ids":["#290#"],"tag_ids":["#4#"],"app_page_section_id":10103,"app_page_id":10011}

Thanks, everyone!

RWWW