I need help scraping the URL of the main product image on an Amazon product page (the first image, shown large on screen) using Python and Selenium. For example, this product: https://www.amazon.fr/dp/B07CG3HFPV/ref=cm_sw_r_fm_api_glt_i_2RB9QBPTQXWJ7PQQ16MZ?_encoding=UTF8&psc=1

Here is the relevant part of the page's source code: [screenshot]

I need to scrape the image URL from the "src" attribute of the img tag.

Does anyone know how to scrape this? So far I have this part of a script, but it doesn't work:

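    import time

    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.common.action_chains import ActionChains
    from selenium.webdriver.common.by import By

    url = "https://www.amazon.fr/dp/B07CG3HFPV/ref=cm_sw_r_fm_api_glt_i_2RB9QBPTQXWJ7PQQ16MZ?_encoding=UTF8&psc=1"

    # Run Chrome without a visible window
    options = Options()
    options.add_argument("--headless")

    driver = webdriver.Chrome(options=options)
    driver.get(url)
    time.sleep(2)  # fixed pause to let the page load

    actions = ActionChains(driver)

    # Grabs the first <img> element found on the page
    link_img = driver.find_element(By.TAG_NAME, "img").get_attribute("src")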

Thanks for your help.

JohnDoe
  • You need to find a pattern for that image's location in the DOM, such as its class names or its id; there is no easy way around it. Your code does not work because there is certainly more than one image on that page. – luk2302 Nov 05 '21 at 21:16
  • Could you show a way, or an example, of how to scrape this URL please? I've tried several XPath expressions, CSS selectors and tag names, but nothing worked. – JohnDoe Nov 05 '21 at 21:25

1 Answer

To scrape the URL of the main image on an Amazon product page (the first image, shown large on screen) using Python and Selenium, you need to induce WebDriverWait for visibility_of_element_located() and then read the element's src attribute. You can use either of the following locator strategies:

  • Using CSS_SELECTOR:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "span.a-list-item>span.a-declarative>div.imgTagWrapper>img.a-dynamic-image"))).get_attribute("src"))
    
  • Using XPATH:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[@class='a-list-item']/span[@class='a-declarative']/div[@class='imgTagWrapper']/img[@class='a-dynamic-image']"))).get_attribute("src"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
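
Putting the pieces above together, a minimal sketch might look like the following. It assumes Selenium 4 and a local Chrome installation. The names `MAIN_IMAGE_CSS`, `get_main_image_src` and `scrape` are hypothetical helpers introduced here for illustration, and the selector is the one from this answer, so it may stop matching if Amazon changes its markup.

```python
# CSS selector copied from the answer above; Amazon may change its markup at any time.
MAIN_IMAGE_CSS = (
    "span.a-list-item>span.a-declarative>div.imgTagWrapper>img.a-dynamic-image"
)


def get_main_image_src(driver, timeout=20):
    """Wait until the main product image is visible and return its src URL."""
    # Imported inside the function so this module can be loaded without Selenium.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    # until() polls until the element is present and displayed,
    # and raises TimeoutException after `timeout` seconds otherwise.
    image = WebDriverWait(driver, timeout).until(
        EC.visibility_of_element_located((By.CSS_SELECTOR, MAIN_IMAGE_CSS))
    )
    return image.get_attribute("src")


def scrape(url):
    """Open `url` in headless Chrome and return the main image's src URL."""
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run Chrome without a visible window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return get_main_image_src(driver)
    finally:
        driver.quit()  # always release the browser, even if the wait times out
```

Calling `print(scrape(url))` with the product URL from the question would then print the image URL, provided the selector still matches the page. The explicit wait replaces the fixed `time.sleep(2)` in the question, so the script continues as soon as the image is visible instead of always pausing for a set time.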
    
undetected Selenium