I am trying to gather some information from a website using Selenium. I am interested in the img elements inside a div element:
<div class="entry-content clearfix">
...
<img data-attachment-id="7677" data-permalink="https://test_site.com/leftcentre/" ... alt="Example of site" >
<img data-attachment-id="98231" data-permalink="https://test_site.com/high/" ... alt="another site" >
The values of the img data-attachment-id attribute may change, so I could have 7677, 7664 and other values. This means that I could have the following XPaths, among many others:
- //*[@id="post-63779"]/div/h2[1]/img[1]
- //*[@id="post-781"]/div/header/h1/a/img
What I have done so far to extract this information is shown below:
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import time
driver=webdriver.Chrome('my_path')
response=driver.get('https://website')
wait = WebDriverWait(driver, 20)
x = driver.find_element_by_xpath('//*[@id="post-781"]/div/header/h1/a/img').text
# print(x)
return x
However, I am probably making some mistakes, since I get no output and Chrome keeps looking for the element. Is there a way to get the images without explicitly referring to the post number or to the elements between the div and the img, or simply to extract the data-attachment-id of every img? If my question or the path is not clear, please let me know and I will provide more info.
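To make the goal concrete, here is a rough sketch of the result I am after. It does not use Selenium or XPath at all: it assumes the page HTML is already available as a string (for example from driver.page_source) and walks it with Python's standard html.parser. The names AttachmentIdParser, attachment_ids and sample are made up for illustration, and the sample markup is a simplified stand-in for the real page.

```python
from html.parser import HTMLParser


class AttachmentIdParser(HTMLParser):
    """Collect data-attachment-id values of <img> tags inside div.entry-content."""

    def __init__(self):
        super().__init__()
        self.depth = 0  # > 0 while we are inside a div.entry-content
        self.ids = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            # Count nested divs so we know when the outer one closes.
            if self.depth or "entry-content" in classes:
                self.depth += 1
        elif tag == "img" and self.depth and attrs.get("data-attachment-id"):
            self.ids.append(attrs["data-attachment-id"])

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1


def attachment_ids(html):
    """Return the ids in document order, regardless of nesting between div and img."""
    parser = AttachmentIdParser()
    parser.feed(html)
    return parser.ids


# Simplified stand-in for the real page (assumption, not the actual markup).
sample = """
<div class="entry-content clearfix">
  <h2><img data-attachment-id="7677" alt="Example of site"></h2>
  <img data-attachment-id="98231" alt="another site">
</div>
<img data-attachment-id="1" alt="outside the div">
"""
print(attachment_ids(sample))
```

Nothing here depends on the post number or on the tags between the div and the img. I assume the same lookup could be done directly in Selenium 4 with driver.find_elements(By.CSS_SELECTOR, 'div.entry-content img[data-attachment-id]') followed by get_attribute('data-attachment-id') on each element, but I have not verified that against my page.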