1

I'm trying to pull a specific number out of a div class in Python Selenium but can't figure out how to do it. I want to get the "post_parent" ID 947630, provided it belongs to the "post_name" value starting with 09007.

I'm looking to do this for multiple "post_name" values, so I'd feed it something like search_text = "0900766b80090cb6". There will be more of these in the future, so it has to match the "post_name" first and then pull the corresponding "post_parent", if that makes sense.

Appreciate any advice anyone has to offer.

    <div class="hidden" id="inline_947631">
    <div class="post_title">Interface Converter</div>
    <div class="post_name">0900766b80090cb6</div>
    <div class="post_author">28</div>
    <div class="comment_status">closed</div>
    <div class="ping_status">closed</div>
    <div class="_status">inherit</div>
    <div class="jj">06</div>
    <div class="mm">07</div>
    <div class="aa">2001</div>
    <div class="hh">15</div>
    <div class="mn">44</div>
    <div class="ss">17</div>
    <div class="post_password"></div>
    <div class="post_parent">947630</div>
    <div class="page_template">default</div>
    <div class="tags_input" id="rs-language-code_947631">de</div>
    </div>
KunduK
Stuquan

3 Answers

1

If you look at the HTML, <div class="post_name">0900766b80090cb6</div> and <div class="post_parent">947630</div> are sibling nodes.

You can use the XPath following-sibling axis like this:

Code:

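search_text = "0900766b80090cb6"
# A single slash applies following-sibling to the matched post_name div itself.
# textContent is used because the parent div has class="hidden", and .text is empty for hidden elements.
post_parent_num = driver.find_element(By.XPATH, f"//div[@class='post_name' and text()='{search_text}']/following-sibling::div[@class='post_parent']").get_attribute("textContent")
print(post_parent_num)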

Or, using an explicit wait:

search_text = "0900766b80090cb6"
# Wait for presence, not visibility: the parent div has class="hidden", so a visibility wait would time out.
post_parent_num = WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, f"//div[@class='post_name' and text()='{search_text}']/following-sibling::div[@class='post_parent']"))).get_attribute('innerText')
print(post_parent_num)

Imports:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
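
Since the question mentions several post_name values, the lookup above can be wrapped in a small helper. This is a minimal sketch, assuming the IDs never contain a single quote; post_parent_xpath is a hypothetical name, not part of Selenium. It only builds the XPath string, so it runs without a browser:

```python
def post_parent_xpath(post_name: str) -> str:
    """Build the XPath for the post_parent sibling of an exact post_name match."""
    if "'" in post_name:
        # Assumption: IDs are plain hex strings, so quote escaping is not handled.
        raise ValueError("post_name must not contain a single quote")
    return (
        f"//div[@class='post_name' and text()='{post_name}']"
        "/following-sibling::div[@class='post_parent']"
    )


search_texts = ["0900766b80090cb6"]  # extend with further post_name values
xpaths = {name: post_parent_xpath(name) for name in search_texts}

# With a live driver (not created here), each lookup would be:
# driver.find_element(By.XPATH, xpaths[name]).get_attribute("textContent")
```

Keeping the string building separate from the driver call makes it easy to paste the generated XPath into the dev tools when a lookup fails.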

Update:

NoSuchElementException:

Please check in the dev tools (Google Chrome) whether the XPath matches exactly one entry in the HTML DOM.

XPath that you should check:

//div[@class='post_name' and text()='0900766b80090cb6']/following-sibling::div[@class='post_parent']

Steps to check:

Press F12 in Chrome -> go to the Elements tab -> press CTRL + F -> paste the XPath and see whether your desired element is highlighted with 1/1 matching node.

If //div[@class='post_name' and text()='0900766b80090cb6']/following-sibling::div[@class='post_parent'] is unique, then you need to check the conditions below as well.

  1. Check if it's in any iframe/frame/frameset.

    Solution: switch to iframe/frame/frameset first and then interact with this web element.

  2. Check if it's in any shadow-root.

    Solution: Use driver.execute_script('return document.querySelector to get the web element back, then operate on it accordingly.

  3. Make sure that the element is present in the DOM before interacting with it. Add a hardcoded delay or an explicit wait and try again. Because the parent div has class="hidden", wait for presence rather than visibility.

    Solution: time.sleep(5) or

    WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//div[@class='post_name' and text()='0900766b80090cb6']/following-sibling::div[@class='post_parent']"))).get_attribute("textContent")

  4. If you have been redirected to a new tab or window and have not switched to it, you will likely get a NoSuchElementException.

    Solution: switch to the relevant window/tab first.

  5. If you have switched to an iframe and the new desired element is not in the same iframe context then first switch to default content and then interact with it.

    Solution: switch to default content and then switch to respective iframe.
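
The uniqueness and visibility checks above can be partly automated. This is a rough sketch, assuming a Selenium 4 driver; diagnose is a hypothetical helper, not a Selenium function. It passes the locator strategy as the plain string "xpath", which is the value of By.XPATH, so the block itself needs no Selenium import:

```python
def diagnose(driver, xpath: str) -> str:
    """Report whether an XPath matches nothing, several nodes, or one hidden/visible node."""
    # find_elements returns an empty list instead of raising NoSuchElementException.
    matches = driver.find_elements("xpath", xpath)  # "xpath" is the value of By.XPATH
    if not matches:
        return "no match: check the locator, iframes, shadow roots, and the active window"
    if len(matches) > 1:
        return f"{len(matches)} matches: the locator is not unique"
    element = matches[0]
    if not element.is_displayed():
        # Hidden elements give an empty .text; textContent still holds the value.
        return "hidden: " + (element.get_attribute("textContent") or "")
    return "visible: " + element.text
```

A "hidden" result would explain both an empty printout from .text and a TimeoutException from a visibility wait.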

cruisepandey
  • Hey @cruisepandey, thanks for the response. I have tried running this through my script, however it doesn't print out at the end. I am receiving no errors either which is strange, any thoughts? – Stuquan Mar 28 '22 at 10:29
  • may be some time delay issue, could you run the code that I have given under `ExplicitWait` or else first code with some `time.sleep(5)` – cruisepandey Mar 28 '22 at 10:30
  • I have ran that now and an error has appeared, however it's not very descriptive it just says : 'TimeoutException: Message:'. I'm thinking maybe that class is hidden? – Stuquan Mar 28 '22 at 10:35
  • Yes could be, you will get the exact error if you run the first code and put `time.sleep(5)` before the `driver.find_element` command. – cruisepandey Mar 28 '22 at 10:36
  • so I've jiggled it around a bit and I'm getting the 'NoSuchElementException' error now, so it seems it may be hidden. I'm hopeful that there's a way to unhide it, do you know of any resources I can look at? – Stuquan Mar 28 '22 at 10:52
  • to solve `NoSuchElementException`, I'd encourage you to check the Update section above. You should debug point by point. – cruisepandey Mar 28 '22 at 10:55
1

I don't see any specific relation between the "post_parent" ID 947630 and the "post_name" number starting with 09007. Moreover, the parent <div> has class="hidden".

However, to pull the specific number you can use either of the following locator strategies:

  • Using css_selector:

    print(driver.find_element(By.CSS_SELECTOR, "div[id^='inline'] div.post_parent").get_attribute("textContent"))

  • Using xpath:

    print(driver.find_element(By.XPATH, "//div[starts-with(@id, 'inline_')]//div[@class='post_parent']").get_attribute("textContent"))
    

Ideally, you should wait for the element with WebDriverWait and presence_of_element_located(), using either of the following locator strategies. Since the parent <div> is hidden, read textContent rather than .text:

  • Using CSS_SELECTOR:

    print(WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.CSS_SELECTOR, "div[id^='inline'] div.post_parent"))).get_attribute("textContent"))
    
  • Using XPATH:

    print(WebDriverWait(driver, 20).until(EC.presence_of_element_located((By.XPATH, "//div[starts-with(@id, 'inline_')]//div[@class='post_parent']"))).get_attribute("textContent"))
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
undetected Selenium
0

You can create a method and use the following xpath to get the post_parent text based on post_name text.

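def get_post_parent(post_name):
    # starts-with() matches any post_name that begins with the given prefix.
    element = driver.find_element(By.XPATH, "//div[@class='post_name' and starts-with(text(),'{}')]/following-sibling::div[@class='post_parent']".format(post_name))
    # textContent is readable even when the element is hidden.
    return element.get_attribute("textContent")

print(get_post_parent('09007'))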

This returns the value when the post_name text starts with '09007'.

The parent div appears to be hidden, so you need to use textContent rather than .text to get the value.
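
An alternative that sidesteps visibility altogether is to parse the page source outside the browser. This is a different technique from the XPath lookup, sketched here with the standard library's html.parser. It assumes each post_name div is followed by its post_parent div inside the same inline_ block, as in the question's HTML. PostParentCollector is a hypothetical name.

```python
from html.parser import HTMLParser


class PostParentCollector(HTMLParser):
    """Collect {post_name: post_parent} pairs from the hidden inline_ blocks."""

    def __init__(self):
        super().__init__()
        self.parents = {}        # post_name text -> post_parent text
        self._field = None       # class of the div currently being read
        self._last_name = None   # most recent post_name seen

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "div" and cls in ("post_name", "post_parent"):
            self._field = cls

    def handle_data(self, data):
        if self._field == "post_name":
            self._last_name = data.strip()
        elif self._field == "post_parent" and self._last_name is not None:
            self.parents[self._last_name] = data.strip()
            self._last_name = None

    def handle_endtag(self, tag):
        self._field = None


# Sample taken from the question; in practice this would be driver.page_source.
html = """
<div class="hidden" id="inline_947631">
  <div class="post_name">0900766b80090cb6</div>
  <div class="post_author">28</div>
  <div class="post_parent">947630</div>
</div>
"""

collector = PostParentCollector()
collector.feed(html)
print(collector.parents.get("0900766b80090cb6"))  # prints 947630
```

Building the whole mapping once means any number of post_name values can be looked up without further round trips to the browser.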

KunduK