I am trying to download web pages using Selenium with Python.
The site has a tree view on the left side and the content on the right side.

This is the HTML of the tree view. All sub-menus are closed at first.

<ul>
  <li>
    <a href="#" onclick="openSubMenu()">item1</a>
    <ul>
      <li>
        <a href="./item2.html">item2</a>
      </li>
      <li>
        <a href="#" onclick="openSubMenu()">item3</a>
        <ul>
          <li>
            <a href="./item4.html">item4</a>
          </li>
          <li>
            <a href="#" onclick="openSubMenu()">item5</a>
            <ul>
              <li>
                <a href="./item6.html">item6</a>
              </li>
            </ul>
          </li>
        </ul>
      </li>
      <li>
        <a href="#" onclick="openSubMenu()">item7</a>
        <ul>
          <li>
            <a href="./item8.html">item8</a>
          </li>
        </ul>
      </li>
    </ul>
  </li>
  <li>
    <a href="#" onclick="openSubMenu()">item9</a>
    <ul>
      <li>
        <a href="./item10.html">item10</a>
      </li>
    </ul>
  </li>
  <li>
    <a href="#" onclick="openSubMenu()">item11</a>
    <ul>
      <li>
        <a href="./item11.html">item12</a>
      </li>
    </ul>
  </li>
</ul>

When I click an item that has a page link, the page loads in the iframe on the right; otherwise, the click opens the item's sub-menu.

I used a recursive function to open all sub-menus.

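import time

from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

# Method of my scraper class.
def tree_recursion(self, tree_container):
    tree_branches = tree_container.find_elements(By.XPATH, './li')
    for tree_branch in tree_branches:
        time.sleep(0.5)
        # Click the item: loads its page or opens its sub-menu.
        tree_branch.find_element(By.XPATH, './a').click()
        try:
            new_tree = tree_branch.find_element(By.XPATH, './ul')
            self.tree_recursion(new_tree)
        except NoSuchElementException:
            # Leaf item: no sub-menu to descend into.
            continue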

But it didn't work. The following error occurred:

File "...\run.py", line 105, in tree_recursion
    tree_branch.find_element(By.XPATH, './a').click()
  File "...\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webelement.py", line 433, in find_element
    return self._execute(Command.FIND_CHILD_ELEMENT, {"using": by, "value": value})["value"]
  File "...\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webelement.py", line 410, in _execute
    return self._parent.execute(command, params)
  File "...\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 444, in execute
    self.error_handler.check_response(response)
  File "...\AppData\Local\Programs\Python\Python310\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 249, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.StaleElementReferenceException: Message: stale element reference: element is not attached to the page document
  (Session info: chrome=109.0.5414.75)
Stacktrace:
Backtrace:
        (No symbol) [0x00B66643]
        (No symbol) [0x00AFBE21]
        (No symbol) [0x009FDA9D]
        (No symbol) [0x00A009E4]
        (No symbol) [0x00A008AD]
        (No symbol) [0x00A00B30]
        (No symbol) [0x00A30FAC]
        (No symbol) [0x00A3147B]
        (No symbol) [0x00A264C1]
        (No symbol) [0x00A4FDC4]
        (No symbol) [0x00A2641F]
        (No symbol) [0x00A500D4]
        (No symbol) [0x00A66B09]
        (No symbol) [0x00A4FB76]
        (No symbol) [0x00A249C1]
        (No symbol) [0x00A25E5D]
        GetHandleVerifier [0x00DDA142+2497106]
        GetHandleVerifier [0x00E085D3+2686691]
        GetHandleVerifier [0x00E0BB9C+2700460]
        GetHandleVerifier [0x00C13B10+635936]
        (No symbol) [0x00B04A1F]
        (No symbol) [0x00B0A418]
        (No symbol) [0x00B0A505]
        (No symbol) [0x00B1508B]
        BaseThreadInitThunk [0x7607FA29+25]
        RtlGetAppContainerNamedObjectPath [0x777D7A9E+286]
        RtlGetAppContainerNamedObjectPath [0x777D7A6E+238]

I've tried to solve this problem, but I couldn't find a solution because it needs a dynamic selector inside the tree recursion function.

What is the best way to build such a dynamic selector?

Or is there any other way to scrape this?

grudev

1 Answer


As you mentioned, all sub-menus are closed at first. So every time you invoke the click() method, a sub-menu becomes visible, i.e. the HTML DOM changes and more menu items become visible within the DOM tree.

After each click the HTML DOM changes, but your program keeps referring to the previously found elements. For example, your program still tries to use tree_branch even after you have invoked click(). Hence you see the StaleElementReferenceException.
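One way to avoid holding stale references is to keep only XPath strings between clicks and look each element up again right before using it. The following is a minimal sketch under some assumptions, not a drop-in fix: `expand_tree`, `ul_xpath`, and `pause` are hypothetical names, `driver` is assumed to be a Selenium 4 WebDriver already focused on the frame that contains the tree, and the root `<ul>` is assumed to be reachable by an XPath you supply. The locator string `"xpath"` is the value of Selenium's `By.XPATH`.

```python
import time

XPATH = "xpath"  # same string value as selenium.webdriver.common.by.By.XPATH


def expand_tree(driver, ul_xpath, pause=0.5):
    """Click every link under the <ul> at ul_xpath, recursing into sub-menus.

    No WebElement is kept across a click; only XPath strings are.
    Returns the XPaths of the clicked links, in click order.
    (Hypothetical helper, not part of Selenium.)
    """
    clicked = []
    # Only the number of branches is kept, not the elements themselves.
    count = len(driver.find_elements(XPATH, f"{ul_xpath}/li"))
    for idx in range(1, count + 1):  # XPath positions are 1-based
        li_xpath = f"{ul_xpath}/li[{idx}]"
        # Fresh lookup immediately before the click.
        links = driver.find_elements(XPATH, f"{li_xpath}/a")
        if not links:
            continue
        links[0].click()
        clicked.append(f"{li_xpath}/a")
        time.sleep(pause)  # give the page time to update after the click
        # find_elements returns an empty list when there is no sub-menu.
        if driver.find_elements(XPATH, f"{li_xpath}/ul"):
            clicked.extend(expand_tree(driver, f"{li_xpath}/ul", pause))
    return clicked
```

It would be called with the XPath of the outermost `<ul>` of the tree. The fixed pause mirrors the `time.sleep(0.5)` in the question; an explicit wait with `WebDriverWait` is likely to be more robust on a slow page.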

You can find a couple of relevant detailed discussions on StaleElementReferenceException in:

undetected Selenium
Thank you @undetected-selenium. You helped me understand `StaleElementReferenceException`, and I solved the problem. It works well now: `for idx in range(len(tree_branches)): WebDriverWait(self.browser, 30).until(ec.visibility_of_element_located((By.XPATH, xpath + '/li[' + str(idx + 1) + ']' + '/a'))).click()` Here the variable `xpath` is the XPath of the current node. See https://stackoverflow.com/a/34540096/14551577. Thank you again. – grudev Jan 18 '23 at 03:01