Why does the XPath of some elements change sometimes?

Question

I'm working on some automated actions for Instagram using Python and Selenium and sometimes my code crashes because of a NoSuchElementException. For example, when I first wrote a function for unfollowing a user, I used something like:

following_xpath = "//*[@id='react-root']/section/main/div/header/section/div[1]/div[2]/div/span/span[1]/button"

After a few runs, my code crashed because it couldn't find the element. Inspecting the page, I found that the XPath is now:

following_xpath = "//*[@id='react-root']/section/main/div/header/section/div[2]/div/div/div[2]/div/span/span[1]/button"

The only difference is that div[1]/div[2]/div became div[2]/div/div/div[2]. So I have two questions:

1. Why does this happen?
2. Is there a bulletproof method that guarantees I will always be getting the right XPath (or element)?

Not sure why the change occurs. Maybe some dynamically generated content that varies. A better way would be to find an element using some other conditions. Maybe by id, or by class + text, etc. Or finding an element close to the element you want using an id, and then using the xpath for the rest. Having such a long xpath is going to be pretty prone to failure to minor changes in the site. — fooiey, Sep 11 '20 at 00:15
Yeah it's hard sometimes to find a nice path, specially for Instagram's website where most elements don't have name or id, just crazy long ass classes. — everspader, Sep 11 '20 at 01:16
*"Yeah it's hard sometimes to find a nice path"* - You do not "find" XPath. You *write* it. By hand. After you have understood the page's structure. Yours looks like it was computer-generated, of course that will fail. — Tomalak, Sep 11 '20 at 11:30
@Tomalak Exactly, it's high time the myth needs to be busted. — undetected Selenium, Sep 11 '20 at 11:42

score 2 · Answer 1 · answered Sep 11 '20 at 08:13

The answer to (1) is simple: the page content has changed.

Firstly, the notion that there is "an XPath" for every element in a document is wrong: there are many (in fact, infinitely many) XPath expressions that will select a given element. You've probably generated these XPaths using a tool that tries to give you what it considers the most useful XPath expression, but it's not the only one possible.
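To make that concrete, here is a minimal sketch using Python's standard-library ElementTree (which supports only a subset of XPath) on a made-up fragment. The markup and the three expressions are assumptions for illustration, not Instagram's real DOM.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, loosely modelled on a profile header.
PAGE = """
<html><body>
  <div id="react-root">
    <header>
      <section>
        <div><h2>some_user</h2></div>
        <div><span><button>Following</button></span></div>
      </section>
    </header>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# Three different expressions that all address the same button.
expressions = [
    ".//*[@id='react-root']/header/section/div[2]/span/button",  # positional, tool-generated style
    ".//header//button",                                         # structural, no indexes
    ".//button[.='Following']",                                  # by visible text
]

matches = [root.find(expr) for expr in expressions]

# Every expression resolves to the very same element object.
same_element = all(m is matches[0] for m in matches)
print(same_element)
```

Which of these you pick is a design decision about what you expect to stay stable, not a property of the element.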

The best XPath expression to use is one that isn't going to change when the content of the page changes: but it's very hard for any tool to give you that, because it has no idea what's likely to change in the page content.

Using an @id attribute value (which these paths do) is more likely to be stable than using numeric indexing (which these paths also do), but that's based on guesses about what's likely to change, and those guesses can always be wrong. The only way of writing an XPath expression that continues to do "the right thing" when the page changes is to correctly guess what aspects of the page structure are going to vary and what parts are going to remain stable. So the only "bulletproof" answer (2) is to understand not just the current page structure, but its invariants over time.
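As a rough illustration of the difference, the sketch below evaluates a positional path and a text-based path against two hypothetical snapshots of a page, where the second snapshot adds wrapper elements much like the change described in the question. The markup, the paths and the `locate` helper are all invented for this example, and it uses the standard-library ElementTree rather than a browser.

```python
import xml.etree.ElementTree as ET

# Two hypothetical snapshots of the same page. The second one adds a banner
# and an extra wrapper <div> around the block that holds the button.
BEFORE = """
<div id="react-root"><header><section>
  <div><div>avatar</div><div><span><button>Following</button></span></div></div>
</section></header></div>
"""
AFTER = """
<div id="react-root"><header><section>
  <div>banner</div>
  <div><div><div>avatar</div><div><span><button>Following</button></span></div></div></div>
</section></header></div>
"""

BRITTLE = "./header/section/div[1]/div[2]/span/button"  # depends on exact nesting and position
STABLE = ".//button[.='Following']"                     # depends only on the button's text


def locate(markup, path):
    """Parse the markup and return the first element matching path, or None."""
    return ET.fromstring(markup).find(path)


for name, markup in (("before", BEFORE), ("after", AFTER)):
    brittle_hit = locate(markup, BRITTLE) is not None
    stable_hit = locate(markup, STABLE) is not None
    print(f"{name}: brittle={brittle_hit} stable={stable_hit}")
```

The positional path matches only the first snapshot, while the text-based path matches both. It is still only as good as the guess behind it: if the button's label changed instead of its position, the outcome would be reversed.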

undetected Selenium · Accepted Answer · 2020-09-11T11:51:30.427

It's high time we bust the myth that XPath changes.

Locator strategies such as xpath and css-selectors are written by the user, and the more canonically a locator is constructed, the more durable it is.

XML Path Language (XPath)

XPath 3.1 is an expression language that allows the processing of values conforming to the data model defined in XQuery and XPath Data Model (XDM) 3.1. The name of the language derives from its most distinctive feature, the path expression, which provides a means of hierarchic addressing of the nodes in an XML tree. As well as modeling the tree structure of XML, the data model also includes atomic values, function items, and sequences. This version of XPath supports JSON as well as XML, adding maps and arrays to the data model and supporting them with new expressions in the language and new functions in XQuery and XPath Functions and Operators 3.1.

Selectors

CSS (Cascading Style Sheets) is a language for describing the rendering of HTML and XML documents on screen, on paper, in speech, etc. CSS uses Selectors for binding style properties to elements in the document. These expressions can also be used, for instance, to select a set of elements, or a single element from a set of elements, by evaluating the expression across all the elements in a subtree.

This use case

As per your code trials:

following_xpath = "//*[@id='react-root']/section/main/div/header/section/div[1]/div[2]/div/span/span[1]/button"

and

following_xpath = "//*[@id='react-root']/section/main/div/header/section/div[2]/div/div/div[2]/div/span/span[1]/button"

Here are a couple of takeaways:

The DOM Tree contains React elements. So it is quite clear that the app uses ReactJS. React is a declarative, efficient, and flexible JavaScript library for building user interfaces. It lets you compose complex UIs from small and isolated pieces of code called components.
The xpaths are absolute xpaths.
The xpaths contain indexes.

So, the application is dynamic in nature, and elements are liable to be added and moved within the HTML DOM whenever a DOM event fires.

Solution

In such cases, when the application is built on a framework that renders the DOM dynamically (ReactJS in this case), the canonical approach is to construct relative and/or dynamic locators and to wait for the element with WebDriverWait. Some examples:

To interact with the username field on the Instagram login page:

WebDriverWait(browser, 20).until(EC.element_to_be_clickable((By.CSS_SELECTOR, "input[name='username']"))).send_keys("anon")

You can find a detailed discussion in Filling in login forms in Instagram using selenium and webdriver (chrome) python OSX

To locate the first line of the address just below the text FIND US on Facebook:

WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//span[normalize-space()='FIND US']//following::span[2]")))

You can find a detailed discussion in Decoding Class names on facebook through Selenium

Interacting with GWT elements:

WebDriverWait(driver, 20).until(EC.element_to_be_clickable((By.XPATH, "//div[@title='Viewers']//preceding::span[1]//label"))).click()

You can find a detailed discussion in How to click on GWT enabled elements using Selenium and Python
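Applied to the button from the question, the same pattern might look like the sketch below. It assumes the button's visible text is "Following", which is a guess about Instagram's markup, and `button_by_text_xpath` and `click_button` are hypothetical helper names introduced here for illustration.

```python
def button_by_text_xpath(text):
    """Build a relative XPath that matches a <button> by its visible text."""
    if "'" in text:
        # Kept simple on purpose: quoting such text would need XPath concat().
        raise ValueError("text containing a single quote is not supported here")
    return f"//button[normalize-space()='{text}']"


def click_button(driver, text, timeout=20):
    """Wait up to `timeout` seconds for the button to be clickable, then click it."""
    # Imported here so the locator helper above stays usable without Selenium installed.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, button_by_text_xpath(text))
    WebDriverWait(driver, timeout).until(EC.element_to_be_clickable(locator)).click()
```

With an already logged-in `driver` on a profile page, the call would be `click_button(driver, "Following")`. The locator no longer depends on how many div elements wrap the button, but it does depend on the label, so it would need adjusting if the text or the page language differs.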
