0
driver.find_element_by_xpath("very long xpath").click()

I have a number of these "find an element and interact with it" actions because I am trying to automate a task in a browser. What are some ways to store the long XPath locators in variables, both to keep track of where the code is in the task and to make the code more readable?

paths:

/html/body/main/div[2]/div[6]/div[3]/div[4]/div/div[2]/table/tbody/tr[1]/td[2]
/html/body/main/div[2]/div[6]/div[3]/div[4]/div/div[2]/table/tbody/tr/td/ul/li[1]/h3/a[3]

Thank you!

JeffC
plx23
  • 1
    Unless you post an example of the XPath, it would be very difficult to suggest changes. There are a lot of options available, for example relative XPaths and predicates, which you can try out. – Vishal Gada Jan 28 '20 at 07:46
  • 1
    It depends on the html, there is no global way to shorten all existing xpath. – Guy Jan 28 '20 at 07:47
  • Added the actual paths – plx23 Jan 28 '20 at 07:51
  • Learn the basics of HTML/CSS and JavaScript to get a better grip on web automation with Python. One way to simplify things is to store the XPaths in a list or in variables, which makes it easier to pass them around wherever needed. – ReaperK0v Jan 28 '20 at 08:14
  • Your Location Path relies heavily on position predicates to make it select unique nodes. That's the most common approach when you know nothing about the document semantic (identifiers, order of appearance in the document, etc.) – Alejandro Jan 28 '20 at 14:36
  • Instead of providing absolute XPath here, image/screenshot of your inspector tab will help us to give you more suggestions. – Sathish Jan 29 '20 at 15:45
  • I guess what I want to ask is can you take that path, ID, class name etc. of an element and give it a variable name? And use that simple var name to make it easier for me to see where the code is and where it's not working? Or is that not a possibility? Again sorry if it's a dumb question but I am very new to coding :) – plx23 Jan 29 '20 at 17:03

3 Answers

3

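You can assign a locator to a variable. One example in Python would be

from selenium.webdriver.common.by import By

loginButtonLocator = (By.ID, 'login')

You can then use that to locate an element. Unpack the tuple with *, since find_element takes the strategy and the value as two separate arguments:

driver.find_element(*loginButtonLocator)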

For your specific XPath example, you can use

reallyLongXpathLocator = (By.XPATH, '/html/body/main/div[2]/div[6]/div[3]/div[4]/div/div[2]/table/tbody/tr[1]/td[2]')

and use it like

driver.find_element(*reallyLongXpathLocator)

See Locating Elements for more info.
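
A minimal sketch of how this could look across a whole task, assuming Selenium's find_element(by, value) call. The names XPATH, RESULT_CELL, THIRD_LINK, and click are hypothetical and chosen only for illustration. The string "xpath" is the value of By.XPATH, written out here so the snippet has no import requirements.

```python
# Strategy string; in Selenium this is By.XPATH
# (from selenium.webdriver.common.by import By).
XPATH = "xpath"

# Hypothetical names for the two XPaths from the question.
RESULT_CELL = (
    XPATH,
    "/html/body/main/div[2]/div[6]/div[3]/div[4]/div/div[2]/table/tbody/tr[1]/td[2]",
)
THIRD_LINK = (
    XPATH,
    "/html/body/main/div[2]/div[6]/div[3]/div[4]/div/div[2]/table/tbody/tr/td/ul/li[1]/h3/a[3]",
)


def click(driver, locator):
    """Find the element for a (strategy, value) locator and click it."""
    by, value = locator
    driver.find_element(by, value).click()
```

With a real WebDriver instance this would be used as click(driver, RESULT_CELL), so each step of the task reads as a name rather than a path.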

As others have suggested, really long locators and absolute locators (both descriptions apply to the XPaths you posted) are fragile. The smallest change to the HTML structure will break them, and you will have to recreate them. Learning how to handcraft your own locators is something you should look into. There are lots of blogs and articles on the web that can help you there with a little googling.

I would suggest that you look into page objects. They make writing automation (and storing locators, etc.) much easier and more organized. See Page Objects for more info.
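
A rough sketch of what a page object could look like for the two XPaths in the question. The class name ResultsPage and the attribute and method names are hypothetical, since the page itself is not shown. The locators are (strategy, value) tuples, and "xpath" is the value of Selenium's By.XPATH.

```python
class ResultsPage:
    """Hypothetical page object: locators live here, task steps become methods."""

    # (strategy, value) tuples; "xpath" is the value of selenium's By.XPATH.
    FIRST_ROW_CELL = (
        "xpath",
        "/html/body/main/div[2]/div[6]/div[3]/div[4]/div/div[2]/table/tbody/tr[1]/td[2]",
    )
    FIRST_ITEM_LINK = (
        "xpath",
        "/html/body/main/div[2]/div[6]/div[3]/div[4]/div/div[2]/table/tbody/tr/td/ul/li[1]/h3/a[3]",
    )

    def __init__(self, driver):
        self.driver = driver

    def click_first_row_cell(self):
        # Unpack the tuple into find_element(by, value).
        self.driver.find_element(*self.FIRST_ROW_CELL).click()

    def click_first_item_link(self):
        self.driver.find_element(*self.FIRST_ITEM_LINK).click()
```

The script then reads as ResultsPage(driver).click_first_row_cell(), and a locator that breaks only has to be fixed in one place.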

JeffC
1

There is no perfect answer here, since it depends a lot on the web page you're trying to extract from. These should be the "safest" to use in your case, assuming there's only one table:

//tbody/tr[1]/td[2]
//tbody//li[1]/h3/a[3]
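
To illustrate why the "only one table" assumption matters, here is a small sketch using Python's standard xml.etree.ElementTree, which supports a subset of XPath. The markup is made up for the example and is not the asker's page, and match_texts is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed markup standing in for the real page.
ONE_TABLE = """
<html><body><main>
  <table><tbody>
    <tr><td>a1</td><td>a2</td></tr>
    <tr><td>b1</td><td>b2</td></tr>
  </tbody></table>
</main></body></html>
"""

# The same page with a second table added before </main>.
TWO_TABLES = ONE_TABLE.replace(
    "</main>",
    "<table><tbody><tr><td>x1</td><td>x2</td></tr></tbody></table></main>",
)


def match_texts(markup, path):
    """Return the text of every element that `path` selects in `markup`."""
    root = ET.fromstring(markup)
    # ElementTree needs a leading "." for a descendant search from the root.
    return [element.text for element in root.findall("." + path)]


print(match_texts(ONE_TABLE, "//tbody/tr[1]/td[2]"))   # ['a2']
print(match_texts(TWO_TABLES, "//tbody/tr[1]/td[2]"))  # ['a2', 'x2']
```

With two tables the short path matches one cell per table. Selenium's find_element would return the first match in that case, which may or may not be the one you want.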
E.Wiest
-1

There is no one-step solution to shorten or simplify an xpath. The real challenge is to construct a relative xpath, in other words, to convert the absolute xpath into a relative xpath.


Demonstration

As an example, we will demonstrate how to write a relative xpath for the <input> element with the text A submit button within the Widget Buttons section of the page https://demoqa.com/button/, which is as follows:

[image: input_element]


Steps

Follow these steps:

  1. Open the URL https://demoqa.com/button/ in the Google Chrome browser.
  2. Press F12 or Ctrl + Shift + I to open DevTools.
  3. Within the Elements tab, click the element inspector tool:

[image: element_inspector]

  4. Hover the mouse over the desired element; it gets highlighted within the DOM tree:

[image: desired_element]

  5. Right-click the element within the HTML, select Copy, then select Copy XPath:

[image: copy_xpath]

  6. You will get the absolute xpath as:

    //*[@id="content"]/div[2]/div/input

  7. To construct the relative xpath, press Ctrl + F to open the search box:

[image: html_search_box]

  8. Now, to start writing the xpath: as the element is an <input> element, search for the <input> tags first, as follows:

    //input

  9. The search finds all the <input> tags and shows the count in the bottom-right corner as 1 of 2, which means that 2 <input> tags were found. The first matched element is highlighted in the DOM tree as well as on the webpage:

[image: input_tags]

  10. So there are 2 <input> tags in total on the webpage, and we now have to use the element's attributes to identify the desired element uniquely.
  11. As the desired element has the value attribute set to A submit button, add that attribute to your search criteria, as follows:

    //input[@value='A submit button']

  12. You will observe that even after adding the value attribute, the element can't be uniquely identified.
  13. To identify the element uniquely, traverse up the HTML a bit and add the ancestor element <div class="widget"> at the beginning of the current xpath, as below:

    //div[@class='widget']//input[@value='A submit button']

[image: unique_element]

  14. This XPath identifies the desired element uniquely within the HTML and is a perfect example of a relative xpath.
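
The narrowing process in the steps above can be sketched offline with Python's standard xml.etree.ElementTree, which supports a subset of XPath. The markup below is a hypothetical stand-in built to match the description (two <input> tags with the same value, only one inside <div class="widget">). It is not the real demoqa.com page, and count is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup, NOT the real demoqa.com page: two <input> elements
# share the same value, and only one sits inside <div class="widget">.
PAGE = """
<html><body>
  <div id="content">
    <div class="widget"><input type="submit" value="A submit button"/></div>
    <div class="sidebar"><input type="submit" value="A submit button"/></div>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)


def count(path):
    """Number of elements matched by `path`, searching from the root."""
    # ElementTree needs a leading "." for a descendant search from the root.
    return len(root.findall("." + path))


print(count("//input"))                                                  # 2
print(count("//input[@value='A submit button']"))                        # 2
print(count("//div[@class='widget']//input[@value='A submit button']"))  # 1
```

Each added condition is checked by its match count. The locator is ready once it matches exactly one element.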
undetected Selenium
  • No. `//*[@id="content"]/div[2]/div/input` is an absolute location path the same as `//div[@class='widget']//input[@value='A submit button']`. It means that both traverse the document from the root. – Alejandro Jan 28 '20 at 14:41
  • @Alejandro Neither of your examples are absolute paths. An absolute path starts from `/html`. In a relative XPath, the *search* starts from the root but it's not an absolute XPath. You can do some googling and find more info but [here's a link](https://stackoverflow.com/questions/27183353/what-is-the-difference-between-absolute-and-relative-xpaths-which-is-preferred) that has some more info. – JeffC Jan 29 '20 at 22:57
  • @JeffC Don't take for granted my words. Learn from the specs itself https://www.w3.org/TR/1999/REC-xpath-19991116/#NT-AbsoluteLocationPath . Bottom line: a location path is an absolute location path when is _deterministic_, it doesn't depend of context node. – Alejandro Jan 30 '20 at 13:24
  • @Alejandro From your own link... `An absolute location path consists of / optionally followed by a relative location path. A / by itself selects the root node of the document containing the context node.` So what I said, `An absolute path starts from /html` fits that definition exactly. Neither of your examples start with `/` therefore are *not* absolute so they must be relative. – JeffC Jan 30 '20 at 14:29
  • @JeffC This is my last comment. Please read https://www.w3.org/TR/1999/REC-xpath-19991116/#NT-AbbreviatedAbsoluteLocationPath – Alejandro Jan 30 '20 at 14:56
  • @Alejandro I think it's confusing because your link is pointing to an AbbreviatedAbsoluteLocationPath not an AbsoluteLocationPath. The bottom of Section 2.0 is where absolute and relative are defined (what I quoted from), you will see that, `AbsoluteLocationPath ::= '/' RelativeLocationPath? | **AbbreviatedAbsoluteLocationPath**`. You can have a "full" absolute path like `/html/table/tr/td/a[@id='abc']` or an *abbreviated* absolute path like `/html//td/a[@id='abc']`. They are both absolute because they both start at `/html` but the 2nd is abbreviated because it uses `//` to skip levels. – JeffC Jan 30 '20 at 16:49