
When using Selenium, I usually call driver.find_element(By.XPATH, ) rather than driver.find_element(By.CSS_SELECTOR, ). I find it easier to copy an element's XPath than to work out the HTML structure of the website.

But I have run into a problem. Recently I noticed that most of my scripts that use XPath stop working because the XPath tends to change. Is there a way to fix this? And is there a difference between XPath and full XPath?

devCharaf
  • Correct me if I am wrong, but I guess the `CSS_SELECTOR` may also change, depending on the website owner, though maybe less often than the `XPATH` – devCharaf Feb 01 '22 at 15:22
  • @jaSON It does answer the second part of the question thanks! – devCharaf Feb 01 '22 at 19:58

2 Answers


This is a basic problem with screen scraping. The information on an HTML page is designed for human users, not for software access, and it will change over time based on the perceived needs of human users, ignoring the needs of screen scrapers.

You haven't said what you're using Selenium for. The two main uses are (a) software testing (to check that your software is displaying the screen correctly) and (b) scraping data from third-party web sites. The strategy for solving the problem is different for the two cases.

For testing, try to test as much of the functionality of your application as possible using unit tests that don't rely on looking at the HTML; only look at the HTML where you actually need to test the user interface. For those tests, you're going to have to face the fact that when the HTML changes, the tests have to change.

For extracting data from third-party web sites, use a published API to the data in preference to screen-scraping if you possibly can; even if you have to pay for access, it will be cheaper in the long run. Scraping the data off HTML pages is inefficient, and it leaves you completely exposed to unannounced changes to the screen appearance.

Having said that, there are ways of writing XPath that make it more resilient to such changes, but only if you guess correctly which aspects of the page are likely to change and which are likely to remain stable. The difference is not between "xpath" and "full xpath" as you suggest; rather, there are different ways of writing XPath expressions, and some are more resilient to changes in the HTML than others. Clearly, for example, `//tr[td[1]='London']/td[2]` is more likely to keep working than `//div[3]/div[1]/table[9]/tbody/tr[43]/td[2]`.
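
As a rough illustration of that last point, the sketch below runs a positional path and a content-anchored path against two versions of a made-up page. It uses Python's standard-library `xml.etree.ElementTree` rather than Selenium so that it runs without a browser. That module supports only a subset of XPath, so the paths are written relative to the root element and the predicate is simplified to `td='London'`. The markup, city names, and numbers are all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical page before a redesign.
PAGE_V1 = """
<html><body>
  <div>
    <table>
      <tr><td>Paris</td><td>2.1</td></tr>
      <tr><td>London</td><td>8.9</td></tr>
    </table>
  </div>
</body></html>
"""

# Hypothetical page after a redesign: a banner <div> was added, the table
# moved inside a <section>, and a new row was inserted above London.
PAGE_V2 = """
<html><body>
  <div><p>New banner</p></div>
  <div>
    <section>
      <table>
        <tr><td>Berlin</td><td>3.6</td></tr>
        <tr><td>Paris</td><td>2.1</td></tr>
        <tr><td>London</td><td>8.9</td></tr>
      </table>
    </section>
  </div>
</body></html>
"""

# Positional path: encodes the exact nesting and row number.
BRITTLE = "./body/div[1]/table/tr[2]/td[2]"
# Content-anchored path: "the second cell of the row that has a London cell".
RESILIENT = ".//tr[td='London']/td[2]"


def lookup(page, xpath):
    """Return the text of the first matching element, or None if no match."""
    root = ET.fromstring(page)
    cell = root.find(xpath)
    return None if cell is None else cell.text


for name, page in (("v1", PAGE_V1), ("v2", PAGE_V2)):
    print(name, "brittle:", lookup(page, BRITTLE),
          "resilient:", lookup(page, RESILIENT))
```

With these two made-up pages, the positional path finds the cell only in the first version, while the content-anchored path finds it in both. In Selenium the full expression could be passed directly, as in `driver.find_element(By.XPATH, "//tr[td[1]='London']/td[2]")`.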

But the best advice is, if you want to write an application that's resilient to change, steer clear of screen scraping entirely.

Michael Kay

You have to learn how to create correct locators.
Automatically generated locators are extremely fragile, which makes them almost useless.
This applies to both automatically generated XPath and CSS Selector locators.
Good locators will make your code much more stable, but any Selenium-based code still needs maintenance after front-end developers change the page structure and the elements on the page.
As for XPaths, there are generally two kinds: relative and absolute.
An absolute XPath defines the full, explicit path from the top of the page to a specific element node,
while a relative XPath defines a short, unique locator for an element node.
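
To make the absolute/relative distinction concrete, here is a small sketch that locates the same element both ways. It uses Python's standard-library `xml.etree.ElementTree` (a subset of XPath, with paths written relative to the root element) so it runs without a browser. The markup and the `data-test-id` attribute are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical page markup.
PAGE = """
<html>
  <body>
    <div class="header"><a href="/">Home</a></div>
    <div class="content">
      <form>
        <input name="q" />
        <button data-test-id="search-submit">Search</button>
      </form>
    </div>
  </body>
</html>
"""

# Absolute style: every step from the top of the page down to the button.
ABSOLUTE = "./body/div[2]/form/button"
# Relative style: any <button> with a (hypothetical) unique attribute value.
RELATIVE = ".//button[@data-test-id='search-submit']"

root = ET.fromstring(PAGE)
absolute_hit = root.find(ABSOLUTE)
relative_hit = root.find(RELATIVE)

# Both expressions reach the same node, but only the absolute one
# depends on the number and order of the enclosing elements.
print(absolute_hit is relative_hit, relative_hit.text)
```

In Selenium the relative form would be written as `driver.find_element(By.XPATH, "//button[@data-test-id='search-submit']")`, assuming the page really carries such an attribute.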

Prophet
  • Then what would be a good locator? Is there a generally good locator, or is it more case-based? – devCharaf Feb 01 '22 at 15:35
  • There is no generally good locator. In the simplest case you look at the element's attributes and find one with a unique value, such as a unique class name or a combination of class names that makes the element unique. Very often it will be a more complicated combination: a parent element with a specific tag name and a combination of class names, `src`, `test-id`, or other attributes, AND a tag name and attribute values of the child element that together make the relationship unique. – Prophet Feb 01 '22 at 15:40
  • But this is still a dependency between 2 elements, not an explicit path through 15 elements where any change along the way will break your locator – Prophet Feb 01 '22 at 15:41
  • Just Google `how to create good xpath locators`; it will give you several tutorials – Prophet Feb 01 '22 at 15:47