HTML pages are documents whose elements are arranged in a tree.
In general
Selenium uses element locators to find elements on a page. Locators work lazily: when you look up an element, Selenium first checks whether it is cached. If it is not, Selenium uses SearchContext, which finds all matching elements within the current context (e.g. a DOM element) using a given mechanism, for example XPathEvaluator.
SearchContext runs findElement() if you are looking for one element or findElements() if you are looking for more than one.
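The difference between the two calls can be sketched with a toy model. This is not Selenium's code: ToyNode and its tag-only lookup are hypothetical stand-ins, written to mirror the behaviour of the Java bindings, where findElement() throws when nothing matches and findElements() returns an empty list. The sketch uses java.util.NoSuchElementException, whereas Selenium has its own exception class of the same name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical toy model of a search context; not Selenium's real classes.
final class ToyNode {
    final String tag;
    final List<ToyNode> children = new ArrayList<>();

    ToyNode(String tag, ToyNode... kids) {
        this.tag = tag;
        for (ToyNode kid : kids) {
            children.add(kid);
        }
    }

    // Collects every descendant with the given tag, in document order.
    // The search is limited to this node's subtree (the "current context").
    List<ToyNode> findElements(String wantedTag) {
        List<ToyNode> found = new ArrayList<>();
        for (ToyNode child : children) {
            if (child.tag.equals(wantedTag)) {
                found.add(child);
            }
            found.addAll(child.findElements(wantedTag));
        }
        return found;
    }

    // Returns the first match, or throws when nothing matches.
    ToyNode findElement(String wantedTag) {
        List<ToyNode> found = findElements(wantedTag);
        if (found.isEmpty()) {
            throw new NoSuchElementException("no <" + wantedTag + "> in this context");
        }
        return found.get(0);
    }

    public static void main(String[] args) {
        ToyNode body = new ToyNode("body",
            new ToyNode("div", new ToyNode("a"), new ToyNode("a")),
            new ToyNode("p"));

        // Searching from <body> sees both links.
        System.out.println(body.findElements("a").size());

        // Searching from <div> cannot see <p>, which lies outside that context.
        ToyNode div = body.findElement("div");
        System.out.println(div.findElements("p").size());
    }
}
```

Because the search starts from whichever node you call it on, narrowing the context first (here, to the div) is a cheap way to keep a locator from matching elements elsewhere on the page.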
In simple terms, findElement() first tries to run a JavaScript snippet that finds the element asynchronously. If it can't, it looks the element up directly using an interestingly named method, xpathWizardry, i.e. by evaluating the expression with XPathEvaluator.
XPath
XPath (XML Path Language) in Selenium is simply a way to navigate the hierarchical structure of an XML-like document, such as HTML.
XPath uses a non-XML syntax to provide a flexible way of pointing to different parts of an XML document.
Internally, Selenium uses the W3C XPathEvaluator, which evaluates XPath expressions.
You can study XPathEvaluator source code here.
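To make the idea of evaluating an XPath expression against a tree concrete, here is a minimal sketch that uses the JDK's built-in javax.xml.xpath API rather than Selenium or the browser's XPathEvaluator. The XPathDemo class, the PAGE markup, and the parse and select helpers are made up for illustration. The markup is deliberately well-formed XML, since the JDK parser does not accept arbitrary HTML.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Illustrative stand-in: evaluates XPath with the JDK, not with Selenium.
final class XPathDemo {
    // A tiny, well-formed page used only for this example.
    static final String PAGE =
        "<html><body>"
      + "<div id='menu'><a href='/home'>Home</a><a href='/about'>About</a></div>"
      + "<div id='content'><p>Hello</p></div>"
      + "</body></html>";

    // Parses the markup into a DOM tree.
    static Document parse(String markup) throws Exception {
        return DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new InputSource(new StringReader(markup)));
    }

    // Evaluates the expression against the document and returns all matching nodes.
    static NodeList select(Document doc, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(PAGE);

        // Every <a> that is a direct child of the <div> whose id is "menu".
        NodeList links = select(doc, "//div[@id='menu']/a");
        for (int i = 0; i < links.getLength(); i++) {
            System.out.println(links.item(i).getTextContent());
        }
    }
}
```

The same expression string, //div[@id='menu']/a, could be passed to Selenium through By.xpath(...); only the engine that evaluates it differs.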