
Is it possible, using JavaScript, to get the absolute XPath of a DOM node given only its relative XPath?

The closest thing I have found is about getting the XPath of a node: Get Xpath from the org.w3c.dom.Node

Moreover, I get relative xpaths for nodes without unique identifiers. Any ideas how to work around this?

For example, for:

 <p>DUKE SOLINUS</p>
 <div>
 <p>AEGEON</p> 
 </div>

and the selected words "DUKE" and "AEGEON", I get the XPaths "/p[1]" and "/div[1]/p[1]" respectively. When passed to document.evaluate, both evaluate to the same node, "DUKE SOLINUS". I only have these relative XPaths; I do not have the nodes. What I want is to evaluate these relative XPaths to the correct nodes, i.e. the two different nodes in this case (DUKE SOLINUS and AEGEON).

To be more specific, xpaths are taken from json annotation object: http://docs.annotatorjs.org/en/v1.2.x/annotation-format.html

Any help would be appreciated! Thanks a lot!

Maksim
  • This might be one of those questions where, whatever elegant answer you come up with, I can build a scenario where it will fail. Assigning IDs to every single node would work, but that would suck. Why do you need the absolute xpath anyway? Perhaps there's a solution where you don't. – James Mar 30 '16 at 23:00
  • Thanks a lot for the answer! I was afraid to hear that, but I still hope there will be an answer. I have a relative path of a selected text on a page (I also have that text - not particularly useful), and I need to replace that text with some other text. As relative path I get is not always unique (I will update my question with the example), document.evaluate function will evaluate to the same node. – Maksim Mar 30 '16 at 23:27

2 Answers


This function builds an absolute XPath for a given node from each ancestor's position among its siblings. If the order of any nodes changes, or nodes are added or deleted, the path will no longer match. For static content it should be fine.

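function getXpathOfNode(domNode, bits) {
  // Uses positional indexes, so the path breaks if nodes are added, moved, or removed. Fine for static content.
  bits = bits || [];
  var name = domNode.nodeName;
  var parent = domNode.parentNode;

  if (parent) {
    // Count only siblings with the same name. XPath's name[n] is relative to
    // the parent's children, whereas getElementsByTagName also counts nested
    // descendants and can yield the wrong index.
    var index = 0;
    var count = 0;
    for (var sib = parent.firstChild; sib; sib = sib.nextSibling) {
      if (sib.nodeName === name) {
        count++;
        if (sib === domNode) index = count;
      }
    }
    if (count > 1) {
      name += "[" + index + "]";
    }
    bits.push(name);
    return getXpathOfNode(parent, bits);
  }

  // domNode is #document here; it is not part of the path.
  return bits.reverse().join("/");
}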
// testing
var xp = getXpathOfNode(document.getElementById("pickme"));
var r = document.getElementById("result");
r.innerHTML = "the xpath is " + xp + "<br>";
var result =  document.evaluate(xp, document, null, XPathResult.ANY_TYPE, null);
r.innerHTML += "the chosen node contains - " + result.iterateNext().innerHTML;
<div>
<div>junk</div>
<span>irrelevant</span>
<div>pointless
<div>
<div id="pickme">this is the test node</div><span>don't read this</span></div>
<br><br>
<div id="result"></div>
</div>
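
The sibling-counting idea can also be checked outside a browser. The following is only a sketch: `makeNode` and `pathOf` are hypothetical helpers, and the plain objects merely imitate the `nodeName`, `parentNode`, and `childNodes` properties of DOM nodes. It rebuilds the structure from the question and shows that the two `p` elements get different absolute paths.

```javascript
// Plain-object stand-in for a DOM node (hypothetical helper, not a DOM API).
function makeNode(nodeName, children = []) {
  const node = { nodeName, parentNode: null, childNodes: children };
  children.forEach((child) => { child.parentNode = node; });
  return node;
}

// Walk up from the node, counting same-named siblings at each level.
function pathOf(node) {
  const steps = [];
  for (let cur = node; cur.parentNode; cur = cur.parentNode) {
    const same = cur.parentNode.childNodes.filter(
      (n) => n.nodeName === cur.nodeName
    );
    // Add an index only when the name alone would be ambiguous.
    const step = same.length > 1
      ? `${cur.nodeName}[${same.indexOf(cur) + 1}]`
      : cur.nodeName;
    steps.unshift(step);
  }
  return "/" + steps.join("/");
}

// The structure from the question: <p>DUKE SOLINUS</p><div><p>AEGEON</p></div>
const duke = makeNode("p");
const aegeon = makeNode("p");
const div = makeNode("div", [aegeon]);
const body = makeNode("body", [duke, div]);
const html = makeNode("html", [body]);
const doc = makeNode("#document", [html]); // root; not part of the path

console.log(pathOf(duke));   // /html/body/p
console.log(pathOf(aegeon)); // /html/body/div/p
```

In a real page the same walk would use actual DOM nodes, and the resulting strings could be passed to document.evaluate.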
James
  • thanks a lot, but now I am not sure if i explained properly what i need (i updated the question again :)). the function you suggested accepts a node and then works out its xpath. what i need is to get the node when having only a relative xpath. document.evaluate will not always return me a unique node (such as in the given example). that is why I thought it might be easier to get absolute xpaths from given relative ones and then use document.evaluate. – Maksim Mar 31 '16 at 10:41
  • Do you have the node to which your xpaths are relative? If it were, say, the element with id `content`, you could do `document.evaluate(relativeXpath, document.getElementById('content'), ...`
    – James Mar 31 '16 at 12:22
  • I am not sure if I can get it, but I will check it out. Thanks for suggestion! – Maksim Mar 31 '16 at 12:35
  • @Maksim Did you try my answer? It takes relative xpath and returns the absolute xpath. – John Yepthomi Jul 23 '23 at 16:33
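
If the element the relative XPaths are anchored to can be found, and its own absolute XPath is known, one possible way to get absolute paths is plain string concatenation. This is a sketch under assumptions: `resolveXPath` is a hypothetical helper, the root path `/html/body/div[2]` is made up for illustration, and the relative paths are assumed to be simple child-step paths like the ones in the question (a path starting with `//` would change meaning if handled this way).

```javascript
// Join an absolute XPath for the anchor element with a path relative to it.
// Both arguments are plain strings; nothing here touches the DOM.
function resolveXPath(rootPath, relativePath) {
  const root = rootPath.replace(/\/+$/, "");    // drop trailing slashes
  const rel = relativePath.replace(/^\/+/, ""); // drop leading slashes
  if (!rel) return root || "/";
  return `${root}/${rel}`;
}

// Hypothetical anchor path; the relative paths are the ones from the question.
console.log(resolveXPath("/html/body/div[2]", "/p[1]"));
// /html/body/div[2]/p[1]
console.log(resolveXPath("/html/body/div[2]", "/div[1]/p[1]"));
// /html/body/div[2]/div[1]/p[1]
```

The alternative from the comment above, passing the anchor element as the context node to document.evaluate, avoids building absolute paths at all; in that case the relative path must not start with a slash.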

Backstory: I was working on a project where I needed the relative XPath from a target parent node to one of its descendants. I modified the answer provided by @James above to get what I wanted. Here's the code to get the relative XPath from a parent node (mainNode) to a child node. Tags can be omitted when calling the function; it is only used by the recursive calls the function makes to itself.

function getRelativeXPathToChild(childNode, mainNode, Tags) {
  // Tags accumulates the path steps across recursive calls; callers can omit it.
  Tags = Tags || [];
  const mainParent = mainNode.parentNode;
  const currParent = childNode.parentNode;
  let currTag = childNode.tagName;

  if (currParent && mainParent !== currParent) {
    // Only direct children with the same tag count towards the XPath index.
    const els = Array.from(currParent.children).filter(
      (el) => el.tagName === currTag
    );
    if (els.length > 1) {
      currTag += "[" + (els.indexOf(childNode) + 1) + "]";
    }

    if (currTag) Tags.push(currTag);
    // Call the function directly: `this` is undefined in strict mode and in modules.
    return getRelativeXPathToChild(currParent, mainNode, Tags);
  }

  return Tags.reverse().join("/");
}

Now, regarding your case and the example you provided, where you only have the relative XPaths and no element nodes to work with: you want to get the absolute XPath from the relative XPaths you already have. What you can do is get the element from the relative XPath and then use that element to get its absolute XPath. Hope it helps.

/** 
* Get The Element from the relative XPath
* Add double forward slashes at the beginning of the relative XPath
*/ 
const el = getElementByXpath(`//div[1]/p[1]`);
const absolutePath = getAbsoluteXPathFromNode(el);

function getElementByXpath(path, node) {
  return document.evaluate(
    path,
    node ? node : document,
    null,
    XPathResult.FIRST_ORDERED_NODE_TYPE,
    null
  ).singleNodeValue;
}

/**
 *  The function below is from another answer on this site.
 * @ref : https://stackoverflow.com/questions/9197884/how-do-i-get-the-xpath-of-an-element-in-an-x-html-file
 */
function getAbsoluteXPathFromNode(node) {
  var comp,
    comps = [];
  var parent = null;
  var xpath = "";
  var getPos = function (node) {
    var position = 1,
      curNode;
    if (node.nodeType === Node.ATTRIBUTE_NODE) {
      return null;
    }
    for (
      curNode = node.previousSibling;
      curNode;
      curNode = curNode.previousSibling
    ) {
      if (curNode.nodeName === node.nodeName) {
        ++position;
      }
    }
    return position;
  };

  if (node instanceof Document) {
    return "/";
  }

  for (
    ;
    node && !(node instanceof Document);
    node =
      node.nodeType === Node.ATTRIBUTE_NODE
        ? node.ownerElement
        : node.parentNode
  ) {
    comp = comps[comps.length] = {};

    /*eslint default-case: "error"*/
    switch (node.nodeType) {
      case Node.TEXT_NODE:
        comp.name = "text()";
        break;
      case Node.ATTRIBUTE_NODE:
        comp.name = "@" + node.nodeName;
        break;
      case Node.PROCESSING_INSTRUCTION_NODE:
        comp.name = "processing-instruction()";
        break;
      case Node.COMMENT_NODE:
        comp.name = "comment()";
        break;
      case Node.ELEMENT_NODE:
        comp.name = node.nodeName;
        break;
      // No Default
    }
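    comp.position = getPos(node);
  }

  // Components were collected leaf-first, so join them in reverse to build the path from the root.
  for (var i = comps.length - 1; i >= 0; i--) {
    comp = comps[i];
    xpath += "/" + comp.name;
    if (comp.position !== null) {
      xpath += "[" + comp.position + "]";
    }
  }

  return xpath;
}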
John Yepthomi