I'm not sure whether "lowest common ancestor" is the right term. I'd expect this problem to be quite common, but I couldn't find a solution online.

I have the structure below:

<div> <!-- A -->
  <div> <!-- B -->
    <div> <!-- C: I need to select this element -->
      <div>
        <div>
          <div>
            random string
            <div>
              <div>
                SOMETHING
              </div>
              <div>
                SOMETHING
              </div>
            </div>
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
    </div>
    <div>
      random string
    </div>
  </div>
  <div>
    random string
  </div>
</div>

My goal is to select the lowest (innermost) element, in this case div C, that contains every descendant containing the string "SOMETHING".

The closest I got was this XPath: //*[contains(text(),"SOMETHING")]/ancestor::*. However, it returns every ancestor of the elements that contain "SOMETHING": it does return div C, but it also returns other elements, and I only want div C.

The solution doesn't have to use XPath, but vanilla JavaScript is preferable. It also doesn't have to be very efficient. Thanks in advance.

Damzaky

3 Answers

Select all matching text nodes, collect the ancestors of each one, and keep the innermost ancestor that is shared by all of them.

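// Collect every text node under document.body.
function nativeTreeWalker() {
    const walker = document.createTreeWalker(
        document.body,
        NodeFilter.SHOW_TEXT
    );

    const textNodes = [];
    let node;

    while ((node = walker.nextNode())) {
        textNodes.push(node);
    }
    return textNodes;
}

// The backslash keeps this script's own source from matching when the script
// itself sits inside document.body; '\I' still evaluates to 'I'.
const nodes = nativeTreeWalker()
  .filter(textNode => textNode.textContent.includes('SOMETH\ING'));

// Build the set of a node's ancestors (the node itself included), innermost first.
const getAncestors = elm => {
  const set = new Set();
  while (elm) {
    set.add(elm);
    elm = elm.parentElement;
  }
  return set;
};
const ancestors = nodes.map(getAncestors);

// Sets keep insertion order, so the first entry shared by all is the innermost one.
const innermostExistingInAll = [...ancestors[0]].find(
  possibleParent => ancestors.every(set => set.has(possibleParent))
);
console.log(innermostExistingInAll);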
<div> <!-- A -->
  <div> <!-- B -->
    <div id="c"> <!-- C: I need to select this element -->
      <div>
        <div>
          <div>
            random string
            <div>
              <div>
                SOMETHING
              </div>
              <div>
                SOMETHING
              </div>
            </div>
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
    </div>
    <div>
      random string
    </div>
  </div>
  <div>
    random string
  </div>
</div>
CertainPerformance
  • This does seem to answer my question, though it feels complicated for such a simple problem; I'm not sure how to simplify it myself. I'll upvote for now and accept it later if no better solution comes up. Thanks – Damzaky Oct 26 '22 at 04:28
  • Most of it is boilerplate - there are just some things (selecting text nodes, selecting all ancestors) that there aren't built-in methods for, that you have to implement yourself. Take that away, and you're left with 3 lines of code if you inline the array callbacks (depends on your linting preferences) - which is close to the best you can get, I'd think – CertainPerformance Oct 26 '22 at 04:38
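
As a rough illustration of how small the core logic is once the boilerplate is set aside, here is a minimal sketch of the ancestor-intersection idea wrapped in a function. The name lowestCommonAncestor is hypothetical (it is not a built-in), and the sketch assumes only that each node exposes a parentElement property, as DOM nodes do. Unlike the snippet above, it starts from each node's parent, so it never returns a text node.

```javascript
// Hypothetical helper, not a DOM API: returns the deepest element that is an
// ancestor of every node in `nodes`, or null if there is none.
function lowestCommonAncestor(nodes) {
  if (nodes.length === 0) return null;

  // Ancestors of one node, innermost first.
  const ancestorsOf = node => {
    const chain = [];
    for (let el = node.parentElement; el; el = el.parentElement) {
      chain.push(el);
    }
    return chain;
  };

  const [first, ...rest] = nodes.map(ancestorsOf);
  const restSets = rest.map(chain => new Set(chain));

  // The first ancestor of the first node that every other node shares
  // is the innermost common one.
  return first.find(el => restSets.every(set => set.has(el))) || null;
}
```

With the nodes array from the answer, the call would be lowestCommonAncestor(nodes).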

XPath 3.1 can express it declaratively:

let $text-nodes := //text()[contains(., 'SOMETHING')]
return innermost(//*[every $text in $text-nodes satisfies descendant::text() intersect $text])

XPath 3.1 is supported in the browser through the SaxonJS library from Saxonica, documented at https://www.saxonica.com/saxon-js/documentation2/index.html.

Example use

const htmlSnippet = `<div> <!-- A -->
  <div> <!-- B -->
    <div> <!-- C: I need to select this element -->
      <div>
        <div>
          <div>
            random string
            <div>
              <div>
                SOMETHING
              </div>
              <div>
                SOMETHING
              </div>
            </div>
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
    </div>
    <div>
      random string
    </div>
  </div>
  <div>
    random string
  </div>
</div>`;

var searchText = 'SOMETHING';

const htmlDoc = new DOMParser().parseFromString(htmlSnippet, 'text/html');

const xpathResult = SaxonJS.XPath.evaluate(
  `let $text-nodes := //text()[contains(., $search-text)]
return innermost(//*[every $text in $text-nodes satisfies descendant::text() intersect $text])`, 
  htmlDoc, 
  { params : { 'search-text' : searchText } }
);

console.log(xpathResult);
<script src="https://martin-honnen.github.io/Saxon-JS-2.5/SaxonJS2.rt.js"></script>
Martin Honnen

You could traverse downwards from the outermost element until you reach an element that has more than one child containing the text:

let commonAnc;
let hasTxt = [document.getElementById('mainContainer')]; // or whatever outermost element you want

// Descend while exactly one child still contains the text.
while (hasTxt.length === 1) {
  commonAnc = hasTxt[0];
  hasTxt = [];

  for (const child of commonAnc.children) {
    if (child.textContent.includes("SOMETHING")) {
      hasTxt.push(child);
    }
  }
}

console.log(commonAnc);
<div id="mainContainer"> 
<div id="a"> <!-- A -->
  <div id="b"> <!-- B -->
    <div id="c"> <!-- C: I need to select this element -->
      <div id="d">
        <div>
          <div>
            random string
            <div>
              <div>
                SOMETHING
              </div>
              <div>
                SOMETHING
              </div>
            </div>
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
      <div>
        <div>
          <div>
            SOMETHING
          </div>
        </div>
      </div>
    </div>
    <div>
      random string
    </div>
  </div>
  <div>
    random string
  </div>
</div>
</div>

Feels rather inefficient, but I think it's the simplest way, considering that there seems to be no built-in method for getting the closest common ancestor...
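
The same top-down descent could also be packaged as a function, which makes it easier to reuse with a different root or search string. This is only a sketch: the name descendToCommonAncestor is made up for illustration, and it assumes each element exposes children and textContent as DOM elements do. Like the loop above, it assumes the matching text sits in leaf elements; if an element holds the text directly and also has exactly one matching child, it will descend one level too far.

```javascript
// Hypothetical helper, not a DOM API: walk down from `root` for as long as
// exactly one child still contains `text`, then return where the walk stopped.
function descendToCommonAncestor(root, text) {
  let current = root;
  for (;;) {
    const matching = Array.from(current.children)
      .filter(child => child.textContent.includes(text));

    // Zero matches: the text lives directly in `current`.
    // Two or more: `current` is where the matches branch apart.
    if (matching.length !== 1) return current;

    current = matching[0];
  }
}
```

For the snippet above, the call would be descendToCommonAncestor(document.getElementById('mainContainer'), 'SOMETHING').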

Driftr95