3

Is there a way to select only innermost divs (i.e. divs that do not contain other divs) in Jsoup?

To clarify: I am referring to divs only. That is, if a div contains elements that aren't divs but it doesn't contain any div, it is considered (for my case) an "innermost div".

BalusC
Regex Rookie

2 Answers

3

Jsoup works with CSS selectors, but what you want is not possible with a CSS selector, so that is out of the question. You'd need to examine every single div in a loop.

Elements divs = document.select("div");
Elements innerMostDivs = new Elements();

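for (Element div : divs) {
    // select() also matches the context element itself, so a size of 1
    // means this div contains no other div at any depth.
    if (div.select("div").size() == 1) {
        innerMostDivs.add(div);
    }
}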

// ...
BalusC
  • AFAIAC you are *the* expert on Jsoup. If you say there isn't a single expression in Jsoup to do this (as there is in [PHP](http://stackoverflow.com/questions/4010274/domxpath-select-the-innermost-divs)), then I should look no further. :) – Regex Rookie Aug 19 '11 at 21:08
  • You're welcome. Admittedly, I first tried `document.select("div:not(>div)")` to see whether Jsoup had a sneaky trick built in to make that work. Unfortunately, it doesn't. – BalusC Aug 19 '11 at 21:10
1

You can use a selector like div:not(:has(div)) -- i.e. "find divs that do not contain divs".

Elements innerMostDivs = doc.select("div:not(:has(div))");
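
As a cross-check on what that selector means, the same "no div descendant" condition can be written in XPath as //div[not(.//div)]. The sketch below is not Jsoup: it swaps in the JDK's built-in XML parser and XPath engine so that it runs without extra dependencies, and it assumes well-formed XHTML input. The class name InnermostDivs, the helper innermostDivIds, and the sample markup are hypothetical, chosen only for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class InnermostDivs {

    // Returns the id attribute of every div that has no div descendant
    // at any depth. Assumes the input is well-formed XML/XHTML.
    static List<String> innermostDivIds(String xhtml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));

        // ".//div" looks at all descendants, not only direct children.
        NodeList hits = (NodeList) XPathFactory.newInstance()
                .newXPath()
                .evaluate("//div[not(.//div)]", doc, XPathConstants.NODESET);

        List<String> ids = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            ids.add(((Element) hits.item(i)).getAttribute("id"));
        }
        return ids;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample: div "b" is nested in div "a" via ul/li.
        String xhtml =
              "<html><body>"
            + "<div id='a'><ul><li><div id='b'><p>text</p></div></li></ul></div>"
            + "<div id='c'><span>no nested div</span></div>"
            + "</body></html>";
        System.out.println(innermostDivIds(xhtml));
    }
}
```

With this sample, div a is excluded because div b sits inside it two levels down (under ul and li), while b and c qualify. A check that only looks at direct children would miss that nesting, which is why the condition has to cover all descendants.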
Jonathan Hedley