3

I want to select the heading element (whether h1, h2, h3, h4, h5 or h6) that is closest above the form using XPath in PHP.

<h2>Foo</h2>
<h3>Bar</h3>
<form>
    <input />
</form>

The example above should return the h3 (Bar) because it is closest to the form.

<h4>Kee</h4>
<form>
    <input />
</form>

This example on the other hand should return the h4 (Kee) because it is closest.

This query (from https://stackoverflow.com/a/2216795/4391251) works fine for just the h2 tags. I could modify it for h1, h3, h4, h5, etc. but I want a catch-all query.

$headings = $xpath->query('((//form)[2]/ancestor::*/h2[.!=""])[last()]');

Basically I want something like this

$headings = $xpath->query('((//form)['.$i.']/ancestor::*/[h2 or h3][.!=""])[last()]');

Except that it doesn't return any results, and neither does this one (based on https://stackoverflow.com/a/7995095/4391251):

$headings = $xpath->query('((//form)['.$i.']/ancestor::*/[self::h2 or self::h3][.!=""])[last()]');

What query will give the desired results?

Theo van der Zee

3 Answers

2

You can try something like this:

$xpath->query('//form['.$i.']/preceding-sibling::*[self::h2 or self::h3][1]')

Basically, the XPath selects the first preceding sibling of form[i] that is an <h2> or <h3> (to cover the other heading levels, list them in the predicate as well).
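The approach above can be sketched end to end as follows. This is a minimal example, assuming both snippets from the question sit in one document under a common parent. The helper name `closestHeadingBefore` and the wrapper `<div>` are made up for illustration, and the predicate lists all six heading levels.

```php
<?php
// Sample markup: the two snippets from the question under one parent.
$html = <<<HTML
<div>
<h2>Foo</h2>
<h3>Bar</h3>
<form><input /></form>
<h4>Kee</h4>
<form><input /></form>
</div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Hypothetical helper: returns the nearest heading that is a preceding
// sibling of the $i-th form in the document, or null if there is none.
function closestHeadingBefore(DOMXPath $xpath, int $i): ?DOMNode
{
    // preceding-sibling is a reverse axis, so [1] inside the step
    // means "nearest to the form", not "first in the document".
    $query = '(//form)[' . $i . ']/preceding-sibling::*'
           . '[self::h1 or self::h2 or self::h3'
           . ' or self::h4 or self::h5 or self::h6][1]';

    return $xpath->query($query)->item(0);
}

foreach ([1, 2] as $i) {
    $heading = closestHeadingBefore($xpath, $i);
    echo $i . ': ' . $heading->nodeName . ' ' . $heading->textContent . "\n";
}
```

Note that the `[1]` sits inside the location step. If the whole path is wrapped in parentheses before `[1]` is applied, positions are counted in document order instead, and the expression returns the farthest matching heading rather than the nearest one.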

har07
  • Excellent! I ended up using it like this: $headings = $xpath->query('((//form)['.$i.']/preceding-sibling::*[self::h2 or self::h3])[1]'); – Theo van der Zee Jun 11 '15 at 09:04
  • could you post sample HTML that reproduce the problem? – har07 Jun 11 '15 at 09:12
  • *AFAICS*, as explained in this answer, the xpath should return *only the first* preceding sibling of `form`, there is no chance it can return more than one element. – har07 Jun 11 '15 at 09:16
  • I had already erased the comment I had posted earlier, it was a mistake on my end. The query returns multiple headings (as expected) because there were multiple forms on the page (each with their own heading). The query works as supposed! – Theo van der Zee Jun 11 '15 at 09:34
1

Take the first h element before the form:

//form/preceding::*[starts-with(name(),'h')][1]
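One caveat, shown here as a rough sketch: `starts-with(name(),'h')` also matches non-heading elements whose names begin with "h", such as `hr` or `header`. The variant below keeps the `preceding::` axis, so the heading does not have to be a sibling of the form, but lists the six heading names explicitly. The sample markup and variable names are assumptions made for illustration.

```php
<?php
// Sample markup: the heading is not a sibling of the form,
// and an <hr> sits between the two.
$html = <<<HTML
<div><h2>Foo</h2></div>
<hr />
<div><form><input /></form></div>
HTML;

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Name-prefix test from the answer: any element whose name starts with "h".
$loose = $xpath->query(
    "(//form)[1]/preceding::*[starts-with(name(),'h')][1]"
)->item(0);

// Stricter test: only h1 to h6 qualify.
$strict = $xpath->query(
    '(//form)[1]/preceding::*'
    . '[self::h1 or self::h2 or self::h3'
    . ' or self::h4 or self::h5 or self::h6][1]'
)->item(0);

echo 'loose:  ' . $loose->nodeName . "\n";
echo 'strict: ' . $strict->nodeName . ' ' . $strict->textContent . "\n";
```

With this markup the name-prefix query picks up the `hr`, while the explicit list skips it and reaches the `h2`.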
splash58
-1
/html/body/*[starts-with(name(),'h')]
goto
awuu