
Using import.io, given the following snippet, after successfully extracting the name and time columns, how might one extract the nearest preceding .heading element as a third column using XPath?

...

<div class="row-fluid">
    <div class="heading">HBO</div>
</div>
<div class="row-fluid">
    <div class="name">Silicon Valley</div>
    <div class="time">9pm</div>
</div>
<div class="row-fluid">
    <div class="name">The Wire</div>
    <div class="time">10pm</div>
</div>
...
<hr>

<div class="row-fluid">
    <div class="heading">ABC</div>
</div>
<div class="row-fluid">
    <div class="name">Lost</div>
    <div class="time">9pm</div>
</div>
<div class="row-fluid">
    <div class="name">Heroes</div>
    <div class="time">10pm</div>
</div>
...
<hr>

...
Abel
gpmcadam
  • Nearest preceding or following `.heading` element? – Wiktor Stribiżew Sep 07 '15 at 20:27
  • @stribizhev The nearest element that comes before the matched data with a class of "heading". – gpmcadam Sep 07 '15 at 20:28
  • I removed your regex (hope you can forgive my candor). [Never ever use regular expressions to process (X)HTML](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454). Besides, even if it _had an answer_, it would be on conflicting subjects, which would be better suited to different questions. – Abel Sep 07 '15 at 21:25
  • Try `//div/div[@class='name']/parent::div/preceding-sibling::div/div[@class='heading']/text()`. Tested at http://www.xpathtester.com/xpath. Please check. – Wiktor Stribiżew Sep 07 '15 at 21:36
  • I was just going to post `./parent::div/preceding-sibling::div/div[@class='heading']/text()` but I guess you have got a more expanded answer. – Wiktor Stribiżew Sep 07 '15 at 21:47
  • @stribizhev, using `parent::x/preceding-sibling::y/z` syntax works here but is subtly different from `preceding::z`. If the structure of the document is strictly as given above, there is not much difference, but I'd probably prefer the shorter `preceding::z`. Also, it is good practice that you add `[1]`; though XSLT 1.0 will select the first node, it is not so if you apply templates and in XSLT 2.0 it will select all nodes. Using `[1]` makes the intention clearer ;). – Abel Sep 07 '15 at 22:18

1 Answer


The nearest element that comes before the matched data with a class of "heading".

The nearest preceding element of a given element can be found with the `preceding` axis in XPath. Suppose we have the expression `div/div[@class='name'][. = 'Heroes']`, which selects the last name in your example. The nearest preceding heading would then be:

./preceding::div[@class = 'heading'][1]

Here `.` is the context node. If the expression is already evaluated relative to a genuine context node, you can remove the `./`; otherwise, replace `.` with the rest of the expression that you already have.

Since `preceding` is a reverse axis, positions count backwards from the context node, so `[1]` picks the nearest match. Note that the preceding axis does not select ancestors or the context node itself.
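
To make clearer what `preceding::div[@class = 'heading'][1]` returns for each row, here is a rough sketch in Python. It does not run the XPath itself: the standard library's `xml.etree.ElementTree` has no `preceding` axis, so the sketch walks the elements in document order and remembers the last heading seen, which gives the same result for markup shaped like yours. The `<root>` wrapper, the `SAMPLE` string and the `rows_with_heading` function are made up for illustration, and it assumes every `name` is followed by its `time` in the same row.

```python
import xml.etree.ElementTree as ET

# Markup from the question, wrapped in a hypothetical <root> element
# (and with <hr/> self-closed) so that it parses as well-formed XML.
SAMPLE = """
<root>
  <div class="row-fluid"><div class="heading">HBO</div></div>
  <div class="row-fluid"><div class="name">Silicon Valley</div><div class="time">9pm</div></div>
  <div class="row-fluid"><div class="name">The Wire</div><div class="time">10pm</div></div>
  <hr/>
  <div class="row-fluid"><div class="heading">ABC</div></div>
  <div class="row-fluid"><div class="name">Lost</div><div class="time">9pm</div></div>
  <div class="row-fluid"><div class="name">Heroes</div><div class="time">10pm</div></div>
  <hr/>
</root>
"""


def rows_with_heading(xml_text):
    """Return (name, time, heading) tuples, one per name/time row."""
    root = ET.fromstring(xml_text)
    rows = []
    heading = None  # most recent heading seen so far in document order
    name = None     # most recent name seen, waiting for its time
    # iter() visits elements in document order, so the last heading seen
    # before a row is that row's nearest preceding heading.
    for el in root.iter("div"):
        cls = el.get("class")
        if cls == "heading":
            heading = el.text
        elif cls == "name":
            name = el.text
        elif cls == "time":
            rows.append((name, el.text, heading))
    return rows


rows = rows_with_heading(SAMPLE)
for row in rows:
    print(row)
```

Each row ends up paired with the heading that comes closest before it, which is what the `[1]` on the reverse axis is for.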

Abel