XPath with dittoed fields?

Question

In this document, if the second column is blank, it means "use the previous row's value".

<doc>
<table>
<tr><td>ASU</td><td>CS</td><td>3</td></tr>
<tr><td>ASU</td><td>English</td><td>3</td></tr>
<tr><td>ASU</td><td></td><td>4</td></tr>
<tr><td>ASU</td><td>French</td><td>3</td></tr>
</table>
<table>
<tr><td>CMU</td><td>CS</td><td>4</td></tr>
<tr><td>CMU</td><td>English</td><td>3</td></tr>
<tr><td>CMU</td><td>French</td><td>3</td></tr>
<tr><td>CMU</td><td></td><td>4</td></tr>
</table>
<table>
<tr><td>SDSU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td></td><td>4</td></tr>
<tr><td>SDSU</td><td></td><td>5</td></tr>
<tr><td>SDSU</td><td>French</td><td>4</td></tr>
</table>
</doc>

I want the rows where the second column is English, so these would be the rows:

<tr><td>ASU</td><td>English</td><td>3</td></tr>
<tr><td>ASU</td><td></td><td>4</td></tr>
<tr><td>CMU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td></td><td>4</td></tr>
<tr><td>SDSU</td><td></td><td>5</td></tr>

What would the XPath be for this?

score 2 · Answer 1 · answered Dec 10 '16 at 23:43

(This uses XPath 1.0; there may be better solutions with more recent XPath versions.)

First, you want trs, so that’s straightforward:

/doc/table/tr[...some predicate...]

The rows you want are either:

Those where the second td contains just “English”
```
td[2] = 'English'
```
Or those where the second td is empty...
```
td[2] = ''
```
and, looking at the preceding sibling rows which don’t have an empty second td...
```
preceding-sibling::tr[td[2] != '']
```
the first one ([1], which is the nearest such row, because preceding-sibling is a reverse axis) has a second td that contains “English”
```
/td[2] = 'English'
```

So combining all that, a query that gives you the desired rows is:

/doc/table/tr[td[2] = 'English'
  or (td[2] = ''
    and preceding-sibling::tr[td[2] != ''][1]/td[2] = 'English')]
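
As a cross-check on what the query should select, here is a minimal Python sketch that applies the same ditto rule in a plain loop. It does not evaluate the XPath itself, since the standard library's ElementTree supports only a limited XPath subset without the preceding-sibling axis. The function name `rows_matching` and the variable names are made up for this illustration, and it assumes every row has at least two cells.

```python
import xml.etree.ElementTree as ET

DOC = """<doc>
<table>
<tr><td>ASU</td><td>CS</td><td>3</td></tr>
<tr><td>ASU</td><td>English</td><td>3</td></tr>
<tr><td>ASU</td><td></td><td>4</td></tr>
<tr><td>ASU</td><td>French</td><td>3</td></tr>
</table>
<table>
<tr><td>CMU</td><td>CS</td><td>4</td></tr>
<tr><td>CMU</td><td>English</td><td>3</td></tr>
<tr><td>CMU</td><td>French</td><td>3</td></tr>
<tr><td>CMU</td><td></td><td>4</td></tr>
</table>
<table>
<tr><td>SDSU</td><td>English</td><td>3</td></tr>
<tr><td>SDSU</td><td></td><td>4</td></tr>
<tr><td>SDSU</td><td></td><td>5</td></tr>
<tr><td>SDSU</td><td>French</td><td>4</td></tr>
</table>
</doc>"""


def rows_matching(root, wanted):
    """Return the cell texts of rows whose effective second column is `wanted`.

    Hypothetical helper: a blank second cell inherits the nearest earlier
    non-blank value in the same table, mirroring the XPath predicate.
    """
    matches = []
    for table in root.findall("table"):
        last_value = None  # reset per table: siblings never cross tables
        for tr in table.findall("tr"):
            cells = [(td.text or "") for td in tr.findall("td")]
            if cells[1] != "":
                last_value = cells[1]  # remember the latest non-blank value
            if last_value == wanted:
                matches.append(cells)
    return matches


root = ET.fromstring(DOC)
for cells in rows_matching(root, "English"):
    print(cells)
```

With the sample document this should yield the six rows listed in the question. The reset of `last_value` at each table corresponds to preceding-sibling only ever seeing rows inside the same table.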

Thanks for your comment about my answer. I'll delete it. – Bill Bell, Dec 11 '16 at 17:50
