I would like to extract some information from documents with different structures.
I have the following document:
var getSORMARC = document.evaluate("//*[@id='marcview']/tbody/tr[contains(., '245')]/following-sibling::tr[contains(.,'_c')]/td[contains(.,'_c')]/following-sibling::td[1]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
if (getSORMARC.singleNodeValue !== null) {
var SORMARC = getSORMARC.singleNodeValue.innerText;
}
console.log(SORMARC);
<table id="marcview">
<tbody>
<tr>
<td>
<b>Title</b>
</td>
<td>245</td>
<td> </td>
<td>0</td>
<td>_a</td>
<td>Title of the document /</td>
</tr>
<tr>
<td>_c</td>
<td>Author no. 1</td>
</tr>
</tbody>
</table>
and this other document:
var getSORMARC = document.evaluate("//*[@id='marcview']/tbody/tr[contains(., '245')]/following-sibling::tr[contains(.,'_c')]/td[contains(.,'_c')]/following-sibling::td[1]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
if (getSORMARC.singleNodeValue !== null) {
var SORMARC = getSORMARC.singleNodeValue.innerText;
}
console.log(SORMARC);
<table id="marcview">
<tbody>
<tr>
<td>
<b>Title</b>
</td>
<td>245</td>
<td> </td>
<td>0</td>
<td>_a</td>
<td>Title of another document/</td>
</tr>
<tr>
<td>
<b>Publication</b>
</td>
<td>260</td>
<td> </td>
<td> </td>
<td>_c</td>
<td>1995</td>
</tr>
</tbody>
</table>
As you can see, I used this XPath selector for both these documents:
//*[@id='marcview']/tbody/tr[contains(., '245')]/following-sibling::tr[contains(.,'_c')]/td[contains(.,'_c')]/following-sibling::td[1]
The problem is that if the row containing "245" is not followed by a "_c" row of its own, the expression still matches the "_c" cell in a later row, here the <tr> containing "Publication", and gives me the text of that cell's sibling, which should not be the case.
The output I expect when the JavaScript code is run is: First document: Author no. 1. Second document: (nothing).
I only want to capture the text content if that _c belongs to the field whose row contains <td>245</td> (the one labelled Title), not to any other field.
I am at my wits' end on how to do this. I tried starting my XPath from _c, but I'm getting errors. Any idea how to approach my use case?
If it can be achieved with something other than document.evaluate(), I'm fine with that.
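To make the requirement concrete, here is a rough sketch of the behaviour I am after, written without XPath. The helper name findSubfield is made up for illustration, and it rests on an assumption read off the two sample tables only: a row that starts a field has more than two cells with the tag in the second cell, while a continuation row has exactly two cells (subfield code, then value). Real pages may not follow that layout.

```javascript
// Sketch only: findSubfield is a hypothetical helper, not part of any library.
// Assumed layout (taken from the two sample tables):
//   - a row that starts a field has more than two cells, tag in the second cell;
//   - a continuation row has exactly two cells: subfield code, value.
function findSubfield(table, tag, code) {
  let currentTag = null; // tag of the field the current row belongs to
  for (const row of Array.from(table.rows)) {
    const cells = Array.from(row.cells, (td) => td.textContent.trim());
    if (cells.length < 2) continue; // not a data row
    if (cells.length > 2) currentTag = cells[1]; // a new field starts here
    const subfield = cells[cells.length - 2];
    const value = cells[cells.length - 1];
    if (currentTag === tag && subfield === code) return value;
  }
  return null; // the requested field has no such subfield
}

// Browser usage; skipped when no DOM is available.
if (typeof document !== 'undefined') {
  const table = document.getElementById('marcview');
  if (table) console.log(findSubfield(table, '245', '_c'));
}
```

With the first table this should log Author no. 1, and with the second it should log null, because the _c there sits in the 260 row rather than under 245.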