JavaScript traversing based on exact order?

Question

I would like to extract some information from documents with different formats.

I have the following document:

var getSORMARC = document.evaluate("//*[@id='marcview']/tbody/tr[contains(., '245')]/following-sibling::tr[contains(.,'_c')]/td[contains(.,'_c')]/following-sibling::td[1]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
if (getSORMARC.singleNodeValue !== null) {
  var SORMARC = getSORMARC.singleNodeValue.innerText;
}
console.log(SORMARC);

<table id="marcview">
  <tbody>
    <tr>
      <td>
        <b>Title</b>
      </td>
      <td>245</td>
      <td>&nbsp;</td>
      <td>0</td>
      <td>_a</td>
      <td>Title of the document /</td>
    </tr>
    <tr>
      <td>_c</td>
      <td>Author no. 1</td>
    </tr>
  </tbody>
</table>

and this other document:

var getSORMARC = document.evaluate("//*[@id='marcview']/tbody/tr[contains(., '245')]/following-sibling::tr[contains(.,'_c')]/td[contains(.,'_c')]/following-sibling::td[1]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null);
if (getSORMARC.singleNodeValue !== null) {
  var SORMARC = getSORMARC.singleNodeValue.innerText;
}
console.log(SORMARC);

<table id="marcview">
  <tbody>
    <tr>
      <td>
        <b>Title</b>
      </td>
      <td>245</td>
      <td>&nbsp;</td>
      <td>0</td>
      <td>_a</td>
      <td>Title of another document/</td>
    </tr>
    <tr>
      <td>
        <b>Publication</b>
      </td>
      <td>260</td>
      <td>&nbsp;</td>
      <td>&nbsp;</td>
      <td>_c</td>
      <td>1995</td>
    </tr>
  </tbody>
</table>

As you can see, I used this XPath selector for both these documents:

//*[@id='marcview']/tbody/tr[contains(., '245')]/following-sibling::tr[contains(.,'_c')]/td[contains(.,'_c')]/following-sibling::td[1]

The problem is that when the "245" field has no "_c" subfield of its own, the expression still matches a "_c" cell in a later row (here the row labelled "Publication") and returns the text of the cell next to it, which should not happen.

What I want the JavaScript code to give me is the following: first document: Author no. 1; second document: (nothing).

I only want to capture the text content if that _c belongs to the field whose row contains <td>245</td> or <td>Title ...</td>, not to a later field.
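One possible way to express that constraint directly in XPath is sketched below. It rests on an assumption taken from the two sample tables, not from the real markup: only a row that starts a new field contains a <b> label cell, while a continuation row (like the "_c" row in the first table) does not. The names SOR_XPATH and getStatementOfResponsibility are hypothetical.

```javascript
// Assumption (from the sample tables only): a row that starts a new field
// has a <b> label cell; a continuation row has none.
// A row qualifies if it carries the 245 tag itself, or if it is a
// continuation row whose nearest preceding field-start row carries 245.
const SOR_XPATH =
  "//*[@id='marcview']/tbody/tr[" +
  "td[normalize-space()='245'] or " +
  "(not(td/b) and preceding-sibling::tr[td/b][1][td[normalize-space()='245']])" +
  "]/td[normalize-space()='_c']/following-sibling::td[1]";

// Numeric value of XPathResult.FIRST_ORDERED_NODE_TYPE.
const FIRST_ORDERED_NODE_TYPE = 9;

// Hypothetical helper: returns the trimmed text of the matched cell, or null.
function getStatementOfResponsibility(doc) {
  const result = doc.evaluate(SOR_XPATH, doc, null, FIRST_ORDERED_NODE_TYPE, null);
  const node = result.singleNodeValue;
  return node === null ? null : node.textContent.trim();
}
```

In a browser this would be called as getStatementOfResponsibility(document). With the two tables above it should return "Author no. 1" for the first and null for the second, because the "Publication" row has its own label and tag and so is not treated as part of the 245 field. Exact matching with normalize-space() replaces contains(), so a cell such as "1245" would not count as the 245 tag.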

I am at my wits' end on how to do it. I tried starting my XPath from _c, but I'm getting errors. Any idea how to approach my use case?

If it can be achieved other than using document.evaluate(), I'm fine with it.
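A rough sketch of one approach that avoids document.evaluate() is to walk the rows in order and remember which field each row belongs to. It makes several assumptions based only on the two sample tables: a field tag is a cell containing exactly three digits, subfield codes start with "_", each row holds at most one subfield, and a row without a tag continues the previous field. The function names tableToRows and findSubfield are hypothetical.

```javascript
// Browser-only glue: turn a <table> into an array of rows, each an array
// of trimmed cell texts (trim() also strips the &nbsp; cells).
function tableToRows(table) {
  return Array.from(table.rows, (tr) =>
    Array.from(tr.cells, (td) => td.textContent.trim())
  );
}

// Returns the value next to `code` if that subfield belongs to field `tag`,
// otherwise null.
function findSubfield(rows, tag, code) {
  let currentTag = null; // tag of the field the current row belongs to
  for (const cells of rows) {
    const codeIndex = cells.findIndex((text) => text.startsWith("_"));
    // Only cells before the subfield code can hold the field tag.
    const head = codeIndex === -1 ? cells : cells.slice(0, codeIndex);
    const rowTag = head.find((text) => /^\d{3}$/.test(text));
    if (rowTag !== undefined) currentTag = rowTag; // a new field starts here
    if (
      currentTag === tag &&
      codeIndex !== -1 &&
      cells[codeIndex] === code &&
      codeIndex + 1 < cells.length
    ) {
      return cells[codeIndex + 1];
    }
  }
  return null;
}
```

In the page it could be used as findSubfield(tableToRows(document.getElementById("marcview")), "245", "_c"). Because the current tag is updated whenever a row carries its own tag, a "_c" in the 260 row is never attributed to 245.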

Is this html provided or do you generate it ? Because it clearly lacks some attributes to qualify content (classes, ids, etc ) — Apolo, Jun 18 '19 at 07:39
btw I don't understand what you are trying to do. Maybe you could rephrase with "Objective" / "What I tried" / "Expected result" / "Actual result" kind of question ? — Apolo, Jun 18 '19 at 07:42
Have you tried https://stackoverflow.com/questions/3103962/converting-html-string-into-dom-elements — Ajeet Kumar, Jun 18 '19 at 07:42
@Apolo to be honest, I re-read the question about five times and I'm not sure I got it, either. But I *think*, OP wants to find the `tr` that contains `245` then the *following* `tr` that contains `_c` and the content of `td` after it. Maybe. I'm not good with XPath, so I might be wrong. It's worth clarifying because it might be possible to answer without using (or knowing) XPath. — VLAZ, Jun 18 '19 at 07:45
@VLAZ so you mean "Author no. 1" and "1995" in the two provided examples ? — Apolo, Jun 18 '19 at 07:47
@Apolo if my interpretation of the XPath is correct, then yes. Of course it hinges on my interpretation there. — VLAZ, Jun 18 '19 at 07:48
@VLAZ I think that it's what they currently have, but what they want is that the second snippet returns nothing. Tried to redesign their question a bit without changing the meaning (I hope), but I must admit I'm not 100% confident on what they are after either (my XPath skillz are not that great...). — Kaiido, Jun 18 '19 at 07:51
@Kaiido great, so we have three people in comments with basic XPath abilities and we aren't even sure we read the question correctly... — VLAZ, Jun 18 '19 at 07:53
@VLAZ good point, I won't try to answer unless OP makes an edit to explain better what he is looking for — Apolo, Jun 18 '19 at 07:56
Apolo, the html is auto-generated. I just deleted some attributes in the html to lessen/simplify the post. — schnydszch, Jun 18 '19 at 08:38
VLAZ, only Author no. 1. and Kaiido, you are correct with your edits. I added a sample of what I'm trying to achieve. Thanks! — schnydszch, Jun 18 '19 at 08:43
