How to get the index of a Nokogiri node in the HTML source?

Asked Jun 26 '17 at 18:28

Active Jun 27 '17 at 00:46

Viewed 180 times

Suppose my HTML looks like this:

html = '<HTML><BODY><a id="id1">test</a><a id="id2">test2</a></BODY></HTML>'

I extract the second link from the parsed document: node = doc.css("a#id2")[0]

How do I get the starting index of this node's HTML within the HTML source string? In this example it would be 32, so that:

html.slice(32, SOMETHING) == '<a id="id2">...'

Note: I know this is a trivial example, but the solution should also handle cases where the node I extract isn't unique in the HTML.

Henley

You can convert the doc to a String with .text and use this answer https://stackoverflow.com/a/3520277/2067375 to scan the string with a regular expression; you will get the starting position of each match. – Rada Bogdan Jun 26 '17 at 20:17
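A minimal sketch of the string-scanning idea, in plain Ruby so it runs without Nokogiri. The helper name `match_offsets` is hypothetical, and the snippet is written out by hand here. With Nokogiri you would probably pass something like `node.to_html`, but that only works under the assumption that the serialized node matches the original source character for character, which is not guaranteed (tag case, attribute quoting, and whitespace can differ).

```ruby
# Hypothetical helper (not part of Nokogiri): returns the start offset of
# every occurrence of `snippet` in `source`, in order of appearance.
def match_offsets(source, snippet)
  offsets = []
  pos = 0
  # String#index returns nil once there are no more occurrences.
  while (found = source.index(snippet, pos))
    offsets << found
    pos = found + 1 # step by one so overlapping occurrences are reported too
  end
  offsets
end

html = '<HTML><BODY><a id="id1">test</a><a id="id2">test2</a></BODY></HTML>'

# Assumption: this string is exactly how the node appears in the source.
snippet = '<a id="id2">test2</a>'

offsets = match_offsets(html, snippet)
start = offsets.first
puts start                               # 32
puts html.slice(start, snippet.length)   # <a id="id2">test2</a>

# When the node is not unique, every occurrence is listed, and the caller
# still has to decide which one corresponds to the extracted node.
p match_offsets('<p>x</p><p>x</p>', '<p>x</p>')  # [0, 8]
```

For the non-unique case, one possible way to pick the right offset is to work out the node's rank among the nodes with identical markup (for example, its position in the result of the same CSS query) and take the offset at that rank. This is a judgment call rather than a general solution, since it still depends on the serialization assumption above.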

0 Answers