-3

I'm writing a (dynamic) RegEx parser for plain HTML to extract values. To show the current RegEx result, I want to show the HTML code highlighted with the current result.

I have the HTML code in a variable and my RegEx match function returns the current result and the index of the result.

Therefore, I inject a <span class="highlight"> at the position of the match and close it with </span> after the length of the match.

My current problem is: If I set the element with element.text(newtext), my injected HTML is not parsed.

On the other hand, if I set the element with element.html(newtext), the full HTML is parsed and possibly broken.

If I first escape the HTML tags so they are not parsed, the index returned by RegEx.match on the raw HTML no longer matches the position in the escaped HTML (escaping lengthens the string, e.g. < becomes &lt;), so the injection lands in the wrong place.

Is there a nice way to treat HTML code as plain text while still injecting the highlighting HTML at the right place using an index?

Example

Think of a full HTML source file. This is a one-line snippet from my weather station:

    <td bgcolor="#EDEFEF"><input name="inHumi" disabled="disabled" type="text" value="63" /></td>

The user enters a RegEx matching value=", therefore value=" should be highlighted in the HTML string.

Sample code

The JS code looks like this. Assume all variables are user-defined in an HTML form: the user enters the RegEx and the HTML source code as text, and the RegEx match should be visible in that form.

    var rawhtml = '<td bgcolor="#EDEFEF"><input name="inHumi" disabled="disabled" type="text" value="63" /></td>';
    var regex = 'value=\"';
    var result = rawhtml.match(regex);
    var result_string = result[0];
    var result_index = result.index;
    // Wrap the matched substring in a highlight span
    $("#visiblehtml").html(
        rawhtml.substring(0, result_index) +
        "<span class='highlight'>" +
        rawhtml.substring(result_index, result_index + result_string.length) +
        "</span>" +
        rawhtml.substring(result_index + result_string.length));

Inserting this span in the middle of the input tag breaks the HTML. Also, setting the element with .html parses and renders the full rawhtml instead of showing the rawhtml source.

  • 5
    Please post a [MCVE] with your JS and HTML – ecg8 Aug 07 '18 at 22:33
  • 1
    The "and possibly broken" portion of this question could use some additional explanation, yes; it's also not clear why you'd be wanting to inject html at an index point rather than at a specific DOM node, or why you're trying to escape the html instead of render it including the added spans. (If the problem is that your regex sometimes matches parts of the HTML tags or attributes, you'll want to restrict it to just working on the text nodes instead of the whole document.) Can you please put together a simplified example demonstrating where the trouble is? – Daniel Beck Aug 08 '18 at 00:12

2 Answers

0

Instead of inserting the spans straight away, use the indexes to split the string into an array, escape each piece, and then join the pieces with the span tags around the match. Something like:

"<p>highlight me</p>"

["<p>", "highlight me", "</p>"]

[ "&lt;p&gt;", "highlight me", "&lt;/p&gt;"]

"&lt;p&gt;<span class="highlight">highlight me</span>&lt;/p&gt;"
Ben West
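The split, escape, and join steps from the answer above could be sketched as follows. This is only a minimal illustration, not the answerer's own code: escapeHtml and highlightFirstMatch are hypothetical helper names, and only the first match is highlighted.

```javascript
// Hypothetical helper (not from the answer): escape the characters
// that would otherwise be interpreted as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Split the raw source around the first match, escape each piece
// separately, then join the pieces with the highlight markup between them.
// Because the indexes are applied before escaping, they stay valid.
function highlightFirstMatch(rawHtml, pattern) {
  var match = rawHtml.match(pattern);
  if (match === null) {
    return escapeHtml(rawHtml); // nothing to highlight
  }
  var start = match.index;
  var end = start + match[0].length;
  return escapeHtml(rawHtml.substring(0, start)) +
    '<span class="highlight">' +
    escapeHtml(rawHtml.substring(start, end)) +
    '</span>' +
    escapeHtml(rawHtml.substring(end));
}

var out = highlightFirstMatch("<p>highlight me</p>", "highlight me");
// In the page from the question, the result would then be set with:
// $("#visiblehtml").html(out);
```

Since every piece is escaped before the span is added, the only markup the browser parses is the highlight span itself.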
0

Here I answer my own question with my solution. I was thinking in a much too complicated way and now have a pragmatic approach.

  • First, I inject my own unique tokens into the raw HTML.
  • Then I encode the HTML (here I use the htmlEncode function from https://stackoverflow.com/a/14346506/3466839).
  • Next, I replace my tokens with the real highlight markup.
  • Finally, I set the result into the element with .html.

    var rawhtml = '<td bgcolor="#EDEFEF"><input name="inHumi" disabled="disabled" type="text" value="63" /></td>';
    var regex = 'value=\"';
    var result = rawhtml.match(regex);
    var result_string = result[0];
    var result_index = result.index;
    // 1. Mark the match with unique tokens in the raw HTML
    var visiblehtml = rawhtml.substring(0, result_index) +
        "***highlight*" +
        rawhtml.substring(result_index, result_index + result_string.length) +
        "*highlight***" +
        rawhtml.substring(result_index + result_string.length);
    // 2. Encode everything so the source is shown as text
    visiblehtml = htmlEncode(visiblehtml);
    // 3. Swap the tokens for the real highlight markup
    visiblehtml = visiblehtml.replace('***highlight*', '<span class="highlight">');
    visiblehtml = visiblehtml.replace('*highlight***', '</span>');
    $("#visiblehtml").html(visiblehtml);
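One caveat with this approach is that a token such as ***highlight* could also occur in the pasted source. A possible variation, sketched here without jQuery, uses control characters as tokens instead. The names escapeHtml, OPEN_TOKEN, CLOSE_TOKEN and highlightWithTokens are hypothetical, escapeHtml stands in for the linked htmlEncode function, and the sketch assumes the characters U+0001 and U+0002 never appear in the input.

```javascript
// Hypothetical stand-in for the linked htmlEncode function.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Assumption: these control characters never occur in the pasted source,
// and escapeHtml leaves them untouched.
var OPEN_TOKEN = "\u0001";
var CLOSE_TOKEN = "\u0002";

function highlightWithTokens(rawHtml, pattern) {
  var match = rawHtml.match(pattern);
  if (match === null) {
    return escapeHtml(rawHtml); // nothing to highlight
  }
  var start = match.index;
  var end = start + match[0].length;
  // 1. Mark the match with tokens in the raw source
  var marked = rawHtml.substring(0, start) + OPEN_TOKEN +
    rawHtml.substring(start, end) + CLOSE_TOKEN +
    rawHtml.substring(end);
  // 2. Escape everything, 3. swap the tokens for the real markup
  return escapeHtml(marked)
    .replace(OPEN_TOKEN, '<span class="highlight">')
    .replace(CLOSE_TOKEN, '</span>');
}

var rawhtml = '<td bgcolor="#EDEFEF"><input name="inHumi" disabled="disabled" type="text" value="63" /></td>';
var shown = highlightWithTokens(rawhtml, 'value="');
// shown can then be passed to $("#visiblehtml").html(shown);
```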

I'm sorry that my question was not good and was downvoted. If you leave comments about what I did wrong, I'll do better next time.

Thanks for your engagement helping so many people!