0

I am using the following regular expression in JavaScript:

    comment_body_content = comment_body_content.replace(/(<span id="sc_start_commenttext-(\d+)"><\/span>)[^]*?(<span id="sc_end_commenttext-\2"><\/span>)/, "$1$3");

I want to find the tag <span id="sc_start_commenttext-330"></span> in my HTML code (the number is always different), together with the matching tag <span id="sc_end_commenttext-330"></span>. The text and HTML code between those two tags should be deleted, and everything else should be returned unchanged:

Before:

<span id="sc_start_commenttext-330"></span>
Some Text and some <u>html</u> blabla
<span id="sc_end_commenttext-330"></span>

Returned value of comment_body_content:

<span id="sc_start_commenttext-330"></span>
<span id="sc_end_commenttext-330"></span>

This expression works in all current browsers, but IE 8 reports a JavaScript error on the lines containing "(\d+)" and \2.

Is there a solution for all browsers?

Alex

user1711384

3 Answers

3

This will work.

.replace(/(<span id="sc_start_commenttext-(\d+)"><\/span>)[\s\S]*?(<span id="sc_end_commenttext-\2"><\/span>)/, "$1$3")

http://jsfiddle.net/4Rx96/5/
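
As a quick check, here is a minimal sketch that applies this kind of pattern to the sample from the question. It assumes the markup is available as a plain string and uses `[\s\S]` as the "any character" class; the variable names `pattern` and `cleaned` are only illustrative.

```javascript
// Sample input taken from the question; the line breaks are part of the string.
var comment_body_content =
  '<span id="sc_start_commenttext-330"></span>\n' +
  'Some Text and some <u>html</u> blabla\n' +
  '<span id="sc_end_commenttext-330"></span>';

// [\s\S] matches any character, including line breaks. \2 refers back to the
// digits captured by (\d+), so both markers must carry the same number.
var pattern = /(<span id="sc_start_commenttext-(\d+)"><\/span>)[\s\S]*?(<span id="sc_end_commenttext-\2"><\/span>)/;

// Keep only the two marker tags ($1 and $3) and drop everything between them.
var cleaned = comment_body_content.replace(pattern, "$1$3");
console.log(cleaned);
```

Note that the line breaks between the two markers are removed as well, because they are part of what `[\s\S]*?` matches.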

Bart
2

Just replace [^]*? in your regex with .*?

To handle line breaks as well, use [\s\S]*? instead.
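
The difference between the two can be seen in a small sketch. The test string here is made up for illustration; only the two patterns come from the answer.

```javascript
// A string with line breaks between "start" and "end".
var text = "start\nmiddle\nend";

// "." does not match line breaks, so this pattern cannot bridge the lines.
var withDot = /start.*?end/.test(text);

// [\s\S] means "whitespace or non-whitespace", which covers every character.
var withClass = /start[\s\S]*?end/.test(text);

console.log(withDot, withClass); // false true
```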

Toto
  • Hm, it's not working correctly in current browsers. Not all characters between the start and end tags were deleted. These could also be \n etc... – user1711384 Apr 26 '13 at 12:03
  • @user1711384: What kind of character isn't deleted? – Toto Apr 26 '13 at 12:05
  • It seems that it only works when there is no line break in the text. When there is a line break, nothing is deleted. Using `[^]*?` works with all characters, but not in IE 8 – user1711384 Apr 26 '13 at 12:17
  • Good news: it works in current browsers and there are no errors in IE8. Bad news: not all characters are deleted in IE8. Maybe this is the reason: for a new line, current browsers generate one tag and IE 8 generates a different one... – user1711384 Apr 26 '13 at 12:31
  • @user1711384: Really strange! `[\s\S]*` stands for any character, so it will match either kind of tag. – Toto Apr 26 '13 at 12:47
  • Maybe it's a bug in IE that people have to live with. Thank you! – user1711384 Apr 26 '13 at 13:11
  • Yep, good job ;) This doesn't help with a solution, does it? http://stackoverflow.com/questions/4679271/javascript-regex-substitution-newline-handling-in-browsers – user1711384 Apr 26 '13 at 13:35
0

It is not recommended to process HTML with regular expressions.

This is likely more useful (I'm using jQuery).

We have ways to find both start and end if necessary, but the HTML you provided will be handled by this:


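// Collect the text that follows each start marker, keyed by comment number.
var comments = {};
$("span[id^='sc_start_commenttext-']").each(function() {
   var idx = this.id.split("-")[1];
   var next = this.nextSibling; // text node right after the marker, if any
   comments[idx] = next ? next.nodeValue : null;
});
window.console && console.log(comments["330"]);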
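
Since the question asks for the content between the markers to be deleted rather than read, the same DOM-based idea might be extended as in the following sketch. It is not part of the original answer: `clearBetweenMarkers` is a hypothetical helper, it uses plain DOM calls instead of jQuery, and it assumes the start and end markers are siblings under the same parent.

```javascript
// Hypothetical helper: remove every sibling node that sits between a start
// marker and its end marker. Assumes both markers share the same parent.
function clearBetweenMarkers(startNode, endNode) {
  var node = startNode.nextSibling;
  while (node && node !== endNode) {
    var next = node.nextSibling; // remember the follower before detaching
    node.parentNode.removeChild(node);
    node = next;
  }
}

// Browser usage with the ids from the question; skipped where no DOM exists.
if (typeof document !== "undefined") {
  var start = document.getElementById("sc_start_commenttext-330");
  var end = document.getElementById("sc_end_commenttext-330");
  if (start && end) {
    clearBetweenMarkers(start, end);
  }
}
```

Working on nodes rather than on the HTML string avoids depending on how each browser serializes line breaks, at the cost of needing the markup to be in the document already.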
mplungjan