Allowing new line characters in javascript str.replace

Question

This question is similar to "Allowing new line characters in javascript regex", but the /m solution suggested there does not work with str.replace. You can test the code below at this page

 <p id="demo"><i>I need to TRIM the italics here, 

  despite this line.</i>
 </p>

 <button onclick="myFunction()">Try it</button>

 <script>
 function myFunction()
 {
 var str=document.getElementById("demo").innerHTML; 
 var n=str.replace(/^(\s*)<i>(.+)<\/i>(\s*)$/m,"$1$2$3"); //tested also /s
 alert(str)
 document.getElementById("demo").innerHTML=n;
 }
 </script>

score 1 · Accepted Answer · edited May 23 '17 at 12:05

This answer is mostly to give you some insight into why your current approach does not work, and how you generally solve it.

The reason m doesn't help is that the other answer is wrong. This is not what m does. m simply makes the anchors match line beginnings and endings in addition to the string beginning and ending. Some regex flavors have s for what you want to accomplish, but ECMAScript did not when this was written (an s flag, dotAll, was only added in ES2018). The simplest thing (and general solution) is to replace . (which matches everything except line breaks) with [\s\S] (which matches whitespace and non-whitespace, i.e. everything).
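
To make the difference concrete, here is a minimal sketch. The sample string and the name stripOuterItalics are my own assumptions for illustration, not code from the question:

```javascript
// Hypothetical sample: an <i> element whose text spans several lines.
const sample = "<i>I need to TRIM the italics here,\n\n despite this line.</i>";

// The dot stops at line breaks, with or without the m flag.
// m only lets ^ and $ match at line boundaries; it does not change the dot.
const dotOnly = /^<i>(.+)<\/i>$/.test(sample);
const dotWithM = /^<i>(.+)<\/i>$/m.test(sample);

// [\s\S] matches any character, so the pattern can span the line breaks.
function stripOuterItalics(str) {
  return str.replace(/^(\s*)<i>([\s\S]+)<\/i>(\s*)$/, "$1$2$3");
}

console.log(dotOnly, dotWithM);          // false false
console.log(stripOuterItalics(sample));  // the text without the <i> wrapper
```

Neither dot-based pattern matches, while the [\s\S] version returns the text with the surrounding whitespace preserved.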

However, Casimir's approach is better in your case, as it avoids some other problems like greediness. Of course, as Casimir said, if there are tags in between the opening and closing <i> tags, then the approach will not work. In that case, something like ([\s\S]+?) might be an option, but that's still not the full solution, in case you have nested i-tags or attributes in the opening tag, or capitalized I-tags and whatnot.
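
A rough sketch of the greediness problem and of where the lazy version still falls short. The input strings here are made up for illustration:

```javascript
// Hypothetical inputs, not taken from the question.
const twoTags = "<i>first</i> and <i>second</i>";
const nested = "<i>a <i>b</i> c</i>";

// Greedy: [\s\S]+ runs to the LAST </i>, swallowing the markup in between.
const greedy = twoTags.replace(/<i>([\s\S]+)<\/i>/, "$1");
// -> "first</i> and <i>second"

// Lazy: [\s\S]+? stops at the FIRST </i>; the g flag handles every pair.
const lazy = twoTags.replace(/<i>([\s\S]+?)<\/i>/g, "$1");
// -> "first and second"

// Nested tags still break it: the outer <i> is paired with the inner </i>.
const broken = nested.replace(/<i>([\s\S]+?)<\/i>/g, "$1");
// -> "a <i>b c</i>"

console.log(greedy, "|", lazy, "|", broken);
```

An opening tag with attributes (for example `<i class="x">`) or an uppercase `<I>` would not match these patterns at all, which is the other half of the caveat above.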

All in all, using regex to parse HTML is wrong! You should really use DOM manipulation, especially since you are using JavaScript - THE language for DOM manipulation. What you should really do is traverse the DOM for all i tags in your demo element, and replace them with their inner HTML.
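
One possible way to do that, as a sketch only: the function name unwrapItalics is hypothetical, and it moves each i element's child nodes up one level rather than rewriting innerHTML strings, which has the same visible effect. It relies only on the standard querySelectorAll, insertBefore and removeChild methods.

```javascript
// Replace every <i> element under `root` with its own child nodes.
// `unwrapItalics` is an illustrative name, not an existing API.
function unwrapItalics(root) {
  // querySelectorAll returns a static list, so it is safe to modify
  // the tree while walking it. Array.from gives a plain array.
  const italics = Array.from(root.querySelectorAll("i"));
  for (const el of italics) {
    const parent = el.parentNode;
    // Move the children (text, <b>, nested <i>, ...) in front of the <i>.
    while (el.firstChild) {
      parent.insertBefore(el.firstChild, el);
    }
    // The <i> is now empty and can be dropped.
    parent.removeChild(el);
  }
}

// In a browser, with the markup from the question (assumes id="demo" exists):
if (typeof document !== "undefined") {
  const demo = document.getElementById("demo");
  if (demo) unwrapItalics(demo);
}
```

Because this works on nodes instead of text, line breaks, nested i tags, attributes and upper-case tags need no special handling.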

Thanks! Using `/^(\s*)<i>([\s\S]+)<\/i>(\s*)$/` it works! About "parse HTML is wrong": the example is only an illustration, and parsing by regex is recommended WHEN working only with an HTML fragment and a task with a "regex face" (see more answers and discussion at your link). – Peter Krauss, Jun 07 '13 at 04:04
@PeterKrauss Sure, you should just be aware that you run into problems once there could be nesting or more complicated versions of your tags (not even speaking of HTML comments). – Martin Ender, Jun 07 '13 at 10:07

Casimir et Hippolyte · Answer 2 · 2013-06-07T04:40:32.027

0

One way to avoid problems with newlines is to not use the dot. For example:

var n=str.replace(/<i>([^<]+)<\/i>/,"$1");

I have replaced the dot with [^<] (anything that is not a <, which includes newlines).

The m modifier is not needed here, and you don't need to capture the whitespace either.

Note that my solution assumes that you don't have any < between <i> and </i>.

Otherwise, when you have nested tags for example, you can use this trick to avoid a lazy quantifier:

var n=str.replace(/<i>((?:[^<]+|<+(?!\/i>))+)<\/i>/,"$1");
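
As a quick check of the two patterns, here is a sketch on made-up inputs. The function names and sample strings are assumptions for illustration, and the second pattern is written with its groups balanced as ((?:[^<]+|<+(?!\/i>))+):

```javascript
// Pattern 1: no < allowed between <i> and </i>.
function stripSimple(str) {
  return str.replace(/<i>([^<]+)<\/i>/, "$1");
}

// Pattern 2: a < is allowed as long as it does not start "</i>".
function stripTolerant(str) {
  return str.replace(/<i>((?:[^<]+|<+(?!\/i>))+)<\/i>/, "$1");
}

// Hypothetical inputs.
const plain = "<i>line one,\n\n line two.</i>";
const withBold = "<i>some <b>bold</b> text</i>";

console.log(stripSimple(plain));      // newlines are fine: [^<] matches them
console.log(stripSimple(withBold));   // unchanged: the <b> blocks the match
console.log(stripTolerant(withBold)); // "some <b>bold</b> text"
```

Note that the second pattern still pairs an outer <i> with the first </i> it finds, so nested i tags remain out of reach.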

edited Jun 07 '13 at 04:40

answered Jun 06 '13 at 23:52

Hmm... Good solution for my "illustrative string", but it is not a solution for a "generic string" that has, e.g., bold tags in it. – Peter Krauss Jun 07 '13 at 04:06
@PeterKrauss: see my edit. An example must be an example, not an illustration. – Casimir et Hippolyte Jun 07 '13 at 04:42
