This question is similar to "Allowing new line characters in javascript regex", but the /m solution given there does not work with str.replace. You can test the code below at this page.

 <p id="demo"><i>I need to TRIM the italics here, 

  despite this line.</i>
 </p>

 <button onclick="myFunction()">Try it</button>

 <script>
 function myFunction()
 {
 var str=document.getElementById("demo").innerHTML; 
 var n=str.replace(/^(\s*)<i>(.+)<\/i>(\s*)$/m,"$1$2$3"); //tested also /s
 alert(str)
 document.getElementById("demo").innerHTML=n;
 }
 </script>
Peter Krauss

2 Answers

This answer is mostly to give you some insight into why your current approach does not work, and how you generally solve it.

The reason m doesn't help is that the other answer is wrong: this is not what m does. m simply makes the anchors ^ and $ match at line beginnings and endings in addition to the beginning and end of the string. Some regex flavors have an s flag for what you want to accomplish; ECMAScript did not have one when this was written (an s "dotAll" flag was added in ES2018). The simplest thing, and a general solution that works in any version, is to replace . (which matches everything except line breaks) with [\s\S] (which matches whitespace and non-whitespace, i.e. everything).
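
To make the difference concrete, here is a minimal sketch that runs in Node or a browser console. The sample string `html` is made up for illustration and only mirrors the markup from the question.

```javascript
// Hypothetical sample: the <i> element spans a line break, as in the question.
const html = ' <i>I need to TRIM the italics here,\n\n  despite this line.</i>\n ';

// With /m, ^ and $ may match at line breaks, but "." still cannot cross them,
// so nothing matches and replace() returns the input unchanged.
const withM = html.replace(/^(\s*)<i>(.+)<\/i>(\s*)$/m, '$1$2$3');

// [\s\S] matches any character, line breaks included, so the whole element matches.
const withClass = html.replace(/^(\s*)<i>([\s\S]+)<\/i>(\s*)$/, '$1$2$3');

console.log(withM === html); // true: the /m version changed nothing
console.log(withClass);      // the text without the <i> wrapper
```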

However, Casimir's approach is better in your case, as it avoids some other problems like greediness. Of course, as Casimir said, if there are tags in between the opening and closing <i> tags, then the approach will not work. In that case, something like <i>([\s\S]+?)</i> might be an option, but that's still not the full solution, in case you have nested i-tags or attributes in the opening tag, or capitalized I-tags and whatnot.
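
The greediness problem is easy to see on a fragment with two separate elements. This is only a sketch; `twoItalics` is a made-up input.

```javascript
// Hypothetical fragment with two separate <i> elements.
const twoItalics = '<i>first</i> and\n<i>second</i>';

// Greedy: [\s\S]+ runs to the LAST </i>, so a single match swallows both
// elements and the inner "</i> ... <i>" is left behind in the output.
const greedy = twoItalics.replace(/<i>([\s\S]+)<\/i>/g, '$1');

// Lazy: [\s\S]+? stops at the FIRST </i>, so each element is unwrapped on its own.
const lazy = twoItalics.replace(/<i>([\s\S]+?)<\/i>/g, '$1');

console.log(JSON.stringify(greedy)); // "first</i> and\n<i>second"
console.log(JSON.stringify(lazy));   // "first and\nsecond"
```

The lazy version still pairs an outer opening tag with the first closing tag it finds, which is why nested i-tags remain out of reach for it.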

All in all, using regex to parse HTML is wrong! You should really use DOM manipulation, especially since you are using JavaScript, THE language for DOM manipulation. What you should really do is traverse the DOM for all <i> tags in your demo element and replace each of them with its inner HTML.
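
A minimal sketch of that DOM approach, assuming a browser that supports `Element.replaceWith`. The helper name `unwrapItalics` is hypothetical; the element id `demo` comes from the question.

```javascript
// Replace every <i> element under `root` with its own child nodes.
// `unwrapItalics` is a hypothetical helper name, not part of any API.
function unwrapItalics(root) {
  // querySelectorAll returns a static list, so removing elements
  // from the document while looping over it is safe.
  for (const el of root.querySelectorAll('i')) {
    // Spread the live childNodes list into arguments before the nodes are moved.
    el.replaceWith(...el.childNodes);
  }
}

// In the page from the question this would be called as:
// unwrapItalics(document.getElementById('demo'));
```

Because the browser has already parsed the markup, attributes on the opening tag and nested elements need no special handling here.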

Martin Ender
  • Thanks! Using `/^(\s*)([\s\S]+)<\/i>(\s*)$/` it works! About "parse HTML is wrong": the example is only an illustration, and parsing by regex is recommended WHEN using only an HTML fragment and a task with a "regex face" (see more answers and discussion at your link). – Peter Krauss Jun 07 '13 at 04:04
  • @PeterKrauss sure, you should just be aware that you run into problems once there could be nesting or more complicated versions of your tags (not even speaking of HTML comments) – Martin Ender Jun 07 '13 at 10:07

One way to avoid problems with newlines is to not use the dot at all, for example:

var n=str.replace(/<i>([^<]+)<\/i>/,"$1");

I have replaced the dot with [^<] (anything that is not a <, which includes newlines).

The m modifier is not needed here, and you don't need to capture the surrounding whitespace either.

Note that my solution assumes that you don't have any < between <i> and </i>.
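
A short sketch showing both the multi-line match and the limitation; the strings `sample` and `nested` are made up for illustration.

```javascript
// Hypothetical sample mirroring the question's markup.
const sample = '<i>I need to TRIM the italics here,\n\n  despite this line.</i>';

// [^<] is a negated class, so it matches line breaks like any other non-"<" character.
const trimmed = sample.replace(/<i>([^<]+)<\/i>/, '$1');

// The assumption breaks as soon as a "<" appears inside the element:
const nested = '<i>some <b>bold</b> text</i>';
const unchanged = nested.replace(/<i>([^<]+)<\/i>/, '$1');

console.log(trimmed);              // the text without the <i> wrapper
console.log(unchanged === nested); // true: no match, input returned as-is
```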

Otherwise, for example when you have nested tags, you can use this trick to avoid a lazy quantifier:

var n=str.replace(/<i>((?:[^<]+|<+(?!\/i>))+)<\/i>/,"$1");
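
A sketch of this trick in use. The pattern below is written with balanced parentheses, which is assumed to be the intended form, and `withInnerTag` is a made-up input.

```javascript
// Inside the capture group: runs of non-"<" characters, or "<" characters
// that do not start a closing "</i>".
const unwrapPattern = /<i>((?:[^<]+|<+(?!\/i>))+)<\/i>/;

// Hypothetical input with another tag and a line break inside the <i> element.
const withInnerTag = '<i>some <b>bold</b>\ntext</i>';
const result = withInnerTag.replace(unwrapPattern, '$1');

console.log(JSON.stringify(result)); // "some <b>bold</b>\ntext"
```

Like the lazy variant, this stops at the first </i>, so it does not cope with an <i> nested inside another <i>.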
Casimir et Hippolyte