5

I have a JS string like this:
&lt;div id="grouplogo_nav"&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;ul&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;li&gt;&lt;a class="group_hlfppt" target="_blank" href="http://www.hlfppt.org/"&gt;&amp;nbsp;&lt;/a&gt;&lt;/li&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;/ul&gt;<br>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;/div&gt;

I need to remove all <br> and &nbsp; that are only between &gt; and &lt;. I tried to write a regular expression, but couldn't get it right. Does anybody have a solution?

EDIT :

Please note I want to remove only the tags between &gt; and &lt;.

Nandakumar V
  • Be careful trying to parse HTML with javascript, it may be detrimental to your health: http://stackoverflow.com/a/1732454/36537 – Phil H Oct 11 '12 at 12:19

6 Answers

4

Avoid using regex on html!

Try creating a temporary div from the string, and using the DOM to remove any br tags from it. This is much more robust than parsing html with regex, which can be harmful to your health:

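var tempDiv = document.createElement('div');
tempDiv.innerHTML = mystringwithBRin;
// Only direct children are checked; nested <br> elements are left alone.
var nodes = tempDiv.childNodes;
for (var nodeId = nodes.length - 1; nodeId >= 0; --nodeId) {
    // tagName is upper-case for HTML elements, so compare against 'BR'.
    if (nodes[nodeId].tagName === 'BR') {
        tempDiv.removeChild(nodes[nodeId]);
    }
}
var newStr = tempDiv.innerHTML;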

Note that we iterate in reverse over the child nodes so that the indices of the nodes not yet visited remain valid after a child node is removed.

http://jsfiddle.net/fxfrt/
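
If no DOM is available, or the removal has to be limited to the gaps between &gt; and &lt; as the question asks, a narrow string replace may be enough, because the input here is escaped text rather than live markup. The following is only a sketch under some assumptions: the helper name stripBetweenTags is made up for illustration, and a gap is removed only when it consists entirely of <br> variants, &nbsp; and whitespace. An escaped &amp;nbsp; inside a tag's content is left untouched.

```javascript
// Hypothetical helper, not part of the original answer.
// Assumes the input is escaped HTML text in which tag boundaries
// appear as "&gt;" and "&lt;".
function stripBetweenTags(escapedHtml) {
    // "&gt;" followed by a run made up only of <br>, <br/>, &nbsp; or
    // whitespace, and then "&lt;" (checked with a lookahead so that the
    // "&lt;" itself is not consumed).
    var gap = /(&gt;)(?:<br\s*\/?>|&nbsp;|\s)+(?=&lt;)/gi;
    // Keep the "&gt;" and drop the run that follows it.
    return escapedHtml.replace(gap, '$1');
}

// Shortened version of the string from the question.
var sample = '&lt;div id="grouplogo_nav"&gt;<br>&nbsp;&nbsp; &lt;ul&gt;<br>&nbsp; ' +
    '&lt;li&gt;&lt;a href="#"&gt;&amp;nbsp;&lt;/a&gt;&lt;/li&gt;<br>&lt;/ul&gt;';
var cleaned = stripBetweenTags(sample);
```

A gap that contains real text, such as &gt;<br>Hello&lt;, does not match and is kept as it is, so this only covers the case where the editor plugin has padded the space between escaped tags.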

Phil H
  • I also don't like using regular expressions for HTML. But I am using a text editor plugin which converts all line breaks to `<br>` and spaces to `&nbsp;`, which destroys the content. This was the only workaround I found. – Nandakumar V Oct 12 '12 at 05:12