Currently, I am writing some Node.js code to remove all HTML tags from a string. This is one of the strings that I need to process:
"<b>Iconic powerful bass resonance of Bluedio:</b> 57mm ultra-large dynamic drivers, turbine style housing, with the iconic Bluedio surging low-frequency shock, let you feel the bass resonate deep in the chest, enjoying the best sound quality. Clear and transparent bass, mids and treble, fully exposed to all the details of song, you can hear what the artists really want you to hear, Coldplay or Linkin Park concert played in your ear<br>"
This is the code I am using for removing HTML:
html=html.replace(/<\w+>/,'').replace(/<\/\w+>/,'').trim();
This is the output:
"Iconic powerful bass resonance of Bluedio: 57mm ultra-large dynamic drivers, turbine style housing, with the iconic Bluedio surging low-frequency shock, let you feel the bass resonate deep in the chest, enjoying the best sound quality. Clear and transparent bass, mids and treble, fully exposed to all the details of song, you can hear what the artists really want you to hear, Coldplay or Linkin Park concert played in your ear<br>"
As you can see, the HTML is not completely removed from the string: the <br> at the end still remains.
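The same behavior shows up with a much shorter string, so it does not seem to depend on the length or content of the text. Below is a minimal reproduction; the sample string is a made-up, shortened stand-in with the same tag layout as my real input, and the names sample and stripped are only for illustration.

```javascript
// Hypothetical shortened sample: one <b>...</b> pair followed by a trailing <br>,
// mirroring the tag layout of the real product description.
const sample = '<b>Title:</b> body text<br>';

// The same two replace calls as in the code above.
const stripped = sample
  .replace(/<\w+>/, '')    // removes "<b>"
  .replace(/<\/\w+>/, '')  // removes "</b>"
  .trim();

console.log(stripped); // prints "Title: body text<br>"
```

Only the first opening tag and the first closing tag are removed; the second opening tag (<br>) is left in place.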
Why does this happen, and how can I fix it? I want to learn more about regular expressions, so please do not just give me a URL to a library. Thanks.