I need to convert snippets of text that contain HTML tags into plain text using JavaScript / Node.js.

I currently use the string.js library for that, but the problem is that when it removes the tags (using the strip_tags() function), it also removes the line breaks between elements.

E.g.

   <div>Some text</div><div>another text</div>

becomes

   Some textanother text

Do you know how I could get rid of this problem? Maybe another library?

Thanks!

Aerodynamika

2 Answers

Try using Cheerio. It exposes a jQuery-like interface on the server side. Then it's just:

var $ = require('cheerio').load(htmlstring);

Then just traverse the DOM for whatever elements you want and call $(element).text();

Alex Hill

Here is a very simple solution to your problem. It uses a regular expression, so you can adapt it to do whatever you need.

In this case we remove all tags except br tags. If you want, you can then replace the br tags with something else, such as \n or \t.

I hope this helps.

Cheers!

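var html = "<div>Some text</div><div>another text</div><br />test<div>10</div>";
// Remove every tag except <br>. The lookahead (?!br\b) skips <br>, <br/> and <br />,
// whereas a character class like [^>!br] would also skip any tag containing "b" or "r".
var removeHtmlTags = html.replace(/<(?!br\b)[^>]+>/ig, "");
console.log(removeHtmlTags);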
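
For the div-only sample in the question, the same regex idea can be extended so that block-level tags become line breaks before the remaining tags are stripped. The following is only a rough sketch: htmlToText is a hypothetical helper name (it is not part of string.js or Cheerio), the list of block-level tags is an assumption, and HTML entities such as &amp; are not decoded. For arbitrary or malformed HTML, a real parser is likely the safer choice.

```javascript
// Hypothetical helper, not part of any library mentioned in this thread.
function htmlToText(html) {
  return html
    // <br>, <br/> and <br /> become a newline.
    .replace(/<br\s*\/?>/gi, "\n")
    // Opening and closing tags of some common block-level elements
    // (an assumed, incomplete list) also become a newline.
    .replace(/<\/?(?:div|p|li|tr|h[1-6])\b[^>]*>/gi, "\n")
    // Strip every remaining tag.
    .replace(/<[^>]+>/g, "")
    // Collapse runs of newlines, then trim leading/trailing whitespace.
    .replace(/\n{2,}/g, "\n")
    .trim();
}

console.log(htmlToText("<div>Some text</div><div>another text</div>"));
```

With the sample from the question, the two div contents end up on separate lines instead of being run together.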
    Look at the OP's specific HTML. It's not even about `<br>` tags. This is far, far too simplistic a solution to do what the OP wants and doesn't even work on the sample HTML the OP provides.
    – jfriend00 Nov 03 '14 at 21:25