
Here is my example string...

<span>&nbsp;</span><span class="citation_text" id="_148511159">Rawls, Wilson.&nbsp;
<i>Where the Red Fern Grows: The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.</span>

I want to remove everything between < and >, but keep the <i> and </i> tags. The closest I have gotten is this piece of code:

string.replace(/<.[^i]+?>/g,"")

However, it returns this:

&nbsp;<span class="citation_text" id="_148511159">Rawls, Wilson.&nbsp;
<i>Where the Red Fern Grows: The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.

How do I get it to remove the remaining span tag as well?

Thanks!

UPDATE: This is what I would like the output to be.

&nbsp;Rawls, Wilson.&nbsp; <i>Where the Red Fern Grows: 
The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.
Tyler Bell

1 Answer


Note: As others said, you shouldn't use regex to parse HTML.
But if you really want a regex, here is one that removes tags except <i> ones.

Regex

/<\/?(?!i>)\w+.*?>/g

This expression will match both opening and closing tags.

See the example below.

Example

var str = '<span>&nbsp;</span><span class="citation_text" id="_148511159">Rawls, Wilson.&nbsp; <i>Where the Red Fern Grows: The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.</span>';

var result = str.replace(/<\/?(?!i>)\w+.*?>/g, '');

console.log(result);

Explanation

  • <\/? matches the opening < and an optional slash (for closing tags).
  • (?!i>) is a negative lookahead: it prevents the match if the next characters are i>.
    This excludes the <i> and </i> tags.
  • \w+ matches the tag name (for example, span).
  • .*?> lazily matches any characters that follow the tag name (possibly none), up to the closing >.
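
To see the pieces of the pattern working together, here is a small sketch. The sample strings are made up for illustration and are not from the question. The second pattern, attrAwarePattern, is an assumed variant that is not part of the answer; it is only there to show one limitation of the original pattern, namely that an <i> tag carrying attributes is not protected.

```javascript
// The pattern from the answer, unchanged.
var tagPattern = /<\/?(?!i>)\w+.*?>/g;

// Hypothetical sample input (not taken from the question).
var sample = '<span class="a">x</span><i>y</i><b>z</b>';

// match() with the g flag lists every tag the pattern would remove.
var matched = sample.match(tagPattern);
console.log(matched);
// → the four tags: <span class="a">, </span>, <b>, </b>

var stripped = sample.replace(tagPattern, '');
console.log(stripped);
// → x<i>y</i>z

// Limitation: the lookahead only rejects the exact text "i>",
// so an opening <i> tag with attributes is still removed.
var withAttr = '<i class="t">y</i>';
var strippedAttr = withAttr.replace(tagPattern, '');
console.log(strippedAttr);
// → y</i>

// Assumed variant (not from the answer): \b rejects any tag whose
// name is exactly "i", and [^>]* also lets a tag span a line break.
var attrAwarePattern = /<\/?(?!i\b)\w+[^>]*>/g;
var keptAttr = withAttr.replace(attrAwarePattern, '');
console.log(keptAttr);
// → <i class="t">y</i>
```

If the input can contain <i> tags with attributes, or tags broken across lines, the variant may be a better fit. Both patterns remain rough tools, for the reason given in the note at the top of this answer.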
Niitaku