JavaScript Removing text between < and >

Question

Here is my example string...

<span>&nbsp;</span><span class="citation_text" id="_148511159">Rawls, Wilson.&nbsp;
<i>Where the Red Fern Grows: The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.</span>

I want to remove all text within the < and >, but keep the <i> and </i> tags. The closest I have gotten is with this piece of code:

string.replace(/<.[^i]+?>/g,"")

However, it returns this:

&nbsp;<span class="citation_text" id="_148511159">Rawls, Wilson.&nbsp;
<i>Where the Red Fern Grows: The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.

How do I get it to remove the final span that is held in the < and >?

Thanks!

UPDATE: This is what I would like the output to be.

&nbsp;Rawls, Wilson.&nbsp; <i>Where the Red Fern Grows: 
The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.

Could you show the exact expected output that should be produced from your posted example string? — David Thomas, Jan 26 '17 at 15:41
Instead of aiming for a complicated regex, you could write a couple of lines of jQuery to `unwrap()` the `contents()` of your spans. Just saying. — Frédéric Hamidi, Jan 26 '17 at 15:42
[You. Can't. Parse. HTML. With. Regex](http://stackoverflow.com/a/1732454/1529630) — Oriol, Jan 26 '17 at 15:45
@DavidThomas : Thanks, I just added that to the post! Oriol : I am getting the innerHTML contents first as a string. — Tyler Bell, Jan 26 '17 at 17:37

score 1 · Accepted Answer · edited Jun 20 '20 at 09:12

Note: As others said, you shouldn't use regex to parse HTML.
But if you really want a regex, here is one that removes all tags except <i> ones.

Regex

/<\/?(?!i>)\w+.*?>/g

This expression will match both opening and closing tags.

You can look at the example below.

Example

var str = '<span>&nbsp;</span><span class="citation_text" id="_148511159">Rawls, Wilson.&nbsp; <i>Where the Red Fern Grows: The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.</span>';

var result = str.replace(/<\/?(?!i>)\w+.*?>/g, '');

console.log(result);

Explanation

<\/? matches the opening < of a tag and an optional slash (for closing tags).
(?!i>) prevents the match if the following characters are i>.
It will exclude <i> and </i> tags.
\w+ represents the tag name (for example span).
.*?> matches any characters that follow the tag name (or none), up to the > that closes the tag.
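
The same idea can be wrapped in a small helper if more than one tag needs to be kept. This is only a sketch, not part of the original answer: the name stripTagsExcept and its keep parameter are made up for illustration. It also differs from the regex above in two ways: it uses \b after the tag name, so a kept tag with attributes (such as <i class="x">) survives while <img> is still removed, and it uses [^>]* instead of .*? so a tag spanning several lines is still matched.

```javascript
// Hypothetical helper (name and signature are assumptions, not from the answer):
// remove every tag from `html` except the tag names listed in `keep`.
function stripTagsExcept(html, keep) {
  // Negative lookahead over the kept names. The \b makes "i" protect <i> and
  // <i class="x">, but not <img>, because there is no word boundary in "im".
  const lookahead = keep.length ? `(?!(?:${keep.join('|')})\\b)` : '';
  // <  optional slash  (not a kept name)  tag name  anything up to the closing >
  const pattern = new RegExp(`<\\/?${lookahead}\\w+[^>]*>`, 'g');
  return html.replace(pattern, '');
}

var str = '<span>&nbsp;</span><span class="citation_text" id="_148511159">Rawls, Wilson.&nbsp; <i>Where the Red Fern Grows: The Story of Two Dogs and a Boy</i>. Garden City, NY: Doubleday, 1961. Print.</span>';

console.log(stripTagsExcept(str, ['i']));
```

Like any regex-based approach, this assumes reasonably simple markup; a > inside an attribute value or an HTML comment would confuse it.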

Thanks, exactly what I was looking for! — Tyler Bell, Jan 26 '17 at 21:10
