Replace Empty span tag to br tag using Regex

Question

Can anyone tell me the Regex pattern that checks for empty span tags and replaces them with a <br/> tag?

Something like the below:

string io = Regex.Replace(res,"" , RegexOptions.IgnoreCase);

I don't know what pattern to pass in!

Please note that [regex should not be used to parse HTML](http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454) — , Jan 31 '11 at 20:44

score 2 · Answer 1 · edited Nov 15 '12 at 13:50


Jeff Mercado's code has an error on these lines:

.Where(e => e.Name.Equals("span", StringComparison.OrdinalIgnoreCase) && n.Name.Equals("span", StringComparison.OrdinalIgnoreCase)

Error message: Member 'object.Equals(object, object)' cannot be accessed with an instance reference; qualify it with a type name instead

They still didn't work when I tried replacing them with other objects!
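The compiler error quoted above appears to come from `XName`: the type returned by `XElement.Name` has no `Equals(string, StringComparison)` overload, so the call resolves to the static `object.Equals(object, object)`. A minimal sketch of one way around it is to compare `Name.LocalName`, which is a plain string. The class and method names below are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XNameComparison
{
    // Counts descendant elements named "span", ignoring case.
    // LocalName is a string, so the StringComparison overload is available.
    public static int CountSpans(string xml)
    {
        var root = XElement.Parse(xml);
        return root.Descendants()
                   .Count(e => e.Name.LocalName.Equals("span", StringComparison.OrdinalIgnoreCase));
    }

    public static void Main()
    {
        // XML is case-sensitive, so <SPAN> is a different element name than <span>;
        // the case-insensitive comparison counts both.
        Console.WriteLine(CountSpans("<div><span/><SPAN>x</SPAN><p/></div>")); // prints 2
    }
}
```

The Html Agility Pack version is not affected, since `HtmlNode.Name` is already a string.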

edited Nov 15 '12 at 13:50

SliverNinja - MSFT


answered Nov 15 '12 at 13:31

Tri


Andreas Vendel · Accepted Answer · 2011-01-31T12:19:17.190


This pattern will find all empty span tags, such as <span/> and <span></span>:

<span\s*/>|<span>\s*</span>

So this code should replace all your empty span tags with br tags:

string io = Regex.Replace(res, @"<span\s*/>|<span>\s*</span>", "<br/>");
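As a hedged sketch, the one-liner above can be wrapped in a small helper that also applies `RegexOptions.IgnoreCase` (as in the question) and tolerates whitespace inside the tags. The class name `EmptySpanReplacer` and the slightly widened pattern are assumptions for illustration, not part of the original answer. Spans that carry attributes are deliberately left alone.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmptySpanReplacer
{
    // Matches <span/>, <span />, <span></span> and <span>   </span>, in any letter case.
    private static readonly Regex EmptySpan = new Regex(
        @"<span\s*/>|<span\s*>\s*</span\s*>",
        RegexOptions.IgnoreCase);

    // Returns a copy of html with every empty span replaced by <br/>.
    public static string Replace(string html)
    {
        return EmptySpan.Replace(html, "<br/>");
    }

    public static void Main()
    {
        // The last span has real content, so it is not replaced.
        Console.WriteLine(Replace("a<span></span>b<SPAN />c<span> x </span>"));
        // prints: a<br/>b<br/>c<span> x </span>
    }
}
```

Like any regex over markup, this also rewrites matches that occur inside script blocks or attribute strings.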

edited Jan 31 '11 at 12:19

answered Jan 31 '11 at 12:14

Andreas Vendel


@Andreas tx for the answer. I also want to check for a span whose content is only spaces, e.g. <span> </span>, and replace the spaces with &nbsp: say, if the content has two spaces, replace them with two &nbsp. – Malcolm Jan 31 '11 at 12:55
@Malcolm Try this: Regex.Replace(html, @"\s*", (match) => match.Value.Replace(" ", "&nbsp")) – Andreas Vendel Jan 31 '11 at 13:41
@Andreas didn't work. It doesn't get replaced :(. tx for your response – Malcolm Jan 31 '11 at 14:16
@Malcolm Did you capture the return value of Replace? It is supposed to be: html = Regex.Replace(html, @"\s*", (match) => match.Value.Replace(" ", "&nbsp")) – Andreas Vendel Jan 31 '11 at 15:17
@Andreas I did. It returns the string, and I got back the same string that I passed in. Do you have any idea how to get the content between the two span tags? http://stackoverflow.com/questions/4851721/is-there-anyway-to-get-the-content-of-span-tag-using-regex-in-c – Malcolm Jan 31 '11 at 15:26
Note that this will catch XML tags which are inside strings or javascript code. – Jan 31 '11 at 20:42
@Malcolm I tested the following code: "string html = Regex.Replace(@" testtest", @"\s*", (match) => match.Value.Replace(" ", "&nbsp"));". After it executed, html contained "&nbsp&nbsptesttest". Maybe it is a problem with case? Try adding RegexOptions.IgnoreCase to the replace call. Otherwise I don't know what the problem might be without looking at the code. You might consider starting a new question if you don't get it working. – Andreas Vendel Jan 31 '11 at 20:44
@Andreas it worked for me too. tx for all the help. Another thing: I don't know whether it makes much difference if we put &nbsp or &nbsp; – Malcolm Feb 01 '11 at 11:00
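The match-evaluator approach from the comments can be sketched as follows. The pattern in the comments is shown only as `\s*`, so this sketch assumes the intent was to match spans whose content is whitespace only (`<span>\s+</span>`). That pattern and the class name `SpanWhitespace` are assumptions. The entity is written `&nbsp;` with the trailing semicolon, which is the standard form.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpanWhitespace
{
    // Assumed pattern: a span containing nothing but whitespace.
    private static readonly Regex WhitespaceSpan = new Regex(
        @"<span>\s+</span>",
        RegexOptions.IgnoreCase);

    // Replaces each space inside a whitespace-only span with &nbsp; and keeps the span tags.
    public static string ToNbsp(string html)
    {
        // The tags themselves contain no spaces, so only the content is affected.
        return WhitespaceSpan.Replace(html, m => m.Value.Replace(" ", "&nbsp;"));
    }

    public static void Main()
    {
        Console.WriteLine(ToNbsp("<span>  </span>testtest"));
        // prints: <span>&nbsp;&nbsp;</span>testtest
    }
}
```

`Regex.Replace` returns a new string, so the result has to be assigned back, as noted in the comments.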

score 0 · Answer 3 · edited May 23 '17 at 11:55


My favourite answer to this problem is this one: RegEx match open tags except XHTML self-contained tags

edited May 23 '17 at 11:55

Community


answered Jan 31 '11 at 12:14

Jack Allan


Jeff Mercado · Answer 4 · 2011-01-31T21:23:25.350

You should parse it, search for the empty span elements, and replace them. Here's how you can do it using LINQ to XML. Just note that, depending on the actual HTML, it may require tweaks to get it to work, since it is an XML parser, not an HTML parser.

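// requires: using System; using System.Linq; using System.Xml.Linq;

// parse it
var doc = XElement.Parse(theHtml);

// find the target elements
var targets = doc.DescendantNodes()
                 .OfType<XElement>()
                 // XName has no Equals(string, StringComparison) overload, so compare LocalName
                 .Where(e => e.Name.LocalName.Equals("span", StringComparison.OrdinalIgnoreCase)
                          // IsEmpty is true only for <span/>; this also covers <span></span>
                          && !e.Nodes().Any()
                          && !e.HasAttributes)
                 .ToList(); // need a copy since the contents will change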

// replace them all
foreach (var span in targets)
    span.ReplaceWith(new XElement("br"));

// get back the html string
theHtml = doc.ToString();
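One subtlety worth checking in the LINQ to XML version: `XElement.IsEmpty` is true only for a self-closing element such as `<span/>`, while `<span></span>` reports false. If both forms should count as empty, testing for the absence of child nodes is one option. The sketch below illustrates the difference, and the helper name `HasNoContent` is hypothetical.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class EmptyElementCheck
{
    // True when the element has no child nodes at all (no text, no elements, no comments).
    public static bool HasNoContent(XElement e)
    {
        return !e.Nodes().Any();
    }

    public static void Main()
    {
        var selfClosing = XElement.Parse("<span/>");
        var openClose = XElement.Parse("<span></span>");

        Console.WriteLine(selfClosing.IsEmpty);        // True
        Console.WriteLine(openClose.IsEmpty);          // False
        Console.WriteLine(HasNoContent(selfClosing));  // True
        Console.WriteLine(HasNoContent(openClose));    // True
    }
}
```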

Otherwise, here's some code showing how you can use the HTML Agility Pack to do the same (written in a way that mirrors the other version).

// parse it
var doc = new HtmlDocument();
doc.LoadHtml(theHtml);

// find the target elements
var targets = doc.DocumentNode
                 .DescendantNodes()
                 .Where(n => n.NodeType == HtmlNodeType.Element
                          && n.Name.Equals("span", StringComparison.OrdinalIgnoreCase)
                          && !n.HasChildNodes && !n.HasAttributes)
                 .ToList(); // need a copy since the contents will change

// replace them all
foreach (var span in targets)
{
    var br = HtmlNode.CreateNode("<br />");
    span.ParentNode.ReplaceChild(br, span);
}

// get back the html string
using (StringWriter writer = new StringWriter())
{
    doc.Save(writer);
    theHtml = writer.ToString();
}

@yoda: Well, that's where the problem lies. It would require tweaks then. Otherwise, using an actual HTML parser (such as the one in the [HTML Agility Pack](http://htmlagilitypack.codeplex.com/)) will be better, though the code will be slightly different. — Jeff Mercado, Jan 31 '11 at 20:49
