How to remove selected special characters from a list

Question

I have a C# list with a lot of values like this:

<b>Moon</b>

and I want to remove <b> and </b>.

I want the result to look like this: Moon.

How can I remove this type of character from the list?

Your post appears to have been mangled by the formatting code, hard to tell what you started with... — fyjham, Nov 27 '09 at 14:04

score 5 · Accepted Answer · answered Nov 27 '09 at 14:10

You can use XDocument to remove the XML tags:

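using System.Xml.Linq;

// Parses the input as XML and returns the concatenated text of all nodes.
// Throws XmlException if the input is not well-formed XML with a single root.
string StripXmlTags(string xml)
{
    XDocument doc = XDocument.Parse(xml);
    return doc.Root.Value;
}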

Example:

[Test]
public void Test()
{
    string xml = "<root><b>nice </b><c>node</c><d><e> is here</e></d></root>";
    string result = StripXmlTags(xml);

    Assert.AreEqual("nice node is here", result);
}
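XDocument.Parse expects a well-formed document with exactly one root element, so plain text such as "Moon" or a value with two top-level tags would throw. One possible workaround, sketched below, wraps each value in a dummy root before parsing and falls back to the unchanged input when parsing fails. The class name `LenientXmlStripper`, the method name `StripXmlTagsLenient`, and the `root` wrapper element are assumptions made for illustration; they are not part of the answer above.

```csharp
using System.Xml;
using System.Xml.Linq;

public static class LenientXmlStripper
{
    // Hypothetical helper: wraps the fragment in a dummy <root> element so that
    // plain text and several top-level elements can still be parsed.
    public static string StripXmlTagsLenient(string fragment)
    {
        try
        {
            // XElement.Value is the concatenated text of all descendant nodes.
            return XElement.Parse("<root>" + fragment + "</root>").Value;
        }
        catch (XmlException)
        {
            // Not well-formed XML (for example an unclosed <br>):
            // return the input unchanged rather than failing.
            return fragment;
        }
    }
}
```

With this sketch, `StripXmlTagsLenient("<b>Moon</b>")` and `StripXmlTagsLenient("Moon")` both return "Moon", while a value like "a<br>b" comes back untouched because it is not well-formed XML.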

score 1 · Answer 2 · answered Nov 27 '09 at 14:04

Try this:

var moonHtml = "<b>Moon</b>";
var regex = new Regex("</?(.*)>", RegexOptions.IgnoreCase | RegexOptions.Multiline);
var moon = regex.Replace(moonHtml, string.Empty);

Sani Huttunen

12 secs faster, what a shame ;) – Elephantik Nov 27 '09 at 14:05

Why specify "zero or one /" `/?` when the slash would've been included in the dot that follows? Why specify ignore-case when there are no alphabetic characters? Best practice? Oh well. Your code is greedy. If there's a string like `abc <b>Moon</b> more text <b>more moon</b>` then you'll just end up with "abc ". – David Hedlund Nov 27 '09 at 14:08

DO NOT use regex to parse html - http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 – thecoop Nov 27 '09 at 14:29
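The greediness point raised in the comments can be reproduced with a short sketch. It runs three patterns from this thread against the same input: the greedy `</?(.*)>`, the lazy `<.*?>`, and the negated class `<[^>]+>`. The input string and the names `GreedyDemo`, `Greedy`, `Lazy`, and `NegatedClass` are assumptions chosen for illustration.

```csharp
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Assumed sample input with two tagged spans.
    public const string Input = "abc <b>Moon</b> more text <i>more moon</i>";

    // Greedy: ".*" extends to the LAST '>' in the string,
    // so everything from the first '<' onward is removed.
    public static string Greedy(string s)
    {
        return Regex.Replace(s, "</?(.*)>", "");
    }

    // Lazy: ".*?" stops at the first '>' after each '<'.
    public static string Lazy(string s)
    {
        return Regex.Replace(s, "<.*?>", "");
    }

    // Negated class: '<', then one or more characters that are not '>', then '>'.
    public static string NegatedClass(string s)
    {
        return Regex.Replace(s, "<[^>]+>", "");
    }
}
```

For this input, `Greedy` returns "abc " while `Lazy` and `NegatedClass` both return "abc Moon more text more moon".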

score 0 · Answer 3 · answered Nov 27 '09 at 14:04

Try this:

Regex.Replace("<b>Moon</b>", @"\<.+?\>", "")

Elephantik

score 0 · Answer 4 · answered Nov 27 '09 at 14:06

string noHtml = Regex.Replace(inputWithHtmlTags, "<[^>]+>", "");
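Since the question is about a list rather than a single string, the same replacement could be applied to every item. The following is a minimal sketch using LINQ; the class name `ListTagStripper` and the method name `StripTags` are hypothetical, and the sketch returns a new list instead of modifying the original.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ListTagStripper
{
    // Constructed once and reused for every item in the list.
    private static readonly Regex TagPattern = new Regex("<[^>]+>");

    // Returns a new list in which each value has its tags removed.
    public static List<string> StripTags(IEnumerable<string> values)
    {
        return values
            .Select(v => TagPattern.Replace(v, string.Empty))
            .ToList();
    }
}
```

For example, `ListTagStripper.StripTags(new List<string> { "<b>Moon</b>", "Sun" })` yields a list containing "Moon" and "Sun".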

David Hedlund

score 0 · Answer 5 · answered Nov 27 '09 at 14:34

This program is a very crude illustration of a regex that removes all tags, so it also strips italic and underline tags. It uses the IgnoreCase option in case both <b> and <B> appear in the input, although the pattern contains no letters, so it matches either case regardless. Note that the Multiline option only changes how ^ and $ match; a tag split across lines would need RegexOptions.Singleline so that . matches newlines. The output from running this is "The Man on the Moon". I use .*? (lazy, zero or more characters) so that empty brackets such as <> are caught as well.

using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication3
{
    class Program
    {
        static void Main(string[] args)
        {
            var input = "<b>The</b> <i>Man</i> on the <U><B>Moon</B></U>";

            var regex = new Regex("<.*?>", RegexOptions.IgnoreCase | RegexOptions.Multiline);

            var output = regex.Replace(input, string.Empty);

            Console.WriteLine(output);
            Console.ReadLine();
        }
    }
}
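As a side note on the regex options used above: in .NET, RegexOptions.Multiline only changes where ^ and $ match, while RegexOptions.Singleline is the option that lets . match a newline. The sketch below compares the two on a tag split across lines. The input string and the names `MultilineDemo`, `WithMultiline`, and `WithSingleline` are assumptions for illustration.

```csharp
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Assumed input: an opening tag whose attribute sits on a second line.
    public const string Input = "<a\nhref=\"x\">Moon</a>";

    // Multiline affects only ^ and $; '.' still stops at '\n',
    // so the split opening tag is not matched and stays in the output.
    public static string WithMultiline(string s)
    {
        return Regex.Replace(s, "<.*?>", "", RegexOptions.Multiline);
    }

    // Singleline lets '.' match '\n', so the split tag is removed as well.
    public static string WithSingleline(string s)
    {
        return Regex.Replace(s, "<.*?>", "", RegexOptions.Singleline);
    }
}
```

On this input, `WithMultiline` removes only the closing tag and leaves the split opening tag in place, whereas `WithSingleline` returns "Moon". The negated-class pattern `<[^>]+>` handles the split tag without any option, because `[^>]` matches newlines.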
