I am using a C# console app to load an XML document. Once the XmlDocument is loaded, I want to search for a link with a specific href attribute:

href="/abc/def"

inside the XML document.

Once that node is found, I want to strip the tag completely and just show Hello.

<a href="/abc/def">Hello</a>

I think I can simply find the tag using regex. But can anyone please tell me how I can remove the tag completely using regex?

NoviceMe
  • Using regex for this kind of stuff is a bad idea IMHO. If you are dealing with HTML, I would recommend using the HTML Agility Pack... – Yahia Mar 30 '12 at 19:38
  • Possible duplicate of [Using C# Regular expression to replace XML element content](http://stackoverflow.com/questions/448376/using-c-sharp-regular-expression-to-replace-xml-element-content) – Ken White Mar 30 '12 at 19:41
  • @KenWhite - That is a totally different question. I looked at it, but it is no help for my question. – NoviceMe Mar 30 '12 at 19:43
  • @Yahia - It is not HTML. I am loading the XML file into an XmlDocument and want to find that particular link and remove it. – NoviceMe Mar 30 '12 at 19:44
  • @NoviceMe, if you can show your XML, I guess you can get better answers. – L.B Mar 30 '12 at 19:48
  • @NoviceMe There is no link in XML; what you have shown is HTML... – Yahia Mar 30 '12 at 20:05

3 Answers

XML and HTML are much the same thing here: tagged content. XML is just stricter in its formatting. For this use case I would use transformations and XPath queries to rebuild the document. As @Yahia stated, regex on tagged documents is typically a bad idea: the regex needed for parsing is far too complex to be effective as a generic solution.
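To make the transformation idea concrete, here is a minimal sketch using XslCompiledTransform. It assumes an identity stylesheet plus one extra template that matches the link from the question and emits only its children. The class name LinkStripper and the method name Strip are hypothetical, chosen just for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper for illustration only.
public static class LinkStripper
{
    // Identity transform, plus a template that unwraps <a href="/abc/def">.
    const string Stylesheet = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""xml"" omit-xml-declaration=""yes""/>
  <xsl:template match=""@*|node()"">
    <xsl:copy><xsl:apply-templates select=""@*|node()""/></xsl:copy>
  </xsl:template>
  <xsl:template match=""a[@href='/abc/def']"">
    <xsl:apply-templates select=""node()""/>
  </xsl:template>
</xsl:stylesheet>";

    public static string Strip(string xml)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var writer = XmlWriter.Create(output, xslt.OutputSettings))
        {
            // Copies everything through, except the matched <a>, which is unwrapped.
            xslt.Transform(input, writer);
        }
        return output.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(LinkStripper.Strip("<EXAMPLE><a href='/abc/def'>Hello</a></EXAMPLE>"));
    }
}
```

The second template is more specific than the identity template, so it wins for that one element and everything else is copied unchanged. For a single link this is probably more machinery than needed, but it scales well if more rewrite rules get added later.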

Jason Meckley

The most popular technology for similar tasks is called XPath. (It is also a key component of XQuery and XSLT.) Would the following perhaps solve your task, too?

root.SelectSingleNode("//a[@href='/abc/def']").InnerText = "Hello";

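Note that the one-liner above only sets the text inside the element; the a tag itself stays in the document. If the goal is to drop the tag and keep just the text, one possible approach is to replace the element with a text node. This is a sketch, not the only way to do it; LinkTools and StripLink are made-up names, and it assumes the href value contains no apostrophe, since it is spliced directly into the XPath string.

```csharp
using System;
using System.Xml;

// Hypothetical helper for illustration only.
public static class LinkTools
{
    // Replaces the first <a> with the given href by a plain text node.
    // Assumes href contains no apostrophe (it is embedded in the XPath literal).
    public static void StripLink(XmlDocument doc, string href)
    {
        XmlNode link = doc.SelectSingleNode("//a[@href='" + href + "']");
        if (link == null)
        {
            return; // nothing to strip
        }

        XmlText text = doc.CreateTextNode(link.InnerText);
        link.ParentNode.ReplaceChild(text, link);
    }
}

public static class Program
{
    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<EXAMPLE><a href='/abc/def'>Hello</a></EXAMPLE>");

        LinkTools.StripLink(doc, "/abc/def");

        Console.WriteLine(doc.DocumentElement.OuterXml); // <EXAMPLE>Hello</EXAMPLE>
    }
}
```

ReplaceChild keeps the text at the same position the link had, and the null check avoids a NullReferenceException when no matching link exists.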
Jirka Hanika

You could try

using System.Xml;

string x = @"<?xml version='1.0'?>
<EXAMPLE>
    <a href='/abc/def'>Hello</a>
</EXAMPLE>";

XmlDocument doc = new XmlDocument();
doc.LoadXml(x);

// Find the <a> element whose href attribute matches.
XmlNode n = doc.SelectSingleNode("//a[@href='/abc/def']");
if (n != null)
{
    // Build a new <a> element without the href attribute, keeping the content.
    XmlNode newNode = doc.CreateElement("a");
    newNode.InnerXml = n.InnerXml;

    // Swap it in at the same position as the old node.
    n.ParentNode.ReplaceChild(newNode, n);
}

Not really sure if this is what you are trying to do, but it should be enough to get you headed in the right direction.
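If the document may contain several matching links, or links with nested markup, a variation is to unwrap every match: move each link's children up into its parent, then remove the now-empty element. The sketch below is one way this could look under those assumptions; LinkUnwrapper and UnwrapAll are hypothetical names, and the XPath expression is passed in by the caller.

```csharp
using System;
using System.Linq;
using System.Xml;

// Hypothetical helper for illustration only.
public static class LinkUnwrapper
{
    // Removes every element matched by xpath but keeps its child nodes in place.
    public static void UnwrapAll(XmlDocument doc, string xpath)
    {
        // Snapshot the matches first so the tree can be edited safely while looping.
        var matches = doc.SelectNodes(xpath).Cast<XmlNode>().ToList();

        foreach (XmlNode node in matches)
        {
            XmlNode parent = node.ParentNode;
            if (parent == null) continue;

            // InsertBefore moves an existing node, so each child leaves the link.
            while (node.FirstChild != null)
            {
                parent.InsertBefore(node.FirstChild, node);
            }
            parent.RemoveChild(node);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var doc = new XmlDocument();
        doc.LoadXml("<r><a href='/abc/def'>Hello <b>x</b></a><a href='/other'>Keep</a></r>");

        LinkUnwrapper.UnwrapAll(doc, "//a[@href='/abc/def']");

        Console.WriteLine(doc.DocumentElement.OuterXml);
        // <r>Hello <b>x</b><a href="/other">Keep</a></r>
    }
}
```

Unlike taking InnerText, this keeps nested elements such as the b element intact, and links with a different href are left alone.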

Jive Boogie