I'm trying to delete an element from some HTML in C#. If an HTML element contains the text [Store Logo], I want to remove it. Example:

<img src="http://src.sencha.io/300/80/http://images.company.com/[Store Logo]" />

Since it contains [Store Logo], I'd like to delete the whole image tag. I tried using a regex, but I find it hard to combine all the symbols correctly, and I've read that regex shouldn't be used to parse HTML. What is the best way to remove it?

proseidon

2 Answers

You can use the Html Agility Pack.

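Here's an example adapted from the one on their examples page that finds all the links in a page. This version selects the img elements instead and removes any whose src contains [Store Logo]:

 HtmlWeb hw = new HtmlWeb();
 HtmlDocument doc = hw.Load(/* url */);
 // SelectNodes returns null when nothing matches, so check before looping.
 HtmlNodeCollection images = doc.DocumentNode.SelectNodes("//img[@src]");
 if (images != null)
 {
    foreach (HtmlNode img in images)
    {
       // Remove the whole <img> tag when its src contains the marker text.
       if (img.GetAttributeValue("src", "").Contains("[Store Logo]"))
          img.Remove();
    }
 }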
Roar

Use HtmlAgilityPack. It's a library for parsing HTML that allows you to access the DOM and modify it.
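
As a rough illustration of what "access the DOM and modify it" can look like when the HTML is already in a string, here is a minimal sketch. It assumes the HtmlAgilityPack NuGet package is installed; the class name LogoCleaner, the helper RemoveImagesContaining, and the sample markup are made up for this example and are not part of the library.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is referenced

static class LogoCleaner
{
    // Hypothetical helper: returns the HTML with every <img> whose src
    // attribute contains `marker` removed.
    public static string RemoveImagesContaining(string html, string marker)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var images = doc.DocumentNode.SelectNodes("//img[@src]");
        if (images == null)
            return html;

        // Iterate over a copy so removing nodes cannot disturb the enumeration.
        foreach (var img in images.ToList())
        {
            if (img.GetAttributeValue("src", "").Contains(marker))
                img.Remove(); // drops the whole tag from its parent
        }

        return doc.DocumentNode.OuterHtml;
    }

    static void Main()
    {
        // Sample markup invented for this sketch.
        string html = "<p>Welcome</p>" +
                      "<img src=\"http://images.company.com/[Store Logo]\" />";

        Console.WriteLine(RemoveImagesContaining(html, "[Store Logo]"));
    }
}
```

Filtering on the attribute value in C# rather than inside the XPath expression avoids having to escape quotes if the marker text ever contains them.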

System Down