Removing a portion of HTML within an HTML page

Question

I am trying to remove some tags, together with their content, while loading a page, so that those tags are not sent.

I was doing this with a string search, but that does not work well for larger data sets.

string startTag = "<section>" + Environment.NewLine +
    "                <div id=\"nonPrintable123\">";

var startIndex = htmlString.IndexOf(startTag);
var html = htmlString.Substring(0, startIndex) + "</div></form>      </body></html>";

Is there a way to use Regex to remove or replace a whole child div with an empty string?

The data within <Section> {data} </Section> should be replaced with an empty string or suppressed in some other way.

Use [HtmlAgilityPack](https://html-agility-pack.net/). Regex is a bad choice for this: https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags — Icemanind, Jan 23 '19 at 18:18
[remove html node from htmldocument :HTMLAgilityPack](https://stackoverflow.com/q/12106280/7444103). -- [Remove](https://html-agility-pack.net/remove). — Jimi, Jan 23 '19 at 18:20
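
A minimal sketch of what the comments above suggest, assuming the HtmlAgilityPack NuGet package is referenced. The helper name RemoveNodes and the sample markup are made up for illustration; the XPath uses the div id from the question.

```csharp
using System;
using HtmlAgilityPack; // assumption: HtmlAgilityPack NuGet package is installed

class Program
{
    // Hypothetical helper: removes every node matching the XPath and returns the remaining markup.
    static string RemoveNodes(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes(xpath);
        if (nodes != null)
        {
            foreach (var node in nodes)
            {
                node.Remove(); // detaches the node and all of its children
            }
        }
        return doc.DocumentNode.OuterHtml;
    }

    static void Main()
    {
        // Sample markup, invented for this example.
        string html = "<html><body><section><div id=\"nonPrintable123\">hidden</div></section><p>keep</p></body></html>";
        Console.WriteLine(RemoveNodes(html, "//div[@id='nonPrintable123']"));
    }
}
```

Because the document is parsed rather than searched as text, this does not depend on the exact whitespace or line breaks between <section> and the div. To drop the whole section instead, an XPath such as "//section" should work the same way.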

score 0 · Answer 1 · answered Jan 23 '19 at 18:09

Using String.Replace has worked for me in the past.
https://learn.microsoft.com/en-us/dotnet/api/system.string.replace?view=netframework-4.7.2

startString = startString.Replace("<div>HTML you want to replace</div>", "")

Jbrown

score 0 · Answer 2 · answered Jan 23 '19 at 18:37

I did it with the following VB.NET code:

Private Sub removehtml()
    Dim str As String = " <div id=nonPrintable123> <!--#  Start --> hjhjhty iuh  hwjkednjkb dvhv xcaisfdchascjk bkasj df kh <!--End #-->"
    ' Find the start marker; stop if it is missing.
    Dim sindex As Integer = str.IndexOf("<!--#")
    If sindex < 0 Then Return
    ' Find the end marker that follows the start marker; stop if it is missing.
    Dim eindex As Integer = str.IndexOf("#-->", sindex)
    If eindex < 0 Then Return
    ' Cut everything from the start marker through the end marker ("#-->" is 4 characters long).
    str = str.Remove(sindex, (eindex - sindex) + 4)
End Sub

This way I removed all the unneeded data from the given string.
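
The same marker idea can be sketched in C# (the question's language) so that it handles more than one marked block and returns the result. The class and method names are hypothetical, and the sample input is invented. It assumes the "<!--#" and "#-->" markers from the code above and that marked blocks are not nested.

```csharp
using System;
using System.Text;

static class MarkerStripper
{
    // Hypothetical helper: removes every span that runs from startMarker through endMarker.
    public static string RemoveBetween(string input, string startMarker, string endMarker)
    {
        var result = new StringBuilder();
        int pos = 0;
        while (true)
        {
            int start = input.IndexOf(startMarker, pos, StringComparison.Ordinal);
            if (start < 0) break;

            // Look for the end marker only after the start marker.
            int end = input.IndexOf(endMarker, start + startMarker.Length, StringComparison.Ordinal);
            if (end < 0) break; // unterminated block: leave the rest untouched

            result.Append(input, pos, start - pos); // keep the text before the block
            pos = end + endMarker.Length;           // skip past the block
        }
        result.Append(input, pos, input.Length - pos); // keep the remaining tail
        return result.ToString();
    }

    static void Main()
    {
        string html = "a<!--# x #-->b<!--# y #-->c";
        Console.WriteLine(RemoveBetween(html, "<!--#", "#-->")); // prints "abc"
    }
}
```

This is still plain string matching, so it only works when the markers are guaranteed to be present in the page source; for removing elements by tag or id, an HTML parser is the safer route.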
