I want to read the content of a remote web page in ASP.NET using C#. I currently fetch the page with the following code:

    protected void Page_Load(object sender, EventArgs e)
    {
        string TheUrl = "http://www.demosite.com/Default.aspx";
        string response = GetHtmlPage(TheUrl);
        Response.Write(response);
    }

    // Requires: using System.IO; using System.Net;
    static string GetHtmlPage(string strURL)
    {
        WebRequest objRequest = WebRequest.Create(strURL);
        // Dispose the response as well as the reader so the connection is released.
        using (WebResponse objResponse = objRequest.GetResponse())
        using (StreamReader sr = new StreamReader(objResponse.GetResponseStream()))
        {
            // The using blocks close the reader, so no explicit Close() is needed.
            return sr.ReadToEnd();
        }
    }

This gives me the whole content of the remote web page. Now I want to read the content tag by tag and get only the text inside each tag. Is that possible?

Any help is appreciated. Thanks in advance!

SHEKHAR SHETE
As per @atticae's suggestion to use the HTML Agility Pack to parse HTML, here's a simple example that may be useful: http://stackoverflow.com/a/10579599/122005 – chridam Oct 09 '12 at 11:44

1 Answer

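Use the HTML Agility Pack to traverse the elements. It builds a DOM from real-world (even malformed) HTML and lets you query it with XPath, which is far more reliable than picking the markup apart with string operations or regular expressions.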

You should be able to get all the text nodes with

doc.DocumentNode.SelectNodes("//text()[normalize-space(.) != '']")
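
To put that query in context, here is a minimal sketch of how it might be used to pair each piece of text with the tag that contains it. It assumes the HtmlAgilityPack NuGet package is referenced; the `TextExtractor` class and `GetTextByTag` method are hypothetical names chosen for illustration, and skipping `script`/`style` content is an assumption about what "only the content" should mean.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package, assumed to be installed

static class TextExtractor
{
    // Hypothetical helper: returns (parent tag name, text) pairs
    // for every non-blank text node in the given HTML string.
    public static List<KeyValuePair<string, string>> GetTextByTag(string html)
    {
        var results = new List<KeyValuePair<string, string>>();

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection textNodes =
            doc.DocumentNode.SelectNodes("//text()[normalize-space(.) != '']");
        if (textNodes == null)
        {
            return results;
        }

        foreach (HtmlNode node in textNodes)
        {
            string tag = node.ParentNode.Name;

            // Assumption: script and style bodies are not wanted as page text.
            if (tag == "script" || tag == "style")
            {
                continue;
            }

            // Decode entities such as &amp; and trim surrounding whitespace.
            string text = HtmlEntity.DeEntitize(node.InnerText).Trim();
            results.Add(new KeyValuePair<string, string>(tag, text));
        }

        return results;
    }
}
```

You could pass the string returned by `GetHtmlPage` from the question straight into `GetTextByTag` and loop over the result, writing out `pair.Key` (the tag) and `pair.Value` (its text). If you need the text of specific elements only, a narrower XPath such as `//p` or `//div[@id='content']` with `InnerText` may be simpler than filtering text nodes.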
magnattic