I want to parse a table that appears after clicking the search button and then filter some of the data. How can I do this?

The site is ruspo.ru

My code is:

HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("http://ruspo.ru/");
string responseData;
// Get the response once and dispose it together with the reader.
using (WebResponse response = webRequest.GetResponse())
using (StreamReader responseReader = new StreamReader(response.GetResponseStream()))
{
    responseData = responseReader.ReadToEnd();
}

MatchCollection m1 = Regex.Matches(responseData, @"(?<=<table class=""ui-widget ui-widget-content""[^>]*>).*?(?=</div>)", RegexOptions.Singleline);

foreach (Match m in m1)
{
    Response.Write(m.ToString());
    //txtPrice.Text = m.ToString();
    //ddlhotels.Text = m.ToString();
}

1 Answer

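Well, it's not as simple as reading the initial page. The search results are most likely returned in response to some kind of POST that is sent when you click the button, so you'll have to reverse engineer what the POST data looks like (for example by watching the request in your browser's developer tools), perform the same POST yourself, and then read the results from that response.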
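
As a rough sketch of what performing the POST yourself could look like: the helper below URL-encodes a set of form fields and sends them with `HttpWebRequest`. The class name `SearchClient`, the field names in the usage comment, and the target URL are all hypothetical; the real ones have to be taken from the request the site actually sends.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Text;

// Hypothetical helper, not part of any library.
public static class SearchClient
{
    // Encodes key/value pairs as an application/x-www-form-urlencoded body.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }

    // Sends the encoded form to the given URL and returns the response body.
    public static string PostForm(string url, string formBody)
    {
        byte[] payload = Encoding.UTF8.GetBytes(formBody);

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.ContentLength = payload.Length;

        using (Stream body = request.GetRequestStream())
        {
            body.Write(payload, 0, payload.Length);
        }

        using (WebResponse response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}

// Usage (field names and URL are assumptions, not the site's real ones):
//   var fields = new[] {
//       new KeyValuePair<string, string>("country", "Turkey"),
//       new KeyValuePair<string, string>("nights", "7")
//   };
//   string responseData = SearchClient.PostForm(
//       "http://ruspo.ru/", SearchClient.BuildFormBody(fields));
```

If the site also relies on cookies or hidden form fields, those would need to be captured from the first page and sent along as well.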

Also, using Regex to parse HTML is not recommended. You should use the HtmlAgilityPack instead: it parses the page into an actual DOM and lets you run XPath queries against the document structure.

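    // requires: using HtmlAgilityPack;
    var doc = new HtmlDocument();
    doc.LoadHtml(responseData);

    // SelectNodes returns null (not an empty collection) when nothing matches.
    var nodes = doc.DocumentNode.SelectNodes("//div");
    if (nodes != null)
    {
        foreach (HtmlNode node in nodes)
        {
            string title = node.InnerText.Trim();
            // etc.
        }
    }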
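
For the table itself, the same approach could be narrowed down to rows and cells. The following is only a sketch: `TableParser` and its methods are hypothetical names, the XPath in the usage comment assumes the result table carries the `ui-widget-content` class seen in the question's regex, and the column layout of the real results is unknown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical helper built on HtmlAgilityPack.
public static class TableParser
{
    // Returns the trimmed text of every <td> cell, one string[] per data row.
    public static List<string[]> ReadRows(string html, string tableXPath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        var rows = new List<string[]>();
        var rowNodes = doc.DocumentNode.SelectNodes(tableXPath + "//tr");
        if (rowNodes == null) return rows; // no matching table or rows

        foreach (HtmlNode row in rowNodes)
        {
            var cells = row.SelectNodes("./td");
            if (cells == null) continue; // e.g. header rows that only have <th>
            rows.Add(cells
                .Select(c => HtmlEntity.DeEntitize(c.InnerText).Trim())
                .ToArray());
        }
        return rows;
    }

    // Keeps only the rows where at least one cell contains the given text.
    public static List<string[]> WhereAnyCellContains(List<string[]> rows, string text)
    {
        return rows
            .Where(r => r.Any(cell =>
                cell.IndexOf(text, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();
    }
}

// Usage (the XPath and the filter text are assumptions):
//   var rows = TableParser.ReadRows(
//       responseData, "//table[contains(@class,'ui-widget-content')]");
//   var filtered = TableParser.WhereAnyCellContains(rows, "Hilton");
```

Filtering on the parsed cell values keeps the selection logic in ordinary C# rather than in the XPath or a regex, which tends to be easier to adjust once the real columns are known.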