For a personal project I'm working on, I want to pull traffic data from a website. The site shows the value in a cell of a table. Can my program simply connect to the site, load the page, and read the contents of that cell as a string? I'm mainly using C# on the .NET Framework.

  • Well, that's the question you asked, "Can I...". I answered that question. You seem to be interested in knowing if this is possible; it is. You can now start implementing a solution in full confidence that you won't later find out it's impossible. – Servy Oct 12 '12 at 18:44
  • What you want to do is called **screen scraping** or **web scraping** – Colonel Panic Oct 12 '12 at 18:45
  • The process you're looking for is called web scraping. There are many different ways to do this in many different programming languages. I found this pretty quickly and it might be of some help to you: http://stackoverflow.com/questions/4377355/i-need-a-powerful-web-scraper-library – sunnyrjuneja Oct 12 '12 at 18:46

3 Answers

This is an operation commonly known as "web scraping". You can do it manually using WebClient:

using System.Net;

string html;
using (WebClient client = new WebClient())
{
    // Download the page's raw HTML as a single string.
    html = client.DownloadString(@"http://somesite.com/somepage.html");
}

Then parse through the string to look for the data you want. This could be easy or very hard, depending on the complexity of the page you're scraping.

A better way is to use a web scraping library like HTML Agility Pack.
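
As a rough sketch of the library route, assuming the HtmlAgilityPack NuGet package is referenced: the class name `TableScraper`, the method `ExtractCell`, and the XPath you pass in are hypothetical and depend entirely on the markup of the page being scraped.

```csharp
using HtmlAgilityPack; // NuGet package "HtmlAgilityPack" (assumed to be installed)

public static class TableScraper
{
    // Hypothetical helper: returns the trimmed text of the first node
    // matching the XPath expression, or null if nothing matches.
    public static string ExtractCell(string html, string xpath)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parses even imperfect, real-world HTML

        HtmlNode cell = doc.DocumentNode.SelectSingleNode(xpath);
        if (cell == null)
        {
            return null;
        }

        // InnerText drops nested tags; DeEntitize turns &amp; etc. into characters.
        return HtmlEntity.DeEntitize(cell.InnerText).Trim();
    }
}
```

With the `html` string downloaded above, a call might look like `TableScraper.ExtractCell(html, "//td[@id='traffic']")`, where the `id` value is only a placeholder for whatever identifies the cell on the real page.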

System Down

Assuming it's a simple GET, use System.Net.WebClient.DownloadString(...) to fetch the page, and then look for the cell's content using a regular expression (System.Text.RegularExpressions.Regex).
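
A minimal sketch of the regular-expression step, assuming the target cell carries an `id` attribute. The names `CellScraper` and `ExtractCell` and the id `traffic` are made up for illustration; a pattern like this is brittle and only suits simple, stable pages.

```csharp
using System.Net;
using System.Text.RegularExpressions;

public static class CellScraper
{
    // Hypothetical helper: returns the text of the first <td> whose id
    // attribute equals cellId, or null if no such cell is found.
    public static string ExtractCell(string html, string cellId)
    {
        // Match <td ... id="cellId" ...> and lazily capture up to the next </td>.
        string pattern = "<td[^>]*\\bid\\s*=\\s*[\"']" + Regex.Escape(cellId)
                       + "[\"'][^>]*>(.*?)</td>";
        Match m = Regex.Match(html, pattern,
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        if (!m.Success)
        {
            return null;
        }

        // Remove any nested tags, decode entities such as &amp;, and trim whitespace.
        string inner = Regex.Replace(m.Groups[1].Value, "<[^>]+>", "");
        return WebUtility.HtmlDecode(inner).Trim();
    }
}
```

You would pass it the string returned by DownloadString, for example `CellScraper.ExtractCell(html, "traffic")`. If the cell has no id, the pattern has to key off something else on the page, such as nearby text or the cell's position.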

Kirk B.

Take a look at WebFetch.

It is a pretty good tutorial with sample code for fetching HTTP content.

oberfreak