
I want to extract all elements from the table with id = statsTable so that I can write the data to a CSV file.

Here is what I have so far:

// Create a request for the URL. 
WebRequest request = WebRequest.Create("http://www.pgatour.com/stats/stat.120.html");
Console.WriteLine("Requesting data from: http://www.pgatour.com/stats/stat.120.html");

// If required by the server, set the credentials.
request.Credentials = CredentialCache.DefaultCredentials;

WebResponse response = request.GetResponse();

using (Stream stream = response.GetResponseStream())
{
    StreamReader reader = new StreamReader(stream);

    // Convert the HTML response to a string
    String responseString = reader.ReadToEnd();

    HtmlDocument doc = new HtmlDocument();

    doc.LoadHtml(responseString);

    var desktopFolder = Environment.GetFolderPath(Environment.SpecialFolder.DesktopDirectory);
    var fullFileName = Path.Combine(desktopFolder, "GolfStats.csv");

    using (var PlayerFile = new StreamWriter(fullFileName))
    {
        PlayerFile.WriteLine("Data downloaded: " + DateTime.Now);

        var myTable = doc.DocumentNode
                        .Descendants("table")
                        .Where(table => table.Attributes.Contains("id"))
                        .SingleOrDefault(table => table.Attributes["id"].Value == "statsTable");

        var myTableValues = myTable.Descendants("td");

        foreach (var tdV in myTableValues)
        {
            PlayerFile.WriteLine(tdV.InnerText);
            Console.WriteLine(tdV.InnerText);
        }

        PlayerFile.Flush();
    }
}

The problem is that my CSV lists all the data in a single column, and it also picks up an ad that is placed inside the table (see the URL in the WebRequest). Any help with writing the data out in a table format, one row per line, would be much appreciated!

Mikael Dúi Bolinder
Matt D. Webb

1 Answer


You create a new line for each table cell. To write each table row on a separate line instead, replace

var myTableValues = myTable.Descendants("td");
foreach (var tdV in myTableValues)
{
    PlayerFile.WriteLine(tdV.InnerText);
    Console.WriteLine(tdV.InnerText);
}

with

var myTableRows = myTable.Descendants("tr").Where(tr => tr.Attributes.Contains("id"));
foreach (var tr in myTableRows)
{
    string line = string.Join(";", tr.Descendants("td").Select(td => td.InnerText));
    PlayerFile.WriteLine(line);
    Console.WriteLine(line);
}

The .Where(tr => tr.Attributes.Contains("id")) call filters out the ad, since the table row containing the ad has no id attribute while all player rows have one.
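One thing the loop above does not handle: HtmlAgilityPack's InnerText returns the raw text, so entities such as &nbsp; and any line breaks inside a cell end up in the output. The following is a minimal sketch of a cleanup step, assuming that stray whitespace and entities are what you want to remove; the class name CellText and the method Clean are hypothetical and not part of any library.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not part of HtmlAgilityPack): normalizes the text of one table cell.
public static class CellText
{
    public static string Clean(string innerText)
    {
        if (innerText == null)
        {
            return string.Empty;
        }

        // Turn entities such as &amp; or &nbsp; into the characters they stand for.
        string decoded = WebUtility.HtmlDecode(innerText);

        // Collapse every run of whitespace (including line breaks and
        // non-breaking spaces) into a single space, then trim both ends.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }
}
```

In the loop above this would be used as tr.Descendants("td").Select(td => CellText.Clean(td.InnerText)), so that each row stays on one line of the file.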

Raidri
  • This is almost what I want! If I change string.Join(";" to string.Join(",", the CSV output is almost exactly what I need, except for the first two columns, which still end up in the first column. – Matt D. Webb Mar 08 '14 at 23:14
  • Separating the columns with commas will not work correctly, since some columns contain commas. Therefore I used semicolons, as Excel does for CSV files. – Raidri Mar 08 '14 at 23:21
  • Ah okay! How do I get this laid out in the same columns as on the web page? – Matt D. Webb Mar 08 '14 at 23:27
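On the comma problem raised in the comments: an alternative to switching separators is to quote any field that contains the separator, which is the convention described in RFC 4180. The sketch below is one possible way to do that, not something proposed in the answer; the class CsvLine and its methods are hypothetical names, and whether a given spreadsheet program splits on commas or semicolons can depend on its regional settings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: builds one CSV line with RFC 4180 style quoting.
public static class CsvLine
{
    // Wraps a field in double quotes when it contains the separator,
    // a double quote, or a line break; embedded quotes are doubled.
    public static string Quote(string field, char separator)
    {
        field = field ?? string.Empty;
        bool needsQuotes = field.IndexOf(separator) >= 0
            || field.IndexOf('"') >= 0
            || field.IndexOf('\n') >= 0
            || field.IndexOf('\r') >= 0;

        return needsQuotes
            ? "\"" + field.Replace("\"", "\"\"") + "\""
            : field;
    }

    // Joins already extracted cell texts into a single CSV line.
    public static string Join(IEnumerable<string> fields, char separator = ',')
    {
        return string.Join(separator.ToString(), fields.Select(f => Quote(f, separator)));
    }
}
```

With this in place, the line in the answer's loop could be built as CsvLine.Join(tr.Descendants("td").Select(td => td.InnerText)), so a cell containing a comma stays in its own column.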