I am new to programming and am writing a program in VB.NET. The program is supposed to read the data table from http://www.xe.com/currencytables/?from=AUD&date=2014-09-18 and save the table in a text file. I have searched the web but have been unable to find an answer. I would appreciate any help. Below is what I have so far:

Private Sub Button6_Click(sender As Object, e As EventArgs) Handles Button6.Click

    Dim document As New HtmlAgilityPack.HtmlDocument
    Dim myHttpWebRequest = CType(WebRequest.Create("http://www.xe.com/currencytables/?from=AUD&date=2014-09-18"), HttpWebRequest)

    myHttpWebRequest.UserAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
    Dim streamRead = New StreamReader(CType(myHttpWebRequest.GetResponse(), HttpWebResponse).GetResponseStream)
    Dim res As HttpWebResponse = myHttpWebRequest.GetResponse()
    document.Load(res.GetResponseStream, True)

    Dim tabletag2 As HtmlNode = document.DocumentNode.SelectSingleNode("//div[@class='ICTtableDiv']//tbody")
    If tabletag2 IsNot Nothing Then
        My.Computer.FileSystem.WriteAllText("C:\temp\test.txt", tabletag2.InnerHtml, False)
    Else
        MsgBox(Nothing)
    End If
    Debug.WriteLine("finished")
End Sub

This saves a text file, but the file contains the HTML code of the table. I only need the table text. Can anyone please help?

The HTML table on the page linked above looks like this:

<div class="ICTtableDiv">
                        <table id='historicalRateTbl' class='tablesorter ICTTable'>
                            <thead> 
                                <tr>
                                    <th class="ICTCurrencyCode">
                                        Currency code
                                        <span class="nonSortAppend">&#9650;&#9660;</span>
                                    </th>
                                    <th class="ICTCurrencyName">
                                        Currency name
                                        <span class="nonSortAppend">&#9650;&#9660;</span>
                                    </th>
                                    <th class="ICTRateHeader">Units per AUD</th>
                                    <th class="ICTRateHeader">AUD per Unit</th>
                                </tr>
                            </thead>
                            <tbody>
                        <tr><td><a href='/currency/usd-us-dollar'>USD</a></td><td>US Dollar</td><td class="ICTRate">0.8982463498</td><td class="ICTRate">1.1132803381</td></tr><!-- <tr><td><a href='/currency/usd-us-dollar'>USD</a></td><td>US Dollar</td><td class="ICTRate">1.5525826958</td><td class="ICTRate">0.6440880751</td></tr> --><tr><td><a href='/currency/eur-euro'>EUR</a></td><td>Euro</td><td class="ICTRate">0.6955704202</td><td class="ICTRate">1.4376689563</td></tr><!-- <tr><td><a href='/currency/eur-euro'>EUR</a></td><td>Euro</td><td class="ICTRate">1.2973942472</td><td class="ICTRate">0.7707757316</td></tr> --><tr><td><a href='/currency/gbp-british-pound'>GBP</a></td><td>British Pound</td><td class="ICTRate">0.5485743518</td><td class="ICTRate">1.8229069527</td></tr><!-- <tr><td><a href='/currency/gbp-british-pound'>GBP</a></td><td>British Pound</td><td class="ICTRate">0.6505821652</td><td class="ICTRate">1.5370848656</td></tr> --><tr><td><a href='/currency/inr-indian-rupee'>INR</a></td><td>Indian Rupee</td><td class="ICTRate">54.5819382185</td><td class="ICTRate">0.0183210790</td></tr>

What I want is

USD US Dollar 0.8982463498 1.1132803381

for each entry in the table.

1 Answer


The following approach works with the website and the desired table. It writes all extracted rows to a file, with the fields of each row separated by commas (change String.Join("," ...) as desired).

This is a hybrid of loops and LINQ, which I find more readable (in VB.NET):

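' Requires Imports System.IO, System.Linq and HtmlAgilityPack at the top of the file.
Dim table = document.DocumentNode.SelectSingleNode("//table[@class='tablesorter ICTTable']")
Dim allCSVLines As New List(Of String)
If table IsNot Nothing Then
    ' Rows may be direct children of the table or nested inside a tbody element.
    Dim rows = table.SelectNodes("tr")
    If rows Is Nothing AndAlso table.SelectSingleNode("tbody") IsNot Nothing Then
        rows = table.SelectSingleNode("tbody").SelectNodes("tr")
    End If
    ' SelectNodes returns Nothing when nothing matches, so guard before looping.
    If rows IsNot Nothing Then
        For Each row As HtmlNode In rows
            ' InnerText drops the markup and keeps only the text of each cell.
            Dim fields = From td In row.SelectNodes("th|td").Cast(Of HtmlNode)()
                         Select td.InnerText
            Dim csvLine = String.Join(",", fields)
            allCSVLines.Add(csvLine)
        Next
    End If
    File.WriteAllLines("C:\temp\test.txt", allCSVLines)
End If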

The result (shortened, because there are 166 rows in total):

USD,US Dollar,0.8982463498,1.1132803381
EUR,Euro,0.6955704202,1.4376689563
GBP,British Pound,0.5485743518,1.8229069527
INR,Indian Rupee,54.5819382185,0.0183210790
AUD,Australian Dollar,1.0000000000,1.0000000000
CAD,Canadian Dollar,0.9832756941,1.0170087657
SGD,Singapore Dollar,1.1388903049,0.8780476888
CHF,Swiss Franc,0.8394278948,1.1912875498
MYR,Malaysian Ringgit,2.9181565764,0.3426820919
JPY,Japanese Yen,97.6309788591,0.0102426506
CNY,Chinese Yuan Renminbi,5.5165706143,0.1812720384
NZD,New Zealand Dollar,1.1033232455,0.9063526977
....
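
The question asks for space-separated lines such as "USD US Dollar 0.8982463498 1.1132803381" rather than CSV. A minimal sketch of that variation follows, assuming a document loaded as in the question. The module name `TableExport`, the function name `ExtractRowsAsText` and its `separator` parameter are hypothetical names introduced for illustration; the table id `historicalRateTbl` is taken from the HTML shown in the question.

```vbnet
Imports System.Collections.Generic
Imports System.Linq
Imports HtmlAgilityPack

Module TableExport
    ' Hypothetical helper: returns one line of text per data row of the rates table.
    Function ExtractRowsAsText(document As HtmlAgilityPack.HtmlDocument,
                               separator As String) As List(Of String)
        Dim lines As New List(Of String)
        ' tr[td] matches only rows that contain td cells, so the th header row is skipped.
        Dim rows = document.DocumentNode.SelectNodes("//table[@id='historicalRateTbl']//tr[td]")
        ' SelectNodes returns Nothing when nothing matches.
        If rows Is Nothing Then Return lines
        For Each row As HtmlNode In rows
            Dim cells = row.SelectNodes("td")
            If cells Is Nothing Then Continue For
            ' Decode HTML entities and trim surrounding whitespace in each cell.
            Dim fields = cells.Cast(Of HtmlNode)().
                Select(Function(td) HtmlEntity.DeEntitize(td.InnerText).Trim())
            lines.Add(String.Join(separator, fields))
        Next
        Return lines
    End Function
End Module
```

It could then be called as `File.WriteAllLines("C:\temp\test.txt", ExtractRowsAsText(document, " "))`. Returning the lines instead of writing them inside the helper keeps the choice of output file and separator with the caller.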

Since you are having trouble getting the desired result, this is the code I used to load the document. It's the same code that you posted above, so it's not clear why it doesn't work for you:

Dim document As New HtmlAgilityPack.HtmlDocument
Dim myHttpWebRequest = CType(WebRequest.Create("http://www.xe.com/currencytables/?from=AUD&date=2014-09-18"), HttpWebRequest)
myHttpWebRequest.UserAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"
Dim streamRead = New StreamReader(CType(myHttpWebRequest.GetResponse(), HttpWebResponse).GetResponseStream)
Dim res As HttpWebResponse = CType(myHttpWebRequest.GetResponse(), HttpWebResponse)
document.Load(res.GetResponseStream, True)
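
The loading code above creates a StreamReader that is never used and does not dispose the response. A possible tidier variant is sketched below, assuming the same URL and user agent; the module name `PageLoader` and the function name `LoadDocument` are hypothetical. It is a sketch of the same loading step and not a different technique.

```vbnet
Imports System.Net
Imports HtmlAgilityPack

Module PageLoader
    ' Hypothetical helper: downloads a page and parses it with HtmlAgilityPack.
    Function LoadDocument(url As String) As HtmlAgilityPack.HtmlDocument
        Dim request = CType(WebRequest.Create(url), HttpWebRequest)
        request.UserAgent = "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)"

        Dim document As New HtmlAgilityPack.HtmlDocument()
        ' Using blocks release the response and its stream once parsing is done.
        Using response = CType(request.GetResponse(), HttpWebResponse)
            Using stream = response.GetResponseStream()
                ' True asks the parser to detect the encoding from byte order marks.
                document.Load(stream, True)
            End Using
        End Using
        Return document
    End Function
End Module
```

A call such as `Dim document = LoadDocument("http://www.xe.com/currencytables/?from=AUD&date=2014-09-18")` would then stand in for the lines above. The type is written as `HtmlAgilityPack.HtmlDocument` because a Windows Forms project may also have `System.Windows.Forms.HtmlDocument` in scope.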
Tim Schmelter
  • Hi Tim, your solution looks awesome. I tried both solutions and am getting an error on the following line: `Dim tables = From table In document.DocumentNode.SelectNodes("//table").Cast(Of HtmlNode)()` The error says "Values cant be null" – fahad khan Sep 30 '14 at 11:21
  • @fahadkhan: give me a minute. I will edit the answer so that it works with the website and the desired table (there are multiple). – Tim Schmelter Sep 30 '14 at 11:26
  • Hi Tim, thank you so much for your response. I tried the new code but it's not creating the file in my desired path. Am I missing something? – fahad khan Sep 30 '14 at 11:39
  • Have you tried `File.WriteAllLines("C:\temp\test.txt", allCSVLines)`? Btw, use the debugger and look at `allCSVLines`, how many strings does it contain? Does it enter the `If table IsNot Nothing` at all? Are you using the same website as you've posted above? – Tim Schmelter Sep 30 '14 at 11:40
  • Yes, I am using `File.WriteAllLines("C:\temp\test.txt", allCSVLines)`. I even used `Msgbox(allCSVlines)` to see if the data is displayed, but it does not even show the message box. I think the code is ending somewhere before it reaches `File.WriteAllLines`. – fahad khan Sep 30 '14 at 11:46
  • @fahadkhan: you should really use this as an opportunity to learn how to use the debugger ;-) It's obvious that it doesn't enter the `If`. I've edited my answer to show the code which loads the document. However, it's yours. – Tim Schmelter Sep 30 '14 at 11:53
  • An you are a star. Yes i was missing something. You solved my problem and its working awesome :D Thanks a lot – fahad khan Sep 30 '14 at 11:56