I am new to programming, but the questions and answers on this site have helped me a lot. I have code that looks up information on a web page and stores it in a database. The only problem so far is that the text contains accented characters, and I don't know how to choose the correct encoding.

    ' da, ds and exped are declared elsewhere in the enclosing function.
    Dim Web1 As New HtmlAgilityPack.HtmlWeb
    Dim Doc1 As New HtmlAgilityPack.HtmlDocument
    Doc1 = Web1.Load("http://207.248.177.30/regulaciones/scd_expediente_3.asp?ID=01/0922/130214")
    Dim uri As Uri = Nothing
    ' Collect every <a> element whose href parses as a URI.
    Dim linksOnPage = From link In Doc1.DocumentNode.Descendants()
                      Where link.Name = "a" _
                      AndAlso link.Attributes("href") IsNot Nothing _
                      Let text = link.InnerText.Trim()
                      Let url = link.Attributes("href").Value
                      Where Uri.TryCreate(url, UriKind.RelativeOrAbsolute, uri)
    Dim postabla As Integer = 1
    Dim fecha1 As HtmlNode
    Dim ACR2 As HtmlNode
    Dim ACR3 As HtmlNode
    Dim ACR4 As HtmlNode
    Dim ACR5 As HtmlNode

    For Each link In linksOnPage

        Dim cb As New OleDb.OleDbCommandBuilder(da)
        Dim dsNewRow As DataRow
        Dim value1 As String
        value1 = link.url

        ' Only store links that point to the same server.
        If value1.Contains("207.248.177.30") Then

            dsNewRow = ds.Tables("infoenlace").NewRow()
            dsNewRow.Item("Expediente") = exped
            dsNewRow.Item("TipoDocumento") = link.text
            dsNewRow.Item("EnlaceWeb") = link.url

            ' Date and sender come from the table row that matches this link.
            fecha1 = Doc1.DocumentNode.SelectSingleNode("//tr[@class='tituloPantalla']/following-sibling::tr[" & postabla & "]/td[2]")
            dsNewRow.Item("Fecha") = fecha1.InnerHtml
            fecha1 = Doc1.DocumentNode.SelectSingleNode("//tr[@class='tituloPantalla']/following-sibling::tr[" & postabla & "]/td[3]")
            dsNewRow.Item("Remitente") = fecha1.InnerHtml

            postabla = postabla + 2
            ds.Tables("infoenlace").Rows.Add(dsNewRow)
            da.Update(ds, "infoenlace")
        End If
    Next

    Return uri
    End Function

Thanks a lot! The problem is that instead of storing "Nueva versión." the code stores "Nueva versi�n."

Vic
  • This might help you: http://stackoverflow.com/questions/3452343/c-sharp-and-htmlagilitypack-encoding-problem?rq=1 – Oscar Mederos Mar 20 '14 at 21:04
  • Thank you Oscar, but that page is for declaring encoding using C#. – Vic Mar 20 '14 at 21:49
  • Thank you Oscar, I had tried fixing my problem using that page, but that page is for declaring the encoding in C#. My code uses VB, and I got lost trying to adapt it. When I try to use OpenRead in the following way: Dim Doc1Client As New WebClient() Doc1 = Web1.Load(Doc1Client.OpenRead("http://207.248.177.30/regulaciones/scd_expediente_3.asp?ID=" & exped), Encoding.UTF8) it shows an error (Value of type 'System.IO.Stream' cannot be converted to 'String'). – Vic Mar 20 '14 at 21:59
  • Thank you! I did it at last with your help, I hadn't tried the last comment until now: – Vic Mar 20 '14 at 22:17
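
The code that finally worked is not included in the comment above. As a rough sketch only, and not necessarily what was actually used: HtmlWeb.Load expects a URL string, which is why passing a stream produced the error quoted above, whereas HtmlDocument.Load has an overload that accepts a Stream and an Encoding. The module name EncodingSketch and the helper LoadWithEncoding are hypothetical, and ISO-8859-1 is an assumption about the page (a replacement character in place of "ó" is what you typically get when Latin-1 bytes are decoded as UTF-8); the right value is whatever encoding the server actually uses.

```vbnet
Imports System.IO
Imports System.Net
Imports System.Text
Imports HtmlAgilityPack

Module EncodingSketch
    ' Hypothetical helper: downloads the page as raw bytes and lets
    ' HtmlAgilityPack decode them with the encoding passed in.
    Function LoadWithEncoding(address As String, enc As Encoding) As HtmlDocument
        Dim doc As New HtmlDocument()
        Using client As New WebClient()
            Using pageStream As Stream = client.OpenRead(address)
                ' HtmlDocument.Load (not HtmlWeb.Load) accepts a stream.
                doc.Load(pageStream, enc)
            End Using
        End Using
        Return doc
    End Function

    Sub Main()
        ' Assumption: the server sends Latin-1 (ISO-8859-1) text.
        Dim latin1 As Encoding = Encoding.GetEncoding("iso-8859-1")
        Dim Doc1 As HtmlDocument = LoadWithEncoding(
            "http://207.248.177.30/regulaciones/scd_expediente_3.asp?ID=01/0922/130214",
            latin1)
        ' Accented characters should now come through intact.
        Console.WriteLine(Doc1.DocumentNode.InnerText)
    End Sub
End Module
```

If you would rather keep the Web1.Load call, HtmlWeb also exposes an OverrideEncoding property that can be set before loading; which of the two fits better depends on the rest of the code.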
