0

I just have an upper-intermediate level of excel and intermediate background of VBA with excel. What I want to do is to import tables from a webpage like this which appears in the link: http://www.admision.unmsm.edu.pe/res20130914/A/011/0.html

That webpage shows 39 links and each link contains a table. So I would like to know an automatic way to import all these tables to excel.

CreamStat
  • 2,155
  • 6
  • 27
  • 43
  • Have a look at `Get External data` `From Web` in Excel (in Excel 2010, on the Data tab) – chris neilsen Dec 30 '13 at 01:24
  • I've just revised that but it seems that I have to extract the tables manually page by page and there are 40 pages which are similar and their only difference is that there are different elements in the tables. I want an automatic way to extract, for example, tables from 100 pages that have the same structure. – CreamStat Dec 30 '13 at 01:53
  • The `Get External data` is a good way to do this. You could either create 40 tables in a workbook (one for each link) or one table, with some VBA to loop through the links and update the connection string and refresh. – chris neilsen Dec 30 '13 at 01:58
  • I get it, which code vba should I use to extract tables from webpages? – CreamStat Dec 30 '13 at 02:00
  • See [this](http://stackoverflow.com/a/5543093/445425) and [this](http://stackoverflow.com/a/8766541/445425) – chris neilsen Dec 30 '13 at 02:02

1 Answers1

2

This code get the data from all the links.

Sub Extract_data()

    Dim url As String, links_count As Integer
    Dim i As Integer, j As Integer, row As Integer
    Dim XMLHTTP As Object, html As Object
    Dim tr_coll As Object, tr As Object
    Dim td_coll As Object, td As Object

    links_count = 39
    For i = 0 To links_count

        url = "http://www.admision.unmsm.edu.pe/res20130914/A/011/" & i & ".html"

        Set XMLHTTP = CreateObject("MSXML2.XMLHTTP")
        XMLHTTP.Open "GET", url, False
        XMLHTTP.send

        Set html = CreateObject("htmlfile")
        html.body.innerHTML = XMLHTTP.ResponseText

        Set tbl = html.getelementsbytagname("Table")

        Set tr_coll = tbl(0).getelementsbytagname("TR")

        For Each tr In tr_coll
            j = 1
            Set td_col = tr.getelementsbytagname("TD")

            For Each td In td_col
                Cells(row + 1, j).Value = td.innerText
                j = j + 1
            Next
            row = row + 1
        Next
    Next

    MsgBox "Done"
End Sub
Santosh
  • 12,175
  • 4
  • 41
  • 72