0

I am trying to scrape data from: http://www.boliga.dk/salg/resultater?so=1&sort=omregnings_dato-d&maxsaledate=today&iPostnr=&gade=&type=Villa&minsaledate=2017

IN connection with this I have two questions.

Michał Perłakowski has giving an excellent guide how to scrape but the code he is using is getElementById(Scraping data from website using vba). Since the webpage I want to scrape from does not use an ID. I am wondering what alternatives available. My guess would be getElementsByClassName.

My next question is how to make the macro change page (I have more than 100) can I just write "next"?

Dim appIE As Object
Set appIE = CreateObject("internetexplorer.application")

With appIE
    .navigate "http://www.boliga.dk/salg/resultater?so=1&sort=omregnings_dato-d&maxsaledate=today&iPostnr=&gade=&type=Villa&minsaledate=2017"
    .Visible = True
End With

Do While appIE.Busy
    DoEvents
Loop

Set allRowOfData = appIE.document.getElementsByName("Bygget")

Dim myValue As String: myValue = allRowOfData.Cells(7).innerHTML

appIE.Quit
Set appIE = Nothing

Range("A1").Value = myValue
Community
  • 1
  • 1
thomasuponor
  • 23
  • 1
  • 7

1 Answers1

1

Try this:

Option Explicit

Sub scrape()

    Dim appIE As Object
    Dim ihtml As Object

    Set appIE = CreateObject("internetexplorer.application")

    With appIE

        .Visible = True
        .navigate "http://www.boliga.dk/salg/resultater?so=1&sort=omregnings_dato-d&maxsaledate=today&iPostnr=&gade=&type=Villa&minsaledate=2017"

        While .Busy = True Or .readyState < 4: DoEvents: Wend

        Set ihtml = .document


        Dim allRowOfData As Object

        Set allRowOfData = appIE.document.getElementById("searchresult")

        Dim r As Long, c As Long

        Dim curHTMLRow As Object


        For r = 1 To allRowOfData.Rows.Length - 1

            Set curHTMLRow = allRowOfData.Rows(r)

            For c = 0 To curHTMLRow.Cells.Length - 1 'comment out
                Cells(r + 1, c + 1) = curHTMLRow.Cells(c).innerText    '  Cells(r + 1, c + 1) = curHTMLRow.Cells(7).innerText
            Next c 'comment out

        Next r

        .Quit

    End With

    Set appIE = Nothing

End Sub

Just the column of interest rather than whole table:

Option Explicit

Sub scrape()

    Dim appIE As Object
    Dim ihtml As Object

    Set appIE = CreateObject("internetexplorer.application")

    With appIE

        .Visible = True
        .navigate "http://www.boliga.dk/salg/resultater?so=1&sort=omregnings_dato-d&maxsaledate=today&iPostnr=&gade=&type=Villa&minsaledate=2017"

        While .Busy = True Or .readyState < 4: DoEvents: Wend

        Set ihtml = .document


        Dim allRowOfData As Object

        Set allRowOfData = appIE.document.getElementById("searchresult")

        Dim r As Long, c As Long

        Dim curHTMLRow As Object


        For r = 1 To allRowOfData.Rows.Length - 1

            Set curHTMLRow = allRowOfData.Rows(r)
            Cells(r + 1, c + 1) = curHTMLRow.Cells(7).innerText

        Next r

        .Quit

    End With

    Set appIE = Nothing


End Sub

Reference:

https://www.experts-exchange.com/questions/28571716/Excel-VBA-WEb-Data-Scraping-from-a-Table.html

QHarr
  • 83,427
  • 12
  • 54
  • 101
  • That allowed me to return 1935 – thomasuponor Feb 22 '18 at 12:26
  • Tried to ran it a few times. It returns 1,3, or 4 results. – thomasuponor Feb 22 '18 at 12:30
  • I have now allowed for all table to be returned or if you comment out the two lines mentioned and swop in Cells(r + 1, c + 1) = curHTMLRow.Cells(7).innerText instead of Cells(r + 1, c + 1) = curHTMLRow.Cells(c).innerText then you will have just the column of interest – QHarr Feb 22 '18 at 12:34
  • Now it returns all the data which is alright. But it does not change the page – thomasuponor Feb 22 '18 at 12:42
  • I think it would be good to post how to make the macro change page as a new question and give an example of what you want. Are you expecting to enter a value somewhere and then refresh the page? – QHarr Feb 22 '18 at 14:21
  • Thank you very much for your effort. I am now able to scrape the page. Now I am just working on to make the macro automatically change the page. I will try posting that as a new question. – thomasuponor Feb 22 '18 at 15:07
  • That would be good. I have now seen what you are trying to do, I think. You want to iterate through the different results pages grabbing each set of results from the the results table. Writing them all out into one big long list? – QHarr Feb 22 '18 at 15:08
  • Yes!, you are absolutely right about it. I am making the new question a few minutes. – thomasuponor Feb 22 '18 at 15:16
  • No, I do also have permission to use Google Chrome. – thomasuponor Feb 22 '18 at 15:26
  • Might be faster working with XHR and having XMLHTTP rather than IE. But see what responses you get to the new question. If I have time I will have a look later. – QHarr Feb 22 '18 at 15:28
  • Alright. I will try to look into that also. Would you mind trying to explain me the macro you posted yesterday step by step. Some of it I understand and some of it I dont. Thanks! – thomasuponor Feb 23 '18 at 07:17
  • Basically I target the table by Id. Then, as you would with an array, outer loop rows with inner loop columns. – QHarr Feb 23 '18 at 08:43
  • I had trouble finding the ID but now I see from where you got that. Can you explain this to me: While .Busy = True Or .readyState < 4: DoEvents: Wend – thomasuponor Feb 23 '18 at 10:49
  • that is basically telling the program to loop until webpage has loaded – QHarr Feb 23 '18 at 10:51