4

Task:

This is my first foray into Selenium, and I am attempting to:

  1. Find the number of pages in the pagination set listed at the bottom of https://codingislove.com/. This is purely to support task 2 by determining the loop end.
  2. Loop over them.

I believe these two tasks are linked, but for those who want a single issue: I simply want to find the correct collection and loop over it to load each page.

The number of pages is, at time of writing, 6 as seen at the bottom of the webpage and shown below:

[Image: pagination set]

As an MCVE, I simply want to find the number of pages and click my way through them using Selenium Basic.

What I have tried:

I have read through a number of online resources; a few are listed in the references.

Task 1)

It seems that I should be able to find the count of pages using the Size property. But I can't seem to find the right object to use this with. I have made a number of attempts; a few shown below:

bot.FindElementsByXPath("//*[@id=""main""]/nav/div/a[3]").Size '<==this I think is too specific
bot.FindElementsByClass("page-numbers").Size

But these yield the run-time error 438:

"Object does not support this property or method"

And the following doesn't seem to expose the required methods:

bot.FindElementByCss(".navigation.pagination")

I have fudged with

bot.FindElementsByClass("page-numbers").Count + 1 

But I would like something more robust.
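One way to avoid the "+ 1" fudge might be to read the highest page number shown in the pagination labels rather than counting the elements. The following is only a sketch under assumptions: TrailingNumber and LastPageNumber are hypothetical helper names, it assumes the elements with class "page-numbers" expose text such as "2", "Page 6", "…" or "Next", and it needs the Selenium Type Library reference.

```vba
Option Explicit

' Hypothetical helper: returns the trailing integer in a label such as
' "6" or "Page 6", or 0 when the label does not end in digits ("Next", "…").
Public Function TrailingNumber(ByVal label As String) As Long
    Dim pos As Long, digits As String

    label = Trim$(label)
    pos = Len(label)
    ' Walk backwards from the end while the characters are digits.
    Do While pos > 0
        If Mid$(label, pos, 1) Like "#" Then
            digits = Mid$(label, pos, 1) & digits
            pos = pos - 1
        Else
            Exit Do
        End If
    Loop
    If Len(digits) > 0 Then TrailingNumber = CLng(digits)
End Function

' Hypothetical helper: scans every "page-numbers" element on the current
' page and returns the largest page number found in their text.
Public Function LastPageNumber(ByVal bot As WebDriver) As Long
    Dim el As WebElement, n As Long, best As Long

    For Each el In bot.FindElementsByClass("page-numbers")
        n = TrailingNumber(el.Text)
        If n > best Then best = n
    Next el
    LastPageNumber = best
End Function
```

With these in place, pageCount = LastPageNumber(bot) would replace the Count + 1 line, provided the label text on the site really does end in the page number.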

Task 2)

I know that I can navigate to the next page, from page 1, with:

bot.FindElementByXPath("//*[@id=""main""]/nav/div/a[3]").Click

But I can't use this in a loop, presumably because the XPath needs to be updated. If it is not updated, it leads to run-time error 13.

[Image: run-time error 13]

As the re-directs follow a general pattern of

href="https://codingislove.com/page/pageNumber/"

I can again fudge my way through by constructing each URL in the loop with

bot.Get "https://codingislove.com/page/" & i & "/"

But I would like something more robust.

Question:

How do I loop over the pagination set in a robust fashion using Selenium? I am sure I am having a dense day and that there should be an appropriate, easy-to-target collection to loop over.

Code - My current attempt

Option Explicit
Public Sub scrapeCIL()
    Dim bot As New WebDriver, i As Long, pageCount As Long

    bot.Start "chrome", "https://codingislove.com"
    bot.Get "/"
    pageCount = bot.FindElementsByClass("page-numbers").Count + 1

    For i = 1 To pageCount 'technically can loop from 2 I know!
        ' bot.FindElementByXPath("//*[@id=""main""]/nav/div/a[3]").Click 'runtime error 13
        ' bot.FindElementByXPath("//*[@id=""main""]/nav/div/a[2]/span").Click 'runtime error 13
        bot.Get "https://codingislove.com/page/" & i & "/"
    Next i

    Stop

    bot.Quit
End Sub

Note:

Any supported browser will do. It doesn't have to be Chrome.

References:

  1. Finding the number of pagination buttons in Selenium WebDriver
  2. http://seleniumhome.blogspot.co.uk/2013/07/how-can-we-automate-pagination-using.html

Requirements:

  1. Selenium Basic
  2. ChromeDriver 2.37 'Or use IE but zoom must be at 100%
  3. VBE Tools > references > Selenium type library
QHarr
  • QHarr is this solved? – drec4s Apr 18 '18 at 13:12
  • @drec4s Not as yet. I can see how anonygoose 's suggestion might work but I have been unable to implement in a loop with .Click (including with ignoring errors) – QHarr Apr 18 '18 at 13:14
  • Not really an answer because I'm just directing you to other questions with the answer, but you need to get yourself to the right frame to find the element to click. Look at my question [here](https://stackoverflow.com/questions/43808508/driving-a-website-using-vba-and-selenium) and the follow up question [here](https://stackoverflow.com/questions/43873072/driving-a-website-using-vba-and-selenium-pt2). – FreeMan Apr 18 '18 at 13:20
  • @FreeMan Thanks. Will do. – QHarr Apr 18 '18 at 13:21

4 Answers

3

To click the element, it must be visible on the screen, so you need to scroll to the bottom of the page first (Selenium might do this implicitly sometimes, but I don't find it reliable).

Try this:

Option Explicit
Public Sub scrapeCIL()
    Dim bot As New WebDriver, btn As Object, i As Long, pageCount As Long

    bot.Start "chrome", "https://codingislove.com"
    bot.Get "/"
    pageCount = bot.FindElementsByClass("page-numbers").Count

    For i = 1 To pageCount

        bot.ExecuteScript ("window.scrollTo(0,document.body.scrollHeight);")

        Application.Wait Now + TimeValue("00:00:02")

        On Error Resume Next
        Set btn = bot.FindElementByCss("a[class='next page-numbers']")
        If btn.IsPresent = True Then
            btn.Click
        End If
        On Error GoTo 0

    Next i

    bot.Quit

End Sub
drec4s
2

Similar principle:

Option Explicit

Public Sub GetItems()
    Dim i As Long

    With New ChromeDriver
        .Get "https://codingislove.com/"

        For i = 1 To 6
            .FindElementByXPath("//*[@id=""main""]/nav/div/a[3]").SendKeys ("Keys.PageDown")

            Application.Wait Now + TimeValue("00:00:02")
            On Error Resume Next
            .FindElementByCss("a.next").Click
            On Error GoTo 0
        Next i
    End With
End Sub

Reference:

'http://seleniumhome.blogspot.co.uk/2013/07/how-to-press-keyboard-in-selenium.html

QHarr
1

If you're only interested in clicking through each of the pages (and getting the number of pages is just an aid to doing this) then you should be able to click this element until it's no longer there:

<span class="screen-reader-text">Next Page</span>

Using

bot.FindElementByXpath("//span[contains(text(), 'Next Page')]")

Have a loop click that link on each page load. Eventually it won't be there. Then use VBA's error/exception handling to handle whatever the equivalent of NoSuchElementException is in this implementation of WebDriver. You will need to re-find the element each time in the loop.
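A rough sketch of that loop in Selenium Basic follows. It rests on some assumptions: that the FindElementByCss method accepts the optional timeout and Raise arguments and returns Nothing when Raise is False and no element matches, that "a.next" matches the next-page link, and that the link carries an href of the form quoted in the question. Instead of clicking, it navigates to the link's href, which is a swapped-in technique intended to sidestep the "not clickable" error. FollowNextLinks is a hypothetical name.

```vba
Option Explicit

Public Sub FollowNextLinks()
    Dim bot As New WebDriver, nextLink As WebElement, visited As Long

    bot.Start "chrome", "https://codingislove.com"
    bot.Get "/"

    Do
        visited = visited + 1
        ' Re-find the link on every page. Raise:=False is assumed to return
        ' Nothing rather than raising an error when nothing matches.
        Set nextLink = bot.FindElementByCss("a.next", timeout:=0, Raise:=False)
        If nextLink Is Nothing Then Exit Do   ' no next link: last page reached
        ' Navigate to the link target instead of clicking the element.
        bot.Get nextLink.Attribute("href")
    Loop

    Debug.Print "Pages visited: " & visited
    bot.Quit
End Sub
```

Because the element is looked up afresh on each iteration, no stale reference is carried across page loads, and no page count is needed to end the loop.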

anonygoose
  • Hi, So how would I deploy this? I tried bot.FindElementByXPath ("//span[contains(text(), 'Next Page')]").Click but that didn't work. Runtime error 13. Could I use css to try and reach this? – QHarr Apr 18 '18 at 12:29
  • If it's the same text as the runtime error shown in your question, I would say you need to add some kind of wait before clicking the button. The error seems to suggest something may be overlapping it for a period of time. Try adding a 5 second wait and seeing if it helps. If it does, you may need to research VBA/Selenium Basic waits. The error you posted says something like "Not clickable. Other element would receive the click" – anonygoose Apr 18 '18 at 12:32
  • That sounds like you're doing a FindElementsByClass somewhere and using spaces in the class. E.g. bot.FindElementsByClass("this does not work in selenium webdriver"); – anonygoose Apr 18 '18 at 12:36
  • Sorry. I deleted comment because I realised that. I am getting the same error 13 element not clickable at..... I introduced up to 10 second wait times. – QHarr Apr 18 '18 at 12:37
  • So I think what's going on is the behaviour where articles load as you go further down the page. As WebDriver reaches the bottom of the page, the last article that loads up moves up and over the navigation buttons. I think WebDriver is only reaching the bottom of the page when you tell it to click the navigation link, so the wait isn't working. You need to be at the bottom of the page before the click command, basically. That way all articles will have loaded, and nothing will be above the navigation links. At this point, I feel like your bodge of loading the pages via URL may be easiest. – anonygoose Apr 18 '18 at 12:41
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/169243/discussion-between-qharr-and-anonygoose). – QHarr Apr 18 '18 at 12:42
1

How about trying it like this? A few days back I figured out that there is an option, .SendKeys("keys.END"), which takes you to the bottom of the page so that the driver can reach the element to click. I used If Err.Number <> 0 Then Exit Do within the Do loop so that the scraper breaks out of the loop on any error; in this case, that is the element-not-found error raised once the last page has been reached.

Give this a shot:

Sub GetItems()
    Dim pagenum As Object

    With New ChromeDriver
        .Get "https://codingislove.com/"

        Do
            On Error Resume Next
            Set pagenum = .FindElementByCss("a.next")
            pagenum.SendKeys ("Keys.END")
            Application.Wait Now + TimeValue("00:00:03")
            pagenum.Click
            If Err.Number <> 0 Then Exit Do
            On Error GoTo 0
        Loop
        .Quit
    End With
End Sub

Reference to add to the library:

Selenium Type Library
SIM
  • Nice. I think it gets to page 3, the ... pages, and errors out, but I can see how to use this. +1 – QHarr Apr 18 '18 at 15:48
  • Would be very nice if there was any option for `Explicit Wait` instead of `Hardcoded Delay`. – SIM Apr 18 '18 at 15:50
  • I have only just started with selenium. Don't some of the methods allow for an explicit wait? Or is that just timeout wait? – QHarr Apr 18 '18 at 15:54
  • Vba with selenium binding doesn't have that option. On the other hand, the `explicit wait option` is available when it binds with python. – SIM Apr 18 '18 at 15:56
  • At some point I will have to start combing your Python answers. I have let that side of things slip. – QHarr Apr 18 '18 at 15:57