
I am writing a program that loops through downloading a file from an internet site using Selenium with VBA and a Chrome browser. The program runs well, but 15% of the files end up missing: they are never downloaded even though the call was issued. I have tried many different approaches but still end up with missing files. I am running the program on my local computer and downloading to my local hard drive.

How can I check that a file is fully downloaded? Others have asked this same question, and there do not appear to be great solutions. Some people using Chrome have been able to open another tab by sending a Ctrl+T command and loading the Chrome downloads page (chrome://downloads/), but opening a new browser tab does not seem to work consistently. For example: https://github.com/danwagnerco/selenium-vba/issues/50.

Is there a way to detect a completely downloaded file through the file system object or some other way? It's easy to tell that a file has started to be created, but I haven't found a way to tell when the file has stopped being used by another process (i.e., a download process). Is there a way to do this through the file system?

2 Answers


Can you clear your browser's cache and retry? I've noticed weird behavior before the cache is cleared, and afterwards things seem to run smoothly. That said, almost all of my experience has been with Selenium + Python, NOT Selenium + Excel/VBA. Anyway, try it and see if it helps. If not, can you share your code here? All we can do is speculate if you never share the code.

ASH
  • I have several pages of code, and plenty that I have rewritten from trials that didn't work. I do think that part of this is a VBA specific issue. I'm not sure what I could share of code that would be useful. It's not a browser cache issue, since the browser starts up a fresh new browser on each run. – Alex Campbell Feb 21 '21 at 03:54

The challenge is that the Selenium implementation in VBA does not include all of the capabilities that are available in other languages. However, I did find a way to solve this. I got the breakthrough clue from the second answer to this question: "How to open a new window on a browser using Selenium WebDriver for python?". Here is a sample program based on that answer:

Sub Test()
    Dim Driver As New WebDriver
    
    Driver.Start "chrome"
    Driver.Get ("https://linkedin.com")
    Debug.Print Driver.Window.Title     ' LinkedIn
    
    ' open new tab
    Driver.ExecuteScript ("window.open('https://twitter.com')")
    Debug.Print Driver.Window.Title     ' LinkedIn
    Driver.SwitchToNextWindow
    Debug.Print Driver.Window.Title     ' Twitter

    ' Update new tab
    Debug.Print "Twitter window should go to facebook "
    Debug.Print Driver.Window.Title     ' Twitter
    Driver.Get ("http://facebook.com")
    Debug.Print Driver.Window.Title     ' FaceBook

    ' Update old tab
    Driver.SwitchToPreviousWindow
    Debug.Print "Linkedin should go to gmail "
    Debug.Print Driver.Window.Title     ' LinkedIn
    Driver.Get ("http://gmail.com")
    Debug.Print Driver.Window.Title     ' Gmail

    ' Update new tab
    Driver.SwitchToPreviousWindow
    Debug.Print "Facebook window should go to Google "
    Debug.Print Driver.Window.Title     ' FaceBook
    Driver.Get ("http://google.com")
    Debug.Print Driver.Window.Title     ' Google
    
    Driver.Quit
End Sub

Selenium in VBA also has a SwitchToWindowByTitle option. In testing, that option did not work for me; it seemed to confuse the browser as to which window it should use. I also found that the first time I opened a new window, I needed to call SwitchToNextWindow before it was accessible, but each subsequent switch needed to be to the previous window. I suspect that VBA would not be able to manage switching between three tabs/windows, but two was enough for my purposes.

I also found that the driver title sometimes did not get updated in time for the Debug.Print line (mostly for Twitter), but I could see that the switching was working properly. This is likely due to the asynchronous load pattern used on many websites: they populate a minimal amount of the page, release it to the user, and then continue loading the rest afterwards. The window title seems to be one of the later elements populated on some of the test pages here.
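For the file-system side of the original question, one possible complement is to poll the download folder directly. This is only a sketch and is not part of the tested solution above. It assumes Excel as the VBA host (for Application.Wait) and relies on Chrome writing in-progress downloads under a temporary .crdownload name and renaming them on completion. The helper names WaitForDownload and IsFileLocked are hypothetical.

```vba
Option Explicit

' Hypothetical helper: True if another process still holds the file open.
' Call it only for a file that already exists, because opening in
' Binary mode would create a missing file.
Function IsFileLocked(ByVal filePath As String) As Boolean
    Dim fileNum As Integer
    fileNum = FreeFile
    On Error Resume Next
    ' Ask for exclusive access; this fails while a writer has the file open.
    Open filePath For Binary Access Read Write Lock Read Write As #fileNum
    IsFileLocked = (Err.Number <> 0)
    Close #fileNum
    On Error GoTo 0
End Function

' Hypothetical helper: waits until targetPath exists, no ".crdownload"
' file remains in its folder, and the file is no longer locked.
' Returns False if timeoutSeconds elapses first.
Function WaitForDownload(ByVal targetPath As String, _
                         Optional ByVal timeoutSeconds As Long = 60) As Boolean
    Dim folderPath As String
    Dim deadline As Date

    folderPath = Left$(targetPath, InStrRev(targetPath, "\"))
    deadline = DateAdd("s", timeoutSeconds, Now)

    Do While Now < deadline
        ' Dir returns "" when nothing matches the pattern.
        If Dir(targetPath) <> "" Then
            If Dir(folderPath & "*.crdownload") = "" Then
                If Not IsFileLocked(targetPath) Then
                    WaitForDownload = True
                    Exit Function
                End If
            End If
        End If
        ' Assumption: Excel host. Pause one second between checks.
        Application.Wait DateAdd("s", 1, Now)
    Loop
End Function
```

A call such as WaitForDownload("C:\Downloads\report.csv", 120) (the path is a made-up example) would go right after the step that triggers the download, and a False result could be used to retry that file. Note that any other Chrome download still in progress in the same folder would also hold up the wait, so this fits best when files are downloaded one at a time.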