I'm trying to download PDF files automatically through VBA. I have already managed to navigate to many URLs automatically, extract text from the HTML with the CSS selector method querySelectorAll(), and save it into my Excel spreadsheet.
I can also click JavaScript buttons, and I know how to download PDF files in general, but that doesn't work with the PDFs from the website I am working on. It looks like the PDF files don't exist on the server but only as a blob (e.g. blob:null/7cea2352-704e-42e2-9da7-2b65082134bb ) and get converted to PDFs on the fly by some JavaScript code when I manually click the "download PDF" button in the Firefox built-in PDF preview window.
Is there a way to access those blobs through VBA and convert them into files, so that I can download them automatically like normal PDF files? I looked through several tutorials and answered questions, but they only showed how to do this in JavaScript, never in VBA.
My code so far (the relevant part):
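Sub vbaCrawler()
    Dim IE As Object, aNodeList As Object
    Dim t As Single, j As Long
    Dim counterX As Long, counterY As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.Navigate "websiteURL.com"
    ' Wait until the page has finished loading (readyState 4 = complete)
    While IE.Busy Or IE.readyState < 4: DoEvents: Wend

    t = Timer
    counterX = 1
    counterY = 1
    ' Poll for the result table cells, giving up after about 10 seconds
    Do
        DoEvents
        On Error Resume Next
        Set aNodeList = IE.document.querySelectorAll("#productPartSearchResult td")
        On Error GoTo 0
        ' Timer is a floating-point value, so compare with >= rather than =
        If Timer - t >= 10 Then Exit Do
    Loop While aNodeList Is Nothing

    If Not aNodeList Is Nothing Then
        ' Write the cell texts into the sheet, nine columns per row
        For j = 18 To aNodeList.Length - 1
            Worksheets("CurrentStep").Cells(counterY, counterX).Value = aNodeList.Item(j).innerText
            If counterX < 9 Then
                counterX = counterX + 1
            Else
                counterX = 1
                counterY = counterY + 1
            End If
        Next j
    End If

    ' [...] bunch of code to format the text data
    IE.Quit
    Set IE = Nothing
End Sub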
I could point Excel to the canvas holding the blob, but I don't know how to go from there and make Excel understand that this canvas actually holds a file that should be downloaded.
Here is a screenshot showing what happens if I right-click the image that I am trying to download to view it in the browser:
I was hoping for a file path URL so I could download that image, but it only shows the blob URL without any .png or .pdf extension, which makes it hard for me to work with. How can I download this file if the browser doesn't show me a file path but only this blob URL?
And how can I get this blob URL through VBA? Right now I only know how to get it manually by right-clicking the image with my mouse, and I can't find the blob URL anywhere in the HTML source code.
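To make the canvas idea more concrete, here is a minimal sketch of the VBA side as I imagine it. It assumes the content can be obtained as a base64 data URL (for example from the canvas method toDataURL(), which I have not verified works on this site) and only shows how such a string could be decoded and written to disk. The helper name SaveDataUrlToFile and its parameters are made up for illustration.

```vba
' Sketch only: decodes a base64 data URL ("data:<mime>;base64,<payload>")
' and writes the decoded bytes to targetPath.
' SaveDataUrlToFile is a hypothetical helper name, not part of any library.
Sub SaveDataUrlToFile(ByVal dataUrl As String, ByVal targetPath As String)
    Dim base64Text As String
    Dim xmlDoc As Object, node As Object, stream As Object

    ' Everything after the first comma is the base64 payload
    base64Text = Mid$(dataUrl, InStr(dataUrl, ",") + 1)

    ' Let MSXML decode the base64 text into a byte array
    Set xmlDoc = CreateObject("MSXML2.DOMDocument")
    Set node = xmlDoc.createElement("b64")
    node.DataType = "bin.base64"
    node.Text = base64Text

    ' Write the bytes with a binary ADODB stream
    Set stream = CreateObject("ADODB.Stream")
    stream.Type = 1                     ' 1 = adTypeBinary
    stream.Open
    stream.Write node.nodeTypedValue
    stream.SaveToFile targetPath, 2     ' 2 = adSaveCreateOverWrite
    stream.Close
End Sub
```

A call might then look like SaveDataUrlToFile IE.document.querySelector("canvas").toDataURL("image/png"), "C:\Temp\preview.png", where the "canvas" selector and the target path are placeholders. This would at best save a PNG image of what the canvas shows, not the original PDF, and toDataURL() is expected to fail with a security error if the canvas contains cross-origin content.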