When I open this link in Chrome, the PDF file is displayed in Chrome's built-in viewer.
The viewer page has buttons for downloading, printing, etc.

When I view the page source, it shows no information about the download and print buttons.
When I inspect the download icon by hovering over it, I can see the code for this button, but it is not reachable through the regular DOM.

As I understand it, this is where Shadow DOM comes in.

How do I access the download button and click it from VBA?

giovanii111

1 Answer

You could fetch the PDF with an HTTP request and save the data that way. The data comes from the web, so the browser itself gets it through a request. You don't need to automate Chrome with VBA for this. WebDriver also has good support for it: Downloading pdf file using WebRequests

I don't know if Selenium has it, but if I were you I would search further and not use the DOM to download a PDF by clicking elements.

Peyter
  • Thanks for your reply. The example you cited is too complicated for me and is not written in VBA. In my case, just clicking the download button on the PDF page is enough – giovanii111 Jul 10 '20 at 20:32
  • No, you really don't want to be clicking things in the DOM to download a PDF or any other file when it's already loaded. Look at this example in VBA: https://stackoverflow.com/questions/43038325/how-to-download-a-pdf-file-from-browser-using-excel-vba – Peyter Jul 10 '20 at 20:39
  • This method still does not work in my case. The PDF files are generated dynamically by the site and do not have separate links; each PDF file has the same link. I still need to click the download icon in Chrome's PDF viewer through the Shadow DOM somehow, but I cannot find out how to do it in VBA – giovanii111 Jul 11 '20 at 11:07
  • Take a step back and look at what you're actually doing. You found a website that has lists of PDF manuals. They have an API, which is the data2.domain part. I don't need to see all your code, but it would help to understand how you're getting the list of links to scrape. Here's why: the API lets you add a parameter. I went to the site and inspected the web traffic with Fiddler and Chrome DevTools. You can append &take=binary to any link to get this link https://data2.manualslib.com/pdf/7/642/64176-haier/wm6002a.pdf?80517c2674ab60981d9ebd32535ca98d&take=binary – Peyter Jul 11 '20 at 18:10
  • The download link is not random or dynamic. It's behind a captcha, which can be an issue, but you can solve that captcha and scrape the site with the link. Just add &take=binary and the API automatically turns the HTTP request into a download action. I also noticed that the page does some kind of AJAX response in its JavaScript. Fiddler should be able to catch that, or maybe DevTools if you set the right kind of breakpoint on the JavaScript library / resources. Regardless, the query string parameter should do the trick and save you the click into the Shadow DOM – Peyter Jul 11 '20 at 18:13
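
For reference, the query string approach from the comments could look roughly like the VBA sketch below. It is only an illustration under several assumptions: the helper names `AddTakeBinary` and `DownloadFile` and the save path `C:\Temp\wm6002a.pdf` are made up for this example, the token in the URL may have expired, and the site may still demand a solved captcha or session cookies, in which case the request will not return the PDF. It uses the late-bound `MSXML2.XMLHTTP.6.0` and `ADODB.Stream` objects, so no references need to be added.

```vba
Option Explicit

' Hypothetical helper: appends take=binary, using "&" when the URL
' already has a query string and "?" otherwise.
Function AddTakeBinary(ByVal url As String) As String
    If InStr(url, "?") > 0 Then
        AddTakeBinary = url & "&take=binary"
    Else
        AddTakeBinary = url & "?take=binary"
    End If
End Function

' Hypothetical helper: downloads url to savePath.
' Returns True only when the server answers with HTTP 200.
Function DownloadFile(ByVal url As String, ByVal savePath As String) As Boolean
    Dim http As Object
    Dim stream As Object

    Set http = CreateObject("MSXML2.XMLHTTP.6.0")
    http.Open "GET", url, False          ' False = synchronous request
    http.send

    If http.Status <> 200 Then Exit Function

    Set stream = CreateObject("ADODB.Stream")
    stream.Type = 1                      ' 1 = adTypeBinary
    stream.Open
    stream.Write http.responseBody       ' raw bytes of the response
    stream.SaveToFile savePath, 2        ' 2 = adSaveCreateOverWrite
    stream.Close

    DownloadFile = True
End Function

Sub DownloadManualExample()
    Dim pdfUrl As String

    ' URL taken from the comment above; the save path is an assumption.
    pdfUrl = "https://data2.manualslib.com/pdf/7/642/64176-haier/wm6002a.pdf?80517c2674ab60981d9ebd32535ca98d"

    If DownloadFile(AddTakeBinary(pdfUrl), "C:\Temp\wm6002a.pdf") Then
        Debug.Print "Saved"
    Else
        Debug.Print "Request failed (captcha, expired token, or missing cookies?)"
    End If
End Sub
```

An HTTP 200 does not guarantee the body is a PDF. If the site answers with a captcha page instead, the saved file will contain HTML, so it is worth checking that the file starts with `%PDF` before relying on it.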