
I have a list with metadata for hundreds of files to be downloaded, and I am trying to find out whether that can be done with a script, since manually opening hundreds of pages and clicking to download each document would be very inefficient. My knowledge of HTML and JS is next to nothing. I'm familiar with Python and Bash.

The metadata contains filenames, the hierarchy and a document ID, so that a URL can be built as below:

"https://domain.io/#/documents/download/a5dew436-2dv1-43df-9t39-cdefgxstf289"

If I paste that URL in a browser, I get to a page with a window and a link to the same URL. If that link is clicked, the file is downloaded. But if I right click the link and open in a new tab, the same page loads. In other words, I can't make a script that downloads the file with that link, as it has to be manually clicked. I'm sure I'm missing something here.

The relevant part of the content of the page is:

<a _ngcontent-ojy-c58 class="btn-link bg-star-inserted" 
href="/#/documents/download/a5dew436-2dv1-43df-9t39-cdefgxstf289" 
title="filename.pdf">
filename.pdf
</a>

Why does manually clicking the link download a file, while pasting the same link into the browser opens a web page? Is some sort of hidden magic happening when I click the link that causes the file to be downloaded? If so, how can I reproduce that magic in a script (Python, Bash, whatever...)?

--- edit ---

What have I tried so far?

So far I have tried to programmatically get the contents of the web page with a link to the file I need to download:

import requests
cont = requests.get('https://domain.io/#/documents/download/a5dew436-2dv1-43df-9t39-cdefgxstf289')

The relevant part of cont.content.decode() is below. I do not understand it; all I know is that I cannot programmatically download a file from it.

<body>
  <app-root></app-root>
  <app-redirect></app-redirect>
  <script 
      src="runtime-es2015.8018d72d32ff24b300d9.js" type="module">
  </script>
  <script 
    src="runtime-es5.8018d72d32ff24b300d9.js" nomodule defer> 
  </script>
  <script 
    src="polyfills-es5.94e2668cf22fe39657e2.js" nomodule defer>
  </script>
  <script 
    src="polyfills-es2015.55685fdcf18327bde638.js" type="module">
  </script>
  <script 
    src="scripts.fbbaa5b34df08dd651e4.js" defer>
  </script>
  <script 
    src="main-es2015.8cb6177f1451e573c1f5.js" type="module">
  </script>
  <script 
    src="main-es5.8cb6177f1451e573c1f5.js" nomodule defer>
  </script>
</body></html>
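One detail that may explain why the response is only the application shell: everything after the `#` in the URL is a fragment, and HTTP clients do not send the fragment to the server. The request above therefore only asks for the site's root page, and the document ID never leaves the client. A minimal sketch using the standard library to show which part of the URL is actually requested (the helper name `request_target` is made up for illustration):

```python
from urllib.parse import urlsplit

url = "https://domain.io/#/documents/download/a5dew436-2dv1-43df-9t39-cdefgxstf289"
parts = urlsplit(url)


def request_target(url: str) -> str:
    """Return the path (plus query, if any) that an HTTP client would request.

    The fragment is deliberately left out, because it is never sent to the server.
    """
    split = urlsplit(url)
    target = split.path or "/"
    if split.query:
        target += "?" + split.query
    return target


print(request_target(url))  # prints "/"
print(parts.fragment)       # prints "/documents/download/a5dew436-2dv1-43df-9t39-cdefgxstf289"
```

The fragment is only interpreted by the JavaScript running in the browser, which is consistent with the page being blank when scripts cannot run.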
  • What have you tried so far? Where are you stuck? Looks like this is some kind of Angular application. Have you checked for the request that is happening to download the file? Maybe it contains more details about the URL that is used? – Nico Haase Jan 21 '22 at 13:31
  • This may point you in the right direction: https://stackoverflow.com/questions/3749231/download-file-using-javascript-jquery?rq=1 – Andrew Corrigan Jan 21 '22 at 13:46
  • "Now why the difference between manually clicking a link downloads a file, while if the same link is pasted, a webpage open?" - The anchor element getting a click event probably triggers a JS event handler which does the downloading. The URL may be entirely irrelevant. It isn't [good design](https://en.wikipedia.org/wiki/Unobtrusive_JavaScript). – Quentin Feb 16 '22 at 14:44
  • @Quentin nice tip on bad design. I've blocked JS on the browser and the page doesn't load at all... nothing, just a blank empty page. – Raf Feb 17 '22 at 08:53
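
Following the suggestions in the comments, one possible approach is to open the browser's developer tools (Network tab), click the link, and note the HTTP request that the click handler actually makes. That request, rather than the `#/...` page URL, is what a script would need to reproduce. Below is a minimal sketch, assuming the request turns out to be a plain GET. The endpoint in `API_TEMPLATE`, the optional `headers` argument (for example a session cookie or token copied from the browser), and both function names are hypothetical placeholders, not anything confirmed for this site.

```python
import shutil
import urllib.request
from pathlib import Path

# Hypothetical endpoint: replace it with the request URL seen in the Network tab.
API_TEMPLATE = "https://domain.io/api/documents/{doc_id}/download"


def build_api_url(doc_id: str, template: str = API_TEMPLATE) -> str:
    """Fill the document ID from the metadata into the endpoint template."""
    return template.format(doc_id=doc_id)


def download_document(doc_id: str, target, headers=None) -> Path:
    """Download one document to `target`, creating parent folders as needed.

    `headers` is an optional dict of request headers; whether any are needed
    (cookies, tokens) depends on the site and is an assumption here.
    """
    request = urllib.request.Request(build_api_url(doc_id), headers=headers or {})
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=30) as response, open(target, "wb") as fh:
        shutil.copyfileobj(response, fh)
    return target
```

With the metadata list loaded, each entry's hierarchy and filename could be joined into `target`, and `download_document` called once per document ID.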

0 Answers