The page "Download the GLEIF Golden Copy and Delta Files" on the GLEIF website has buttons that download data I want to retrieve automatically with a Python script. Usually when I want to download a file I use mget or similar, but that will not work here (at least I don't think it will).

For some reason I cannot fathom, the producers of the data seem to want to force one to download the files manually. I really need to automate this to reduce the number of steps for my users (and frankly for me), since there are a great many files in addition to these and I want to automate as many of them as possible (ideally all).

So my question is this: is there some kind of Python package for doing this sort of thing? If not a Python package, is there perhaps some other tool that is useful for it? I have to believe this is a common annoyance.

  • If the page uses JavaScript to add elements, then you may need [Selenium](https://selenium-python.readthedocs.io/) to control a real web browser which can run JavaScript; see the sketch after these comments. – furas Aug 22 '20 at 00:20
  • This is off-topic. Please see [help/on-topic], [ask]. – AMC Aug 22 '20 at 00:22
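
For the JavaScript case furas mentions, a minimal sketch of the Selenium approach, assuming Selenium 4 with Chrome; `PAGE_URL` is a placeholder, not the real GLEIF address:

```python
# Minimal Selenium sketch: render the page in a real browser so its
# JavaScript runs, then collect candidate download links.
from selenium import webdriver
from selenium.webdriver.common.by import By

PAGE_URL = "https://www.gleif.org/..."  # placeholder: replace with the real page URL

driver = webdriver.Chrome()  # assumes Chrome and a matching chromedriver are installed
driver.get(PAGE_URL)
driver.implicitly_wait(10)   # give the page's JavaScript time to add the elements

# Grab the href of every anchor the rendered page contains.
links = [a.get_attribute("href")
         for a in driver.find_elements(By.TAG_NAME, "a")
         if a.get_attribute("href")]
driver.quit()

print(links)
```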

1 Answer

Yup, you can use BeautifulSoup to scrape the URLs and then download the files with requests.
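
A minimal sketch of that approach, assuming the file links appear as plain `<a href>` anchors in the served HTML and end in a recognizable extension (both assumptions; adapt the filter to the real page). `PAGE_URL` is again a placeholder:

```python
# Minimal sketch: scrape file links with BeautifulSoup, download with requests.
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

PAGE_URL = "https://www.gleif.org/..."  # placeholder: replace with the real page URL

html = requests.get(PAGE_URL, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

# Collect every link that looks like a data file; the extension list is an
# assumption. urljoin resolves relative hrefs against the page URL.
urls = [urljoin(PAGE_URL, a["href"])
        for a in soup.find_all("a", href=True)
        if a["href"].endswith((".zip", ".xml", ".csv"))]

for url in urls:
    filename = url.rsplit("/", 1)[-1]
    with requests.get(url, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        with open(filename, "wb") as f:
            for chunk in resp.iter_content(chunk_size=1 << 16):
                f.write(chunk)
```

Using `stream=True` and writing in chunks keeps large data files from being held entirely in memory during the download.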
