
I want to automatically download PDF files from a pool of sites like these:

https://www.wfp.org/publications?f%5B0%5D=topics%3A2234

https://www.unhcr.org/search?comid=4a1d3b346&cid=49aea93a6a&scid=49aea93a39&tags=evaluation%20report

https://www.unicef.org/evaluation/reports#/

I then want to upload them onto my own site.

Can I write a Python script to do this? I'd need to scrape the websites periodically so that, as soon as a new file is published, it is automatically downloaded to my server.

Lastly, assuming I'm sharing these on my own website for non-profit purposes, is this legal?

1 Answer


You can use the Python modules requests and beautifulsoup4 to periodically scrape the websites and download the PDFs, as described in "Download files using requests and BeautifulSoup".
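As a rough illustration of that approach, here is a minimal sketch, assuming the listing pages expose plain `<a href="...pdf">` links in their HTML. The function names `extract_pdf_links` and `download_new_pdfs` and the target directory are made up for this example; pages that build their listing with JavaScript may not contain the links in the raw HTML and would need a different approach.

```python
import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup


def extract_pdf_links(html, base_url):
    """Return absolute URLs of all links in `html` whose path ends in .pdf."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for a in soup.find_all("a", href=True):
        url = urljoin(base_url, a["href"])  # resolve relative links
        if urlparse(url).path.lower().endswith(".pdf") and url not in links:
            links.append(url)
    return links


def download_new_pdfs(page_url, target_dir):
    """Download PDFs linked from `page_url` that are not yet in `target_dir`."""
    os.makedirs(target_dir, exist_ok=True)
    response = requests.get(page_url, timeout=30)
    response.raise_for_status()
    saved = []
    for url in extract_pdf_links(response.text, page_url):
        filename = os.path.basename(urlparse(url).path)
        path = os.path.join(target_dir, filename)
        if os.path.exists(path):
            continue  # already downloaded on an earlier run
        pdf = requests.get(url, timeout=60)
        pdf.raise_for_status()
        with open(path, "wb") as f:
            f.write(pdf.content)
        saved.append(path)
    return saved
```

To make it periodic, you could call `download_new_pdfs(page_url, target_dir)` for each listing page from a cron job or another scheduler; because existing files are skipped, each run only fetches what is new.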

Then you can save them in your server's web path and display them dynamically.
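One way to "display them dynamically" is to build the listing from whatever is in the download directory at request time. The following is only a sketch using the standard library; `render_pdf_index` and the `/pdfs/` URL prefix are assumptions for illustration, and how you serve the resulting HTML depends on your web server or framework.

```python
import html
import os
from urllib.parse import quote


def render_pdf_index(pdf_dir, url_prefix="/pdfs/"):
    """Build an HTML list linking to every PDF currently in `pdf_dir`."""
    names = sorted(n for n in os.listdir(pdf_dir) if n.lower().endswith(".pdf"))
    items = "\n".join(
        # quote() makes the file name URL-safe, html.escape() makes it HTML-safe
        f'<li><a href="{html.escape(url_prefix + quote(n))}">{html.escape(n)}</a></li>'
        for n in names
    )
    return f"<ul>\n{items}\n</ul>"
```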

I'm not a lawyer, but I think this is not legal. It's like secretly recording a movie in the cinema and then sharing it online, which is definitely not legal.

wuerfelfreak