I was thinking of using BeautifulSoup, but I'm not very experienced with it.
Basically, this is the page with Egyptian translations:
https://mjn.host.cs.st-andrews.ac.uk/egyptian/texts/corpus/pdf/
It's very basic: there are many links to PDFs, and each link has a name.
Since the PDF files themselves are named with a jumble of numbers, I'd like to save each PDF under its proper name, i.e. the link text. (I'm not sure whether I can keep the commas in the names, though.)
Thanks in advance!
EDIT: Forgot to add my code:
import os, requests, bs4
url = 'https://mjn.host.cs.st-andrews.ac.uk/egyptian/texts/corpus/pdf/'
os.makedirs('Egypt', exist_ok=True)
res = requests.get(url)
res.raise_for_status()
soup = bs4.BeautifulSoup(res.text, 'html.parser')
document = soup.select('#I do not know what to do')
name = #I do not know how to retrieve it
docFile = open(os.path.join('Egypt', name), 'wb')
for chunk in res.iter_content(100000):
    docFile.write(chunk)
docFile.close()
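
One way this could be done, as a rough sketch only. It assumes the index page lists each PDF as an ordinary `<a>` link whose `href` ends in `.pdf` and whose text is the name to use; the page's actual markup has not been checked here. The helper names `extract_pdf_links`, `safe_filename` and `download_all` are made up for illustration. Commas are valid in file names on Windows, macOS and Linux, so they are kept; only characters such as `/` and `:` are replaced.

```python
import os
import re
from urllib.parse import urljoin

import bs4
import requests

URL = 'https://mjn.host.cs.st-andrews.ac.uk/egyptian/texts/corpus/pdf/'


def safe_filename(name):
    """Replace characters that are not allowed in file names on common systems."""
    # Commas are legal in file names, so they are left untouched.
    cleaned = re.sub(r'[\\/:*?"<>|]', '_', name).strip()
    return cleaned or 'untitled'


def extract_pdf_links(html, base_url):
    """Return (link text, absolute URL) pairs for every link ending in .pdf."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    links = []
    for a in soup.select('a[href]'):
        href = a['href']
        # Assumption: PDF links can be recognised by their extension alone.
        if href.lower().endswith('.pdf'):
            links.append((a.get_text(strip=True), urljoin(base_url, href)))
    return links


def download_all(url=URL, folder='Egypt'):
    """Download every PDF linked from the index page, named after its link text."""
    os.makedirs(folder, exist_ok=True)
    res = requests.get(url)
    res.raise_for_status()
    for name, pdf_url in extract_pdf_links(res.text, url):
        # Fetch the PDF itself; the index page response only holds the HTML.
        pdf = requests.get(pdf_url)
        pdf.raise_for_status()
        path = os.path.join(folder, safe_filename(name) + '.pdf')
        with open(path, 'wb') as doc_file:
            for chunk in pdf.iter_content(100000):
                doc_file.write(chunk)
```

Calling `download_all()` would then fetch everything into the `Egypt` folder. Note that two links with identical text would overwrite each other in this sketch, so a counter or the original numeric file name may need to be appended if the names are not unique.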