I need to automatically download documents from Web pages (with a Python script). In the HTML pages, links look like this:
href="https://foo.bar/view.php?id=123456"
When I click on such a link in a Web browser, the Web browser opens the document with its correct name - for example: document_1.pdf.
However, when I download the same document with wget:
$ wget https://foo.bar/view.php?id=123456
I do get the correct document, but under a different name: view.php@id=123456
Now, the real name of the document (document_1.pdf in this example) appears nowhere in the HTML page. How can I get it?
If it's possible for a Web browser to get at the name of the document, it must be possible also for a script to do so, but how?
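One plausible explanation, which I have not confirmed for this server, is that the name is sent in the HTTP response rather than in the HTML: browsers commonly take a download's name from the Content-Disposition response header. Below is a minimal Python sketch under that assumption. The helper name filename_from_headers is hypothetical, and the fallback to the URL path is my own choice.

```python
from email.message import Message
from urllib.parse import urlsplit
import posixpath


def filename_from_headers(content_disposition, url):
    """Return the filename suggested by a Content-Disposition header value.

    Falls back to the last path segment of the URL when the header is
    missing or carries no filename (assumed fallback, not a standard).
    """
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Keep only the final component in case the server sent a path.
            return posixpath.basename(name.replace("\\", "/"))
    return posixpath.basename(urlsplit(url).path)


# Usage sketch (needs network access; foo.bar is the placeholder host above):
# import urllib.request
# url = "https://foo.bar/view.php?id=123456"
# with urllib.request.urlopen(url) as resp:
#     name = filename_from_headers(resp.headers.get("Content-Disposition"), url)
#     with open(name, "wb") as out:
#         out.write(resp.read())
```

If the server does send that header, wget's --content-disposition option should give the same name on the command line.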