I need to automatically download documents from Web pages (with a Python script). In the HTML pages, links look like this:

href="https://foo.bar/view.php?id=123456"

When I click on such a link in a Web browser, the Web browser opens the document with its correct name - for example: document_1.pdf.

However, when I download the same document with wget:

$ wget https://foo.bar/view.php?id=123456

I do get the correct document, but under a different name: view.php@id=123456

Now, the real name of the document (document_1.pdf in this example) appears nowhere in the HTML page. How can I get it?

If it's possible for a Web browser to get at the name of the document, it must be possible also for a script to do so, but how?

user1387866
    The file download server response contains the file name in [a header](https://stackoverflow.com/questions/1628260/downloading-a-file-with-a-different-name-to-the-stored-name). – James May 26 '17 at 15:42
  • Yes, that's it. Thanks! – user1387866 May 26 '17 at 16:18

1 Answer

Resolved in comments:

The file download server response contains the file name in a header. – James
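
The header in question is normally `Content-Disposition`, which a server may send as, for example, `attachment; filename="document_1.pdf"`. Below is a minimal sketch of reading it from Python with only the standard library. It assumes the server actually sends that header; the function names `filename_from_header` and `download` are made up for this illustration, and the fallback to the last URL path segment is my own choice, not something the server guarantees.

```python
import os
from email.message import Message
from urllib.parse import urlsplit
from urllib.request import urlopen


def filename_from_header(content_disposition, url):
    """Pick a local file name from a Content-Disposition value.

    Falls back to the last path segment of the URL (an assumption of
    this sketch) when the header is missing or names no file.
    """
    if content_disposition:
        # Let the stdlib email parser handle quoting and parameters.
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        name = msg.get_filename()
        if name:
            # Keep only the final component so a server-supplied name
            # with directory parts is not used as a path.
            return os.path.basename(name)
    return os.path.basename(urlsplit(url).path) or "download"


def download(url):
    """Fetch url and save it under the name the server suggests."""
    with urlopen(url) as response:
        header = response.headers.get("Content-Disposition")
        name = filename_from_header(header, url)
        with open(name, "wb") as out:
            out.write(response.read())
    return name
```

Calling `download("https://foo.bar/view.php?id=123456")` would then save the file as `document_1.pdf` if the server replies with the header shown above. For wget itself, the `--content-disposition` option asks it to honour the same header when choosing the output name.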
stovfl