How to use Python to extract the file download link for a given website?

Question

I have been googling for two days and can't solve this. Here is my question:
I want to crawl some Mac software from http://macdownload.informer.com/. What I really need is the real download link for each piece of software. For example, go to http://macdownload.informer.com/basex/download/ and click the download button.
The page then redirects and a download dialog pops up. Using F12 in the browser, I found that the response code is 302 and that the real file link is in the Location response header (header['location']).
My question is how to get that response header in Python. My Python code looks like this:

    import requests

    # I only get response code 200 here, while I expect 302
    response = requests.get('http://macdownload.informer.com/basex/download/?cf29b90&p555c=1')
    real_download_link = response.headers['location']
    print(real_download_link)

But the result is not correct. What I expect is a link like [ files.basex.org/releases/8.2/BaseX82.zip ].
Then I checked the download button and found that it triggers an AJAX operation. So I used Selenium to simulate the click, and yes, it works. But I can't get the response header in Selenium.
So, can anyone help me solve this problem? Either extracting the response header and reading the location field directly in Python, or getting the response header through Selenium, would be fine. The Selenium code is as follows:

    def parse_soft(self, response):
        soft_url = response.selector.xpath('//div[contains(@id,"download_content")]/div[2]/a/@href').extract()
        try:
            self.browser.set_page_load_timeout(15)
            self.browser.get(soft_url[0])
        except Exception as ex:
            # The page load may time out; log it and still try the click below
            print("Exception!    " + str(ex))
        self.browser.find_element_by_class_name("download_btn").click()
        # TODO: here I want to get the response header
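
As far as I know, Selenium's WebDriver API does not expose response headers directly. One workaround worth trying is to copy the browser session's state (cookies, user agent, and the current page as referer) into a plain HTTP request made with redirects disabled. The sketch below only builds the header dictionary; build_request_headers is a hypothetical helper name, and its input is assumed to have the shape of the list of dicts returned by browser.get_cookies(). Whether the site then answers with a 302 is not something this code can guarantee.

```python
def build_request_headers(selenium_cookies, user_agent, referer):
    """Build headers for a plain HTTP client from a Selenium session's state.

    selenium_cookies: list of dicts with at least "name" and "value" keys,
                      the shape returned by WebDriver.get_cookies().
    """
    # Join all cookies into a single Cookie header value: "a=1; b=2"
    cookie_value = "; ".join(
        "{}={}".format(cookie["name"], cookie["value"])
        for cookie in selenium_cookies
    )
    headers = {"User-Agent": user_agent, "Referer": referer}
    if cookie_value:
        headers["Cookie"] = cookie_value
    return headers


# Possible usage inside parse_soft, after the click (untested against the site):
#
#   headers = build_request_headers(
#       self.browser.get_cookies(),
#       self.browser.execute_script("return navigator.userAgent"),
#       self.browser.current_url,
#   )
#   response = requests.get(download_url, headers=headers, allow_redirects=False)
#   print(response.status_code, response.headers.get("Location"))
#
# download_url is a placeholder for whatever URL the download button requests.
```

Building the headers in a separate function keeps the browser-dependent part small, so the header logic can be checked without starting a browser.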

Check this http://stackoverflow.com/questions/18439851/downloading-file-using-selenium — Andersson, Feb 23 '17 at 11:09
@B.Adler Yes. I have passed the cookies obtained from Selenium to requests.get(), but it does not work. It just returns response code 200 rather than 302. — zcg396464628, Feb 24 '17 at 03:19
@Andersson Thanks for your reply. But what I need is the real download link for the file rather than just downloading the file in the browser. — zcg396464628, Feb 24 '17 at 03:21
Not just the cookie, but the other headers as well: the user-agent, referer, encoding, etc. — B.Adler, Mar 01 '17 at 15:26
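
A rough sketch of how the suggestion in the comments could be tried, not a confirmed fix. requests follows redirects by default, so a 302 normally only shows up in response.history, and the final response reports 200; passing allow_redirects=False to requests.get() returns the redirect response itself. The version below uses only the standard library's http.client, which never follows redirects, and sends whatever browser-like headers are supplied. The function name get_redirect_location and its parameters are made up for illustration, and whether this particular site answers with a 302 for a given URL and header set is an assumption.

```python
import http.client
from urllib.parse import urlsplit


def get_redirect_location(url, extra_headers=None, timeout=15):
    """Request url once, without following redirects.

    Returns a (status_code, location) tuple; location is None when the
    response carries no Location header.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection_class = http.client.HTTPSConnection
    else:
        connection_class = http.client.HTTPConnection
    connection = connection_class(parts.hostname, parts.port, timeout=timeout)

    # Rebuild the request target from the path and query string
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query

    try:
        connection.request("GET", target, headers=extra_headers or {})
        response = connection.getresponse()
        return response.status, response.getheader("Location")
    finally:
        connection.close()


# Possible usage (header values are placeholders to be copied from the browser):
#
#   status, location = get_redirect_location(
#       "http://macdownload.informer.com/basex/download/?cf29b90&p555c=1",
#       {"User-Agent": "...", "Referer": "...", "Cookie": "..."},
#   )
#   print(status, location)
```

If the status is still 200, the redirect is probably issued by a different request (for example the AJAX call behind the download button), and that request's URL would be the one to pass in.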

0 Answers