1

I have a url that when opened, all it does is initiate a download, and immediately closes the page. I need to capture this download (a png) with python and save it to my own directory. I have tried all the usual urlib and urlib2 methods and even tried with mechanize but it's not working.

The url automatically starting a download and then closing is definitely causing some problems.

UPDATE: Specifically it is using Nginx to serve up the file with a X-Accel-Mapping header.

Cœur
  • 37,241
  • 25
  • 195
  • 267
tknickman
  • 4,285
  • 3
  • 34
  • 47
  • first of all take a look at web developer tool s in your browser. – eri Aug 09 '13 at 19:55
  • Take a look here: http://stackoverflow.com/questions/3042757/downloading-a-picture-via-urllib-and-python – Rodrigo Guedes Aug 09 '13 at 19:56
  • What about using wget? http://stackoverflow.com/questions/17872083/what-are-the-performance-related-issues-of-php-curl/17872132#17872132 or are you fixed on python? – Homer6 Aug 09 '13 at 19:58
  • @RodrigoGuedes It does the same thing all the others do, just kicks out a file with the name I give it but it's not a valid png. I can open it in sublime and it's actually html. Which is weird... – tknickman Aug 09 '13 at 20:00
  • @Homer6 Nope it has to be python. It's being built into a pyramid webapp – tknickman Aug 09 '13 at 20:02
  • How is the download triggered? Location header? Meta refresh? JavaScript? – Imran Aug 09 '13 at 21:00
  • Providing a sample URL that we can test against will get you much better results. First guess is that you'll need to enable redirect following in whatever library you are using. Second guess is you should use requests, a nicer python library for http, as they'll handle most cases gracefully. – Liyan Chang Aug 09 '13 at 23:15
  • @Imran it is a custom header I believe. – tknickman Aug 12 '13 at 18:10
  • @LiyanChang I'm running it all locally so unfortunately I can't give a url – tknickman Aug 12 '13 at 18:10
  • See the update in the original question – tknickman Aug 12 '13 at 20:16

2 Answers2

0

There's nothing particularly special about X-Accel-Mapping header. Perhaps the page makes the HTTP request with ajax, and uses the X-Accel-Mapping reader value to trigger the download?

Here's how I'd do it with urllib2:

response = urllib2.urlopen(url_to_get_x_accel_mapping_header)
download_url = response.headers['X-Accel-Mapping']
download_contents = urllib2.urlopen(download_url).read()
Imran
  • 87,203
  • 23
  • 98
  • 131
  • The problem I was having ended up being something different from what I described in my question. But for anyone who comes upon this question later this should have worked. – tknickman Aug 14 '13 at 22:25
0
import urllib
URL= YOUR_URL
IMAGE = URL.rsplit('/',1)[1]
urllib.urlretrieve(URL, IMAGE)

For details to dynamically download images from a url list visit here

Ojas
  • 39
  • 1