
Which library/module is best for downloading large 500 MB+ files in terms of speed, memory, and CPU? I was also contemplating using pycurl.
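For reference, the plain urllib2 approach I'd be comparing against is roughly the following sketch, which streams the response to disk in fixed-size chunks so memory stays flat regardless of file size (the URL and chunk size are placeholders):

    import shutil
    import urllib2

    url = 'http://example.com/big-file.iso'   # placeholder URL
    response = urllib2.urlopen(url)
    with open('big-file.iso', 'wb') as out:
        # copy in 64 KB chunks instead of reading the whole body at once
        shutil.copyfileobj(response, out, 64 * 1024)
    response.close()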

joe schmoe
  • Similar question: http://stackoverflow.com/questions/1517616/stream-large-binary-files-with-urllib2-to-file – Sam Dolan Aug 04 '10 at 02:58
  • thanks, so it looks like I gotta choose between mechanize and pycurl – joe schmoe Aug 04 '10 at 03:03
  • ...or "neither", if you like my answer that was selected to that question;-). I'm sure either of them would be fine, but they're mostly about negotiating protected access -- they can't speed up your downloads!-) You might try (directly or via Twisted) getting the huge file in pieces, if the server supports that kind of access (that's what a download manager program would do for you, and it might be better optimized and fine-tuned than anything you're going to code up;-). – Alex Martelli Aug 04 '10 at 03:21

1 Answer


At sizes of 500MB+ one has to worry about data integrity, and HTTP is not designed with data integrity in mind.

I'd rather use Python bindings for rsync (if they exist) or even BitTorrent, which was originally implemented in Python. Both rsync and BitTorrent address the data-integrity issue.
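
If no bindings turn out to be available, shelling out to the rsync command line gives you the same resume-and-verify behaviour; a minimal sketch, with the remote host and paths as placeholders:

    import subprocess

    # --partial keeps interrupted transfers so they can be resumed,
    # --checksum verifies file contents rather than trusting size/mtime.
    subprocess.check_call([
        'rsync', '--partial', '--checksum', '--progress',
        'user@host:/path/to/big-file.iso', '/local/dir/',
    ])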

jedi_coder