
I'm trying to download a video using Python.

I want to speed up the download with multithreading, but I haven't been able to come up with a solution using requests or urllib2.

If anyone can share code showing how to solve this, that would be really helpful.

Here is the code I was trying:

import requests
http_proxy  = "http://edcguest:edcguest@172.31.100.29:3128"
https_proxy = "https://edcguest:edcguest@172.31.100.29:3128"
ftp_proxy   = "ftp://edcguest:edcguest@172.31.100.29:3128"

proxyDict = {
    "http"  : http_proxy,
    "https" : https_proxy,
    "ftp"   : ftp_proxy
}

def download_file(url):
    resume_byte_pos = 0
    end_byte_pos = 432526330
    # NOTE the stream=True parameter
    resume_header = {'Range': 'bytes=%d-%d' % (resume_byte_pos, end_byte_pos)}
    r = requests.get(url, stream=True, proxies=proxyDict, headers=resume_header)
    print r.headers

    with open('ab2.mp4', 'wb') as f:
        for chunk in r.iter_content(chunk_size=1024):
            if chunk:  # filter out keep-alive new chunks
                f.write(chunk)
                f.flush()

download_file('https://r2---sn-o3o-qxal.googlevideo.com/videoplayback?key=yt5&sver=3&signature=3D4D50B11C6206B737185B7A9887A72FE356C6DF.87458BB3BF357CEF131BEDF0C0ED3DC08F087646&upn=8n6wa_1gM_o&source=youtube&requiressl=yes&mime=video%2Fmp4&ip=14.139.249.194&expire=1442331973&ratebypass=yes&lmt=1441845538878516&mm=31&ipbits=0&mn=sn-o3o-qxal&pl=24&sparams=dur%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Cratebypass%2Crequiressl%2Csource%2Cupn%2Cexpire&fexp=9408710%2C9409069%2C9409170%2C9412773%2C9415365%2C9415485%2C9415942%2C9416023%2C9416126%2C9416333%2C9416729%2C9417707%2C9417710%2C9417818%2C9418153%2C9418162%2C9418200%2C9418245%2C9418448%2C9418986%2C9419773%2C9419788%2C9419837%2C9420348%2C9420777%2C9420798&id=o-ALtyLWP7o7PqDhINh6FWp4v4FC8-3pQoZ0UH4COW6v5p&mt=1442310331&dur=8384.609&mv=m&initcwndbps=4191250&ms=au&itag=18&cpn=zzjuCmNROtaupQMW&ptk=Apple%252Bvid&oid=ffsQQyXI443h2PgMzMjp-g&ptchn=E_M8A5yxnLfW0KghEeajjw&pltype=content&c=WEB&cver=html5')
    Maybe use `multiprocessing` instead of `threading` for parallelising downloads. – yask Sep 15 '15 at 07:12
  • The server may not support requests for parts of a file. In that case you cannot speed up the download: time = file size / receive speed. Even if partial (range) requests are supported, you must spend additional system resources to combine the parts. You can do as mentioned above. – dsgdfg Sep 15 '15 at 10:53
  • 1
    Questions on stackoverflow are supposed to show what code you've tried and why you think it isn't working. Please update your question with what you have tried. Further, please think very hard about @dsgdfg's comment. Some servers may support requesting chunks, but many may not. – Ian Stapleton Cordasco Sep 15 '15 at 13:42
  • I assume OP wants to download `n` distinct files in parallel. Otherwise yes, what @dsgdfg said applies, but frankly I wouldn't waste my time like this. This is the kind of problem for which if there is an efficient solution, it comes already prepackaged in a module or library. If there is no such library, it's probably too burdensome to write your own, unless you have time and money to waste. – Tobia Tesan Sep 15 '15 at 18:43
  • Actually I have been assigned a project, so that's why I'm working on it. If you could suggest a way, please answer. – Utkarsh Gupta Sep 16 '15 at 03:05
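As the comments point out, none of this helps unless the server honours `Range` requests. A quick probe can be sketched as below; the header check is standard HTTP, but the HEAD-request part is only a hedged illustration (it assumes the `requests` package and the `proxyDict` from the question):

```python
def supports_ranges(headers):
    """True if a response's headers advertise byte-range support."""
    return headers.get('Accept-Ranges', '').lower() == 'bytes'

# Probe with a HEAD request (requires the requests package):
# import requests
# r = requests.head(url, proxies=proxyDict)
# print(supports_ranges(r.headers))
```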

1 Answer


urllib2 is not thread-safe, so caveats apply if used with threading.

However, urllib3 is indeed thread-safe, so you could consider using that.
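Because urllib3 is thread-safe, one `PoolManager` can be shared across threads that each fetch a byte range. A rough sketch, assuming the server honours `Range` headers and you already know the total size (e.g. from a `Content-Length` header); the function names and `n_parts` default are made up for illustration:

```python
import threading

def split_ranges(total_size, n_parts):
    """Split [0, total_size) into n_parts contiguous (start, end) byte
    ranges, inclusive on both ends, the way HTTP Range headers expect."""
    part = total_size // n_parts
    ranges = []
    for i in range(n_parts):
        start = i * part
        end = total_size - 1 if i == n_parts - 1 else start + part - 1
        ranges.append((start, end))
    return ranges

def download_ranged(url, total_size, out_path, n_parts=4):
    import urllib3  # imported here so the pure helper above needs nothing extra
    pool = urllib3.PoolManager()  # one pool shared by all threads is fine
    parts = split_ranges(total_size, n_parts)
    buf = [None] * n_parts

    def fetch_part(i, start, end):
        r = pool.request('GET', url,
                         headers={'Range': 'bytes=%d-%d' % (start, end)})
        buf[i] = r.data

    threads = [threading.Thread(target=fetch_part, args=(i, s, e))
               for i, (s, e) in enumerate(parts)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Reassemble the parts in order once every thread has finished.
    with open(out_path, 'wb') as f:
        for chunk in buf:
            f.write(chunk)
```

Note that this buffers each part in memory before writing; for very large files you would write each part to its own offset or temporary file instead.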

Alternatively, if you have a urllib2-based library that you don't want to rewrite and that you would be using for batch downloads, you could explore a task queue like Celery or Gearman.
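If, as one comment guesses, the goal is simply to fetch several distinct files in parallel, a full task queue may be overkill: the standard library's `concurrent.futures` covers it. A Python 3 sketch, with made-up helper names and the `requests` package assumed:

```python
import os
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlparse

def filename_from_url(url):
    """Derive a local filename from the URL path, with a fallback."""
    name = os.path.basename(urlparse(url).path)
    return name or 'download.bin'

def fetch(url):
    import requests  # any thread-safe HTTP client would do here
    r = requests.get(url, stream=True)
    with open(filename_from_url(url), 'wb') as f:
        for chunk in r.iter_content(chunk_size=8192):
            f.write(chunk)
    return url

def fetch_all(urls, workers=4):
    # The pool maps fetch() over the URLs, running up to `workers` at once.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, urls))
```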

Tobia Tesan