1

I have a Python app with a variable that contains multiple URLs.

At the moment I use something like this:

for v in arr:
        cmd = 'youtube-dl -u ' + email + ' -p ' + password + ' -o "' + v['path'] + '" ' + v['url']

        os.system(cmd)

But this way the videos are downloaded one after another. How can I download, say, 3 videos at the same time? (They are not from YouTube, so there are no playlists or channels.)

I don't necessarily need multithreading in Python; I just need to call youtube-dl multiple times, splitting the array. So from the Python side it can be a single thread.

user3541631

2 Answers

5

Use a Pool:

import multiprocessing.dummy
import subprocess

arr = [
    {'vpath': 'example/%(title)s.%(ext)s', 'url': 'https://www.youtube.com/watch?v=BaW_jenozKc'},
    {'vpath': 'example/%(title)s.%(ext)s', 'url': 'http://vimeo.com/56015672'},
    {'vpath': '%(playlist_title)s/%(title)s-%(id)s.%(ext)s',
     'url': 'https://www.youtube.com/playlist?list=PLLe-WjSmNEm-UnVV8e4qI9xQyI0906hNp'},
]

email = 'my-email@example.com'
password = '123456'

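# Number of downloads to run at the same time
concurrent = 3

def download(v):
    # 'echo' only prints the command; remove it to actually run youtube-dl
    subprocess.check_call([
        'echo', 'youtube-dl',
        '-u', email, '-p', password,
        '-o', v['vpath'], '--', v['url']])


p = multiprocessing.dummy.Pool(concurrent)
p.map(download, arr)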

multiprocessing.dummy.Pool is a lightweight thread-based version of a Pool, which is more suitable here because the work tasks are just starting subprocesses.
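As a rough illustration of how the pool size caps the number of simultaneous tasks, the sketch below swaps the youtube-dl call for a short sleep and records how many tasks run at once. `run_pool`, `fake_download`, and the `state` counter are hypothetical names used only for this demonstration.

```python
import multiprocessing.dummy
import threading
import time


def run_pool(items, concurrent=3):
    """Run a stand-in task over items; return (results, peak concurrency)."""
    lock = threading.Lock()
    state = {'running': 0, 'peak': 0}

    def fake_download(item):
        # Stand-in for the youtube-dl subprocess call
        with lock:
            state['running'] += 1
            state['peak'] = max(state['peak'], state['running'])
        time.sleep(0.05)
        with lock:
            state['running'] -= 1
        return item

    pool = multiprocessing.dummy.Pool(concurrent)
    try:
        # map() blocks until every item is processed and keeps input order
        results = pool.map(fake_download, items)
    finally:
        pool.close()
        pool.join()
    return results, state['peak']


if __name__ == '__main__':
    results, peak = run_pool(list(range(9)), concurrent=3)
    print(results, 'peak concurrency:', peak)
```

The peak never exceeds the pool size, so passing 3 gives at most three downloads in flight at any moment.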

Note that this code calls subprocess.check_call with an argument list instead of os.system; that avoids the command injection vulnerability in your original code.
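To make the difference concrete, here is a small sketch comparing the two ways of building the command. `build_argv` and `build_shell_string` are hypothetical helpers and the hostile URL is made up; nothing is executed here.

```python
def build_argv(email, password, path, url):
    """Argument-list form: every value stays exactly one argument."""
    return ['youtube-dl', '-u', email, '-p', password, '-o', path, '--', url]


def build_shell_string(email, password, path, url):
    """String concatenation as in the question, shown for comparison only."""
    return ('youtube-dl -u ' + email + ' -p ' + password
            + ' -o "' + path + '" ' + url)


if __name__ == '__main__':
    hostile_url = 'http://example.com/v; echo injected'
    # With a list, the whole string is passed to youtube-dl as one URL argument
    print(build_argv('me@example.com', 'pw', 'out.mp4', hostile_url))
    # With a shell string, a shell would treat "; echo injected" as a second command
    print(build_shell_string('me@example.com', 'pw', 'out.mp4', hostile_url))
```

The `--` before the URL also stops a value that starts with a dash from being read as an option.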

Also note that youtube-dl output templates are really powerful. In most cases, you don't actually need to define and manage file names yourself.
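The `%(NAME)s` placeholders follow the same syntax as Python's mapping-based `%` formatting, so a template can be previewed roughly like this. This is only an approximation: `preview_template` is a hypothetical helper, the field values are made up, and youtube-dl itself additionally sanitizes file names and handles missing fields.

```python
def preview_template(template, info):
    """Approximate how an output template expands for one video's metadata."""
    return template % info


if __name__ == '__main__':
    # Made-up metadata for illustration
    info = {
        'playlist_title': 'Talks',
        'title': 'Intro',
        'id': 'abc123',
        'ext': 'mp4',
    }
    print(preview_template('%(playlist_title)s/%(title)s-%(id)s.%(ext)s', info))
```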

3

I achieved the same thing using the threading library, which starts threads and is lighter-weight than spawning new processes.

Assumption:

  • Each task will download videos to a different directory.

import os
import threading
import youtube_dl

COOKIE_JAR = "path_to_my_cookie_jar"

def download_task(videos, output_dir):

    if not os.path.isdir(output_dir):
        os.makedirs(output_dir)

    if not os.path.isfile(COOKIE_JAR):
        raise FileNotFoundError("Cookie Jar not found\n")

    ydl_opts = { 
        'cookiefile': COOKIE_JAR, 
        'outtmpl': f'{output_dir}/%(title)s.%(ext)s'
    }

    with youtube_dl.YoutubeDL(ydl_opts) as ydl:
        ydl.download(videos)


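if __name__ == "__main__":

    root_dir = "./root_dir"

    # Placeholder input (not in the original answer): playlist name -> list of video URLs
    many_playlists = {
        "playlist_1": ["https://example.com/video-1"],
        "playlist_2": ["https://example.com/video-2"],
    }

    threads = []
    for name, videos in many_playlists.items():
        # Build each path from the fixed root so the directories do not nest
        output_dir = f"{root_dir}/{name}"
        thread = threading.Thread(target=download_task, args=(videos, output_dir))
        threads.append(thread)

    # Actually start downloading
    for thread in threads:
        thread.start()

    # Wait for all the downloads to complete
    for thread in threads:
        thread.join()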
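The code above starts one thread per playlist with no upper limit. If you want at most 3 running at once, as the question asks, one option is a thread pool from the standard library. This is a minimal sketch, not part of the original answer: `run_limited` is a hypothetical helper, and the task is passed in so the function does not depend on youtube_dl.

```python
from concurrent.futures import ThreadPoolExecutor


def run_limited(task, jobs, max_workers=3):
    """Call task(videos, output_dir) for each job, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(task, videos, output_dir)
                   for videos, output_dir in jobs]
        # result() waits for the task and re-raises any exception it raised
        return [future.result() for future in futures]


if __name__ == '__main__':
    # Stand-in task; in the answer's code this would be download_task
    def show(videos, output_dir):
        return f"{len(videos)} video(s) -> {output_dir}"

    jobs = [(["url-a", "url-b"], "./root_dir/one"), (["url-c"], "./root_dir/two")]
    print(run_limited(show, jobs))
```

With the answer's function, the call would look like `run_limited(download_task, jobs)`, where each job is a `(videos, output_dir)` pair.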
kohane15
Dat