2

So I want to download a file, but I don't need all of it. Is it possible to skip the first quarter of the file and download only the rest?

I have tried the Python youtube-dl package. There are some flags that look relevant and might work, but I don't know how to use them.

If anyone has attempted this before, would you mind sharing how you went about it? Or is it even possible?

qwdh

3 Answers

5

Someone has suggested using a combination of ffmpeg and youtube-dl to do exactly what you want here:

https://askubuntu.com/questions/970629/how-to-download-a-portion-of-a-video-with-youtube-dl-or-something-else

Here is the suggested example, as is, from the link above. As you can see, youtube-dl is only used to fetch the video URL; ffmpeg does the actual work:

ffmpeg -i $(youtube-dl -f 22 --get-url https://www.youtube.com/watch?v=ZbZSe6N_BXs) -ss 00:00:10 -t 00:00:30 -c:v copy -c:a copy happy.mp4

Here is an example that launches ffmpeg in a similar way from Python and downloads a piece of a 3-hour video:

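import subprocess

import youtube_dl

URL = "https://www.youtube.com/watch?v=eyU3bRy2x44"
FROM = "00:00:15"      # start position
DURATION = "00:00:25"  # passed to ffmpeg's -t, which is a duration, not an end time
TARGET = "demo.mp4"

# Resolve the direct media URL without downloading anything.
with youtube_dl.YoutubeDL({'format': 'best'}) as ydl:
    result = ydl.extract_info(URL, download=False)
    # A playlist URL returns its videos under 'entries'; take the first one.
    video = result['entries'][0] if 'entries' in result else result

url = video['url']
# Pass the arguments as a list so no shell quoting is needed on any platform.
subprocess.call(['ffmpeg', '-i', url, '-ss', FROM, '-t', DURATION,
                 '-c:v', 'copy', '-c:a', 'copy', TARGET])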
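
For the original question (skip the first quarter and keep the rest), the same idea could be sketched roughly as follows. This is only a sketch under a few assumptions: the helper names `build_ffmpeg_args` and `download_tail` are hypothetical, the info dict returned by youtube-dl is assumed to carry a usable `duration` in seconds (it can be missing, for example for live streams), and `-ss` is placed before `-i` so that ffmpeg seeks in the input instead of reading and discarding everything up to the start point. Whether that seek actually avoids downloading the skipped part depends on the server.

```python
def build_ffmpeg_args(media_url, duration_seconds, target, skip_fraction=0.25):
    """Build an ffmpeg command line that skips the first part of a stream."""
    start = int(duration_seconds * skip_fraction)
    return [
        "ffmpeg",
        "-ss", str(start),  # before -i: seek in the input to this second
        "-i", media_url,
        "-c", "copy",       # copy the streams, no re-encoding
        target,
    ]


def download_tail(page_url, target, skip_fraction=0.25):
    """Resolve the media URL with youtube-dl, then let ffmpeg fetch the tail."""
    import subprocess
    import youtube_dl  # imported here so the helper above has no dependencies

    with youtube_dl.YoutubeDL({"format": "best"}) as ydl:
        result = ydl.extract_info(page_url, download=False)
        video = result["entries"][0] if "entries" in result else result

    duration = video.get("duration")
    if not duration:
        raise ValueError("no duration reported for this video")
    args = build_ffmpeg_args(video["url"], duration, target, skip_fraction)
    return subprocess.call(args)
```

Calling `download_tail("https://www.youtube.com/watch?v=eyU3bRy2x44", "tail.mp4")` would then ask ffmpeg for everything after the first 25% of the video. With stream copy, the cut lands on a nearby keyframe rather than on the exact second.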
Boris Lipschitz
  • Hmmm, but did you notice that the suggestion in the answer downloads the whole video and then recodes it? That's not exactly what he was asking for (by the way, neither is the other answer). – jottbe Jul 21 '19 at 15:42
  • But it really doesn't, I've just tried. I can't run the command as is in a Windows environment, so this is what I did: fetched the URL first by running youtube-dl -f best --get-url https://www.youtube.com/watch?v=ZbZSe6N_BXs > url.txt, then used ffmpeg with this URL (copy-pasted from url.txt), and it worked perfectly, downloading just the part necessary. – Boris Lipschitz Jul 21 '19 at 17:26
  • I've edited the answer to include a piece of Python code that uses youtube_dl and launches the ffmpeg command line. It works just fine, downloading just a few MB out of the 3-hour video. – Boris Lipschitz Jul 21 '19 at 17:54
  • Wow, what a fantastic solution. Thank you so much – qwdh Jul 22 '19 at 23:01
  • I would like to upvote you, but I don't have 15 reputation points yet. I just want to thank you again. – qwdh Jul 22 '19 at 23:04
  • Don't worry about upvoting. Although you probably should "accept" the answer (if it lets you), because using a combination of youtube-dl and ffmpeg is probably the best you are going to get. Anyway, the idea wasn't mine, but I've linked to the place I took it from. Hope it helps :-) – Boris Lipschitz Jul 22 '19 at 23:22
2

Edit: something I missed (sorry) is that you aren't just trying to download a piece of a binary file, but a piece of a YouTube video. My answer below doesn't really apply: you can't just pick a binary piece out of a video and expect it to play, at least not with most container types out of the box.

Original answer: The answer is "maybe you can". It depends on the server, which may or may not support partial downloads. Read more info here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Range_requests

If it is indeed supported, the only thing you have to do is add a Range header. The Python example below pulls the second 1 KB chunk out of a file.

import urllib.request

url = 'http://ipv4.download.thinkbroadband.com/100MB.zip'
req = urllib.request.Request(url)
req.add_header('User-Agent', 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0')
req.add_header('Range', 'bytes=1024-2047') # <=== range header
res = urllib.request.urlopen(req)
with open('test.bin', 'wb') as f:
    f.write(res.read())
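
To skip the first quarter of a plain file rather than pick a fixed chunk, an open-ended range such as `bytes=N-` can be used. The following is a rough sketch, not a definitive implementation: the helper names `tail_range_header` and `download_tail` are hypothetical, and it assumes the server answers a HEAD request with a `Content-Length` header and honors `Range` by replying with status 206.

```python
import shutil
import urllib.request


def tail_range_header(total_size, skip_fraction=0.25):
    """Return a Range header value that skips the first part of the file."""
    start = int(total_size * skip_fraction)
    return "bytes=%d-" % start  # open-ended: from `start` to the end


def download_tail(url, target, skip_fraction=0.25):
    """Download everything after the first `skip_fraction` of the file."""
    # Ask for the headers only, to learn the total size.
    head = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(head) as res:
        length = res.headers.get("Content-Length")
    if length is None:
        raise RuntimeError("server did not report Content-Length")

    req = urllib.request.Request(url)
    req.add_header("Range", tail_range_header(int(length), skip_fraction))
    with urllib.request.urlopen(req) as res:
        # 206 Partial Content means the range was honored; 200 means the
        # server ignored it and is sending the whole file.
        if res.status != 206:
            raise RuntimeError("server ignored the Range header")
        with open(target, "wb") as f:
            shutil.copyfileobj(res, f)
```

Some servers also need a browser-like `User-Agent` header, as in the example above.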
Boris Lipschitz
  • Very interesting. What is the `add_header('User-Agent...` line for? Is it to mimic a browser? – jottbe Jul 21 '19 at 07:49
  • The User-Agent header tells the server which browser it is dealing with. This server wouldn't let me download with Python's default one. – Boris Lipschitz Jul 21 '19 at 14:25
  • Thanks. I usually use `requests`, but it seems to be possible there too. This could probably have saved me some time when I scraped some websites that were _mean enough_ to reject my requests. – jottbe Jul 22 '19 at 08:38
0

Boris' point is good. Video files usually have a header that includes information such as the size of the video, the frame rate, and so on.

Additionally, frames are usually not stored completely; only the changes between frames are stored, which saves a lot of space. If you download just bytes x to y, you miss the header and you can't be sure you start on a frame boundary.

But if you want only a certain part of the YouTube video through to the end, you just need to know the second at which the part you are interested in begins. Then change the URL slightly and add &t=x, where x is the starting second (an integer).

So if you want the rest of this video starting from 02:05 (125 seconds):

https://www.youtube.com/watch?v=LUk73pUe9i4

It becomes:

https://youtu.be/LUk73pUe9i4?t=125

or, which seems to give the same result:

https://www.youtube.com/watch?v=LUk73pUe9i4&feature=youtu.be&t=125

It should be possible to use this as a url in the library you use.
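
The URL change above can be sketched in Python as follows. The helper names `timestamp_to_seconds` and `with_start_time` are hypothetical, and this only builds the URL: whether a given download library actually honors the `t` parameter is not something confirmed here.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def timestamp_to_seconds(timestamp):
    """Convert 'MM:SS' or 'HH:MM:SS' into a whole number of seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def with_start_time(url, timestamp):
    """Return `url` with its t= query parameter set to the given timestamp."""
    parts = urlparse(url)
    # Keep the existing parameters, dropping any earlier t= value.
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "t"]
    query.append(("t", str(timestamp_to_seconds(timestamp))))
    return urlunparse(parts._replace(query=urlencode(query)))
```

For example, `with_start_time("https://youtu.be/LUk73pUe9i4", "02:05")` yields the short URL shown above with `?t=125` appended.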

I don't know whether there is also a parameter for the duration, if that matters to you. But I think you just wanted the rest of the video, right?

If it does matter, you could still calculate the end, which could be difficult because there is no straight linear relationship between length in seconds and size in bytes. Alternatively (and this gets a bit messy), say you want the part of the video between second 100 and second 500. Download the first, say, 5 MB starting at second 500. Throw away enough bytes to drop the header and the initial frame, and use the rest of the bytes as a "stop" pattern. Then start downloading from second 100, and as soon as you find your pattern, you know you are past second 500. Yeah, I said it would get messy :-)

marc_s
jottbe
  • See my second answer: you can actually do this easily enough with ffmpeg, which decides what parts of the video to download and deals with the container on its own. – Boris Lipschitz Jul 21 '19 at 17:57