How can I extract video ID from YouTube's link in Python?

Question

I know this can be easily done using PHP's parse_url and parse_str functions:

$subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1";
$url = parse_url($subject);
parse_str($url['query'], $query);
var_dump($query);

But how can I achieve this in Python? I can call urlparse, but what next?

score 66 · Answer 1 · edited Jun 23 '20 at 09:47

I've created a YouTube ID parser that doesn't use regular expressions:

import urlparse

def video_id(value):
    """
    Examples:
    - http://youtu.be/SA2iWivDJiE
    - http://www.youtube.com/watch?v=_oPAwA_Udwc&feature=feedu
    - http://www.youtube.com/embed/SA2iWivDJiE
    - http://www.youtube.com/v/SA2iWivDJiE?version=3&amp;hl=en_US
    """
    query = urlparse.urlparse(value)
    if query.hostname == 'youtu.be':
        return query.path[1:]
    if query.hostname in ('www.youtube.com', 'youtube.com'):
        if query.path == '/watch':
            p = urlparse.parse_qs(query.query)
            return p['v'][0]
        if query.path[:7] == '/embed/':
            return query.path.split('/')[2]
        if query.path[:3] == '/v/':
            return query.path.split('/')[2]
    # fail?
    return None

edited Jun 23 '20 at 09:47

Erik Cederstrand

answered Oct 29 '11 at 02:04

Mikhail Kashkin

2

This one is great for parsing all of the possible youtube link formats. – Lexo May 03 '14 at 01:52
1

you can use `query.path.startswith('/embed/')` for added legibility. – Capi Etheriel Oct 11 '14 at 06:07
The above solution works well except for one scenario. https://m.youtube.com/?#/watch?v=683hzaj3oc8 It would be really helpful if I get solution for the above scenario also. – kirans_6891 May 07 '15 at 10:26
1

"i will finish what you started" ;) :: https://gist.github.com/kmonsoor/2a1afba4ee127cce50a0 – kmonsoor Jan 03 '16 at 20:38

robert · Accepted Answer · 2010-12-05T00:09:37.050

47

Python has a library for parsing URLs.

import urlparse
url_data = urlparse.urlparse("http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1")
query = urlparse.parse_qs(url_data.query)
video = query["v"][0]
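
On Python 3 the `urlparse` module no longer exists under that name; its functions live in `urllib.parse`. A minimal sketch of the same approach there, using the example URL from the question:

```python
from urllib.parse import urlparse, parse_qs

# Split the URL into its components, then parse the query string into a dict.
url_data = urlparse("http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1")
query = parse_qs(url_data.query)

# parse_qs maps every key to a list of values, hence the [0].
video = query["v"][0]
print(video)
```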

edited Dec 05 '10 at 00:09

answered Dec 05 '10 at 00:02

robert

`I can do urlparse but what next?` yeah I know, but problem is with the query part. – decarbo Dec 05 '10 at 00:07
@decarbo The updated answer shows you how to extract just the value of the `v` parameter in the query string. – Phrogz Dec 05 '10 at 05:53
yap, that's the best solution I guess. – decarbo Dec 05 '10 at 19:32
FYI this will not work when submitting `youtube.com/watch?v=hP54ne1COvY` because its missing the http – JiminyCricket Jul 08 '14 at 19:18
5

please note `urlparse` was moved to `urllib.parse` in Python3 Something along the lines of this would do the trick: `import urllib.parse as urlparse` – Ibrahim Tayseer Apr 28 '20 at 20:32

Elijah · Answer 3 · 2022-01-04T14:33:12.763

This is the Python3 version of Mikhail Kashkin's solution with added scenarios.

from urllib.parse import urlparse, parse_qs
from contextlib import suppress


# noinspection PyTypeChecker
def get_yt_id(url, ignore_playlist=False):
    # Examples:
    # - http://youtu.be/SA2iWivDJiE
    # - http://www.youtube.com/watch?v=_oPAwA_Udwc&feature=feedu
    # - http://www.youtube.com/embed/SA2iWivDJiE
    # - http://www.youtube.com/v/SA2iWivDJiE?version=3&amp;hl=en_US
    query = urlparse(url)
    if query.hostname == 'youtu.be': return query.path[1:]
    if query.hostname in {'www.youtube.com', 'youtube.com', 'music.youtube.com'}:
        if not ignore_playlist:
            # use case: get playlist id not current video in playlist
            with suppress(KeyError):
                return parse_qs(query.query)['list'][0]
        if query.path == '/watch': return parse_qs(query.query)['v'][0]
        if query.path[:7] == '/watch/': return query.path.split('/')[2]
        if query.path[:7] == '/embed/': return query.path.split('/')[2]
        if query.path[:3] == '/v/': return query.path.split('/')[2]
    # returns None for invalid YouTube url
    return None

Alex · Answer 4 · 2015-09-26T19:06:49.023

13

Here is a regular expression that covers these cases:

((?<=(v|V)/)|(?<=be/)|(?<=(\?|\&)v=)|(?<=embed/))([\w-]+)
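
A minimal sketch of applying this pattern with Python's `re` module. The helper name `find_video_id` is made up for illustration, and the ID is read from group 4 because the lookbehind alternatives occupy the earlier capture groups.

```python
import re

# The pattern from above, unchanged. Groups 1-3 belong to the lookbehind
# alternatives; group 4 is the run of ID characters that follows them.
YOUTUBE_ID = re.compile(
    r"((?<=(v|V)/)|(?<=be/)|(?<=(\?|\&)v=)|(?<=embed/))([\w-]+)"
)


def find_video_id(url):
    """Return the first ID-like match in url, or None (hypothetical helper)."""
    match = YOUTUBE_ID.search(url)
    return match.group(4) if match else None


for url in (
    "http://youtu.be/SA2iWivDJiE",
    "http://www.youtube.com/watch?v=_oPAwA_Udwc&feature=feedu",
    "http://www.youtube.com/embed/SA2iWivDJiE",
):
    print(find_video_id(url))
```

Because the pattern does not check the hostname, it will also match ID-like text in URLs that are not YouTube links.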

edited Sep 26 '15 at 19:06

answered Jan 06 '15 at 09:55

Alex

1

to get this to work in python, i had to correct the syntax too: `((?<=(v|V)/)|(?<=be/)|(?<=(\?|\&)v=)|(?<=embed/))([\w-]+)`. This solution ended up being the one that handled the most cases. – Gus E Aug 06 '15 at 21:10
`/((?<=(v|e|V|vi)\/)|(?<=be\/)|(?<=(\?|\&)v=)|(?<=\/u\/\d+\/)|(?<=(\?|\&)vi=)|(?<=embed\/))([\w-]+)/gi;` Compatible with most at https://gist.github.com/rodrigoborgesdeoliveira/987683cfbfcc8d800192da1e73adc486 – SilverIce May 07 '19 at 06:15

ivansaul · Answer 5 · 2021-02-24T23:11:24.220

5

I use the pytube package:

$ pip install pytube

#Examples
url1='http://youtu.be/SA2iWivDJiE'
url2='http://www.youtube.com/watch?v=_oPAwA_Udwc&feature=feedu'
url3='http://www.youtube.com/embed/SA2iWivDJiE'
url4='http://www.youtube.com/v/SA2iWivDJiE?version=3&amp;hl=en_US'
url5='https://www.youtube.com/watch?v=rTHlyTphWP0&index=6&list=PLjeDyYvG6-40qawYNR4juzvSOg-ezZ2a6'
url6='youtube.com/watch?v=_lOT2p_FCvA'
url7='youtu.be/watch?v=_lOT2p_FCvA'
url8='https://www.youtube.com/watch?time_continue=9&v=n0g-Y0oo5Qs&feature=emb_logo'

urls=[url1,url2,url3,url4,url5,url6,url7,url8]

#Get youtube id
from pytube import extract
for url in urls:
    id=extract.video_id(url)
    print(id)

Output

SA2iWivDJiE
_oPAwA_Udwc
SA2iWivDJiE
SA2iWivDJiE
rTHlyTphWP0
_lOT2p_FCvA
_lOT2p_FCvA
n0g-Y0oo5Qs

edited Feb 24 '21 at 23:11

answered Feb 24 '21 at 22:36

ivansaul

this is the best answer by far, as it accounts for all url types. none of the other answers do – Mike Johnson Jr Mar 04 '22 at 03:48
It's a very good idea to use a library for this, but at the end it uses a regex, and a very simple one https://github.com/pytube/pytube/blob/master/pytube/extract.py#L118 – ruloweb Oct 20 '22 at 18:54

score 4 · Answer 6 · answered Dec 05 '10 at 00:20

import re

match = re.search(r"youtube\.com/.*v=([^&]*)", "http://www.youtube.com/watch?v=z_AbfPXTKms&test=123")
if match:
    result = match.group(1)
else:
    result = ""

Untested.

answered Dec 05 '10 at 00:20

Robin Orheden

score 3 · Answer 7 · answered Aug 30 '20 at 22:53

You can use

from urllib.parse import urlparse

url_data = urlparse("https://www.youtube.com/watch?v=RG9TMn1FJzc")
# Slicing off the leading "v=" only works when v is the first and only query parameter.
print(url_data.query[2:])

answered Aug 30 '20 at 22:53

Mouhcine MIMYA

score 2 · Answer 8 · answered Dec 05 '10 at 00:18

Here is something you could try using regex for the youtube video ID:

import re

url = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1"
# regex for the YouTube ID: "^[^v]+v=(.{11}).*"
result = re.match(r'^[^v]+v=(.{11}).*', url)
print(result.group(1))

answered Dec 05 '10 at 00:18

VKolev

This answer is from 2010, but the regex can be modified to match this pattern too. `be[/](.{11}).*` – VKolev Dec 06 '19 at 07:55

score 1 · Answer 9 · answered Dec 05 '10 at 00:00

No need for regex. Split on ?, take the second, split on =, take the second, split on &, take the first.
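
Taken literally, those three splits could be sketched as follows. The function name `id_by_splitting` is hypothetical, and the sketch assumes the URL has a query string whose first parameter is `v`.

```python
def id_by_splitting(url):
    """Hypothetical helper that follows the three splits literally."""
    query = url.split("?")[1]    # everything after the first "?"
    value = query.split("=")[1]  # text after the first "="
    return value.split("&")[0]   # drop any following parameters


print(id_by_splitting("http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1"))
```

This depends on parameter order: if another parameter comes before `v`, the function returns that parameter's value instead, and a URL without `?` raises an `IndexError`.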

answered Dec 05 '10 at 00:00

thejh

work. Do you have any idea if this method is bulletproof enough to be used without bigger worries in market-ready projects ? – decarbo Dec 05 '10 at 00:06
7

use urlparse for this. don't roll your own with string splitting or regexes. http://docs.python.org/library/urlparse.html – Corey Goldberg Dec 05 '10 at 00:09
urlparse gives query as a whole so still I need to split it to get ID – decarbo Dec 05 '10 at 01:38

score 1 · Answer 10 · answered Jul 28 '21 at 07:50

Splitting strings is a really bad idea when those parameters could come in any order. Stick with urlparse:

from urllib.parse import parse_qs, urlparse

vid = parse_qs(urlparse(url).query).get('v')
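
Note that `parse_qs` maps each key to a list of values, so `vid` above is a list (or `None` when there is no `v` parameter). A small sketch of unwrapping it; `first_video_id` is a name chosen here for illustration:

```python
from urllib.parse import parse_qs, urlparse


def first_video_id(url):
    """Return the first value of the v parameter, or None if it is absent."""
    values = parse_qs(urlparse(url).query).get("v")
    return values[0] if values else None


print(first_video_id("https://www.youtube.com/watch?time_continue=9&v=n0g-Y0oo5Qs&feature=emb_logo"))
```

This only covers `watch?v=...` style links; short `youtu.be` or `/embed/` links carry the ID in the path, so the function returns `None` for them.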

answered Jul 28 '21 at 07:50

Rich - enzedonline

score 0 · Answer 11 · edited Mar 01 '20 at 14:52

This takes a search query rather than a URL, but it does give you the ID:

from youtube_search import YoutubeSearch    
results = YoutubeSearch('search terms', max_results=10).to_json()    
print(results)

edited Mar 01 '20 at 14:52

Michel_T.

answered Mar 01 '20 at 14:13

Tanishq Vyas

score 0 · Answer 12 · edited Aug 31 '20 at 02:29

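url = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1"
parsed = url.split("?")
# parsed[1] is the whole query string, "v=z_AbfPXTKms&NR=1", not just the ID
videoId = parsed[1]
print(videoId)

This only works for links of the form watch?v=..., and it prints the full query string, so the "v=" prefix and any extra parameters still have to be stripped.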

edited Aug 31 '20 at 02:29

Adrian Mole

answered Aug 30 '20 at 23:02

imcupcakegamer

Anuj Kumar · Answer 13 · 2021-07-03T14:26:44.027

I am very late, but I use this snippet to get the video id.

import re

def video_id(url: str) -> str:
    """Extract the ``video_id`` from a YouTube url.
    This function supports the following patterns:
    - :samp:`https://youtube.com/watch?v={video_id}`
    - :samp:`https://youtube.com/embed/{video_id}`
    - :samp:`https://youtu.be/{video_id}`
    :param str url:
        A YouTube url containing a video id.
    :rtype: str
    :returns:
        YouTube video id.
    """
    return regex_search(r"(?:v=|\/)([0-9A-Za-z_-]{11}).*", url, group=1)

def regex_search(pattern: str, string: str, group: int):
    """Shortcut method to search a string for a given pattern.
    :param str pattern:
        A regular expression pattern.
    :param str string:
        A target string to search.
    :param int group:
        Index of group to return.
    :rtype:
        str or tuple
    :returns:
        Substring pattern matches.
    """
    regex = re.compile(pattern)
    results = regex.search(string)
    if not results:
        return False

    return results.group(group)

score 0 · Answer 14 · answered Mar 02 '22 at 09:44

I use this

def getId(videourl):
    vidid=videourl.find('watch?v=')
    Id = videourl[vidid+8:vidid+19]
    if vidid==-1:
        vidid=videourl.find('be/')
        Id=videourl[vidid+3:]
    return Id

answered Mar 02 '22 at 09:44

nopenope

Your answer could be improved with additional supporting information. Please [edit] to add further details, such as citations or documentation, so that others can confirm that your answer is correct. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Mar 02 '22 at 11:34
