33

I am coding a feature that lets users enter a YouTube video URL, and I would like to extract the video ID from these URLs.

Does the YouTube API offer a function where I pass in the link and get the video ID back, or do I have to parse the string myself?

I am using PHP. I would appreciate any pointers or code samples.

Thanks

hakre
Gabriel Spiteri
  • Related: http://stackoverflow.com/questions/3452546/javascript-regex-how-to-get-youtube-video-id-from-url – hakre Jul 02 '11 at 11:02
  • possible duplicate of [php regex - find all youtube video ids in string](http://stackoverflow.com/questions/5830387/php-regex-find-all-youtube-video-ids-in-string) – hakre Jul 02 '11 at 11:05
  • @hakre that post is related to parsing the string in JS. I am interested in using the Youtube API to extract the Video ID. The PHP related post looks interesting – Gabriel Spiteri Jul 02 '11 at 11:15
  • I don't know if the API offers such, for the PHP related post I've added some code as an answer below. – hakre Jul 02 '11 at 11:18
  • I added another code example that displays the information provided by the oembed youtube API. This might be helpful for you, but it's still no direct match for any of the youtube APIs. – hakre Jul 02 '11 at 11:32

9 Answers

81

Here is an example function that uses a regular expression to extract the youtube ID from a URL:

/**
 * get youtube video ID from URL
 *
 * @param string $url
 * @return string Youtube video id or FALSE if none found. 
 */
function youtube_id_from_url($url) {
    $pattern = 
        '%^# Match any youtube URL
        (?:https?://)?  # Optional scheme. Either http or https
        (?:www\.)?      # Optional www subdomain
        (?:             # Group host alternatives
          youtu\.be/    # Either youtu.be,
        | youtube\.com  # or youtube.com
          (?:           # Group path alternatives
            /embed/     # Either /embed/
          | /v/         # or /v/
          | /watch\?v=  # or /watch\?v=
          )             # End path alternatives.
        )               # End host alternatives.
        ([\w-]{10,12})  # Allow 10-12 for 11 char youtube id.
        $%x'
        ;
    $result = preg_match($pattern, $url, $matches);
    if ($result) {
        return $matches[1];
    }
    return false;
}

echo youtube_id_from_url('http://youtu.be/NLqAF9hrVbY'); # NLqAF9hrVbY

It's an adaptation of the answer from a similar question.


This is not exactly the API you're looking for, but it may be helpful: YouTube has an oEmbed service:

$url = 'http://youtu.be/NLqAF9hrVbY';
var_dump(json_decode(file_get_contents(sprintf('http://www.youtube.com/oembed?url=%s&format=json', urlencode($url)))));

It returns some additional metadata about the URL:

object(stdClass)#1 (13) {
  ["provider_url"]=>
  string(23) "http://www.youtube.com/"
  ["title"]=>
  string(63) "Hang Gliding: 3 Flights in 8 Days at Northside Point of the Mtn"
  ["html"]=>
  string(411) "<object width="425" height="344"><param name="movie" value="http://www.youtube.com/v/NLqAF9hrVbY?version=3"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/NLqAF9hrVbY?version=3" type="application/x-shockwave-flash" width="425" height="344" allowscriptaccess="always" allowfullscreen="true"></embed></object>"
  ["author_name"]=>
  string(11) "widgewunner"
  ["height"]=>
  int(344)
  ["thumbnail_width"]=>
  int(480)
  ["width"]=>
  int(425)
  ["version"]=>
  string(3) "1.0"
  ["author_url"]=>
  string(39) "http://www.youtube.com/user/widgewunner"
  ["provider_name"]=>
  string(7) "YouTube"
  ["thumbnail_url"]=>
  string(48) "http://i3.ytimg.com/vi/NLqAF9hrVbY/hqdefault.jpg"
  ["type"]=>
  string(5) "video"
  ["thumbnail_height"]=>
  int(360)
}

The ID is not a dedicated field of the response. However, the response might still contain the information you're looking for, and it can be useful for validating the YouTube URL.
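
Since the oEmbed response has no dedicated ID field, one option is to recover the ID from a field that happens to embed it. The following is only a minimal sketch. It assumes thumbnail_url keeps the /vi/<id>/ layout seen in the dump above, which is not a documented guarantee, and the helper name youtube_id_from_oembed is made up for illustration:

```php
<?php
/**
 * Hypothetical helper: pull the video id out of a decoded oEmbed response.
 * Assumes thumbnail_url looks like http://i3.ytimg.com/vi/<id>/hqdefault.jpg,
 * as in the sample response; YouTube does not promise this layout.
 *
 * @param object $oembed Result of json_decode() on the oEmbed JSON.
 * @return string|false  Video id, or false if it cannot be found.
 */
function youtube_id_from_oembed($oembed) {
    if (!is_object($oembed) || !isset($oembed->thumbnail_url)) {
        return false;
    }
    // Capture the path segment that follows /vi/.
    if (preg_match('%/vi/([\w-]+)/%', $oembed->thumbnail_url, $matches)) {
        return $matches[1];
    }
    return false;
}

// Offline example using the thumbnail_url from the sample response.
$sample = json_decode('{"thumbnail_url":"http://i3.ytimg.com/vi/NLqAF9hrVbY/hqdefault.jpg"}');
var_dump(youtube_id_from_oembed($sample)); // string(11) "NLqAF9hrVbY"
```

In real use, the argument would be the decoded result of the oEmbed request shown earlier. Parsing the original URL directly is still the more dependable route, because it does not rely on the thumbnail layout.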

hakre
  • 1
    Thanks hakre this worked :) ... just a small adjustment needed since there is a small mistake in the regex ... (\w{10-12}) should be ([\w-]{10,12}) – Gabriel Spiteri Jul 02 '11 at 11:44
  • 1
    This only works if the input URL ends immediately after the [youtube id]. If the URL contains any params (& or ?) beyond the [youtube id], the regex function returns false... Any way to check for that at the end of the pattern and ignore further characters? – atwixtor Apr 27 '12 at 18:23
  • 1
    parse_url as an alternative would work (with loops and regex to check each component), but I think I'll just use the [original version](http://stackoverflow.com/questions/5830387/php-regex-find-all-youtube-video-ids-in-string/5831191#5831191) from which you adapted, minus the `` replacement bit. Thanks! – atwixtor May 03 '12 at 22:30
  • 3
    Does not work with: http://www.youtube.com/embed/SOME-ID?autoplay=1 or http://www.youtube.com/watch?v=some-id with "&feature=youtu.be" at the end. – FooBar Mar 02 '14 at 14:38
  • Should probably just drop the $ off the end of the regex. There are a lot of params that can get tacked on after the video ID. – wizzard Jan 15 '15 at 22:04
  • When you add this on top of the function, for when it concerns an embed/-url, it will work: $url = trim(strtok("$url", '?')); $url = str_replace("#!/", "", "$url"); – KJS Dec 09 '16 at 22:43
19

I have made slight changes to the regular expression above. It works fine for YouTube short URLs (as used in the example above) and for simple video URLs where no other parameter follows the video code. It does not work for URLs like http://www.youtube.com/watch?v=B_izAKQ0WqQ&feature=related, where the video code is not the last parameter.

Likewise, v={video_code} does not always come directly after watch? (the regular expression above assumes it does). For example, if the user has selected a language or location from the footer, such as English (UK) from the Language option, the URL becomes http://www.youtube.com/watch?feature=related&hl=en-GB&v=B_izAKQ0WqQ

So I have modified the regular expression above. Credit goes to hakre for providing the base regular expression, thanks @hakre:

function youtube_id_from_url($url) {
    $pattern =
        '%^# Match any youtube URL
        (?:https?://)?  # Optional scheme. Either http or https
        (?:www\.)?      # Optional www subdomain
        (?:             # Group host alternatives
          youtu\.be/    # Either youtu.be,
        | youtube\.com  # or youtube.com
          (?:           # Group path alternatives
            /embed/     # Either /embed/
          | /v/         # or /v/
          | .*v=        # or anything up to a v= query parameter
          )             # End path alternatives.
        )               # End host alternatives.
        ([\w-]{10,12})  # Allow 10-12 for 11 char youtube id.
        ($|&).*         # Allow additional query parameters after the video id.
        $%x'
        ;
    $result = preg_match($pattern, $url, $matches);
    // preg_match() returns 0 when nothing matches, so test for a truthy
    // result rather than comparing against false.
    if ($result) {
        return $matches[1];
    }
    return false;
}
Sabeeh Chaudhry
  • 1
    Excellent, but this needs an additional trim() to be perfect! People tend to post stuff like that into forms with whitespaces in the end (or before). – Sliq Jan 03 '14 at 09:45
  • The "Group host alternatives" group should be closed with ")" before the "Group path alternatives" group opens. Otherwise, with youtu\.be|youtube\.com|youtube\.de, the path alternatives apply only to the last host. – user706420 Apr 12 '17 at 15:22
  • Looks like the hash in this URL is not getting matched: https://www.youtube.com/watch?v=-np5iMKQaDw – Mike May 06 '21 at 17:00
10

You can use the PHP function parse_url to extract host name, path, query string and the fragment. You can then use PHP string functions to locate the video id.

function getYouTubeVideoId($url)
{
    $video_id = false;
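    $url = parse_url($url);
    if ($url === false || !isset($url['host']))
    {
        // Malformed URL or no host component (e.g. a scheme-less string): nothing to inspect.
        return $video_id;
    }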
    if (strcasecmp($url['host'], 'youtu.be') === 0)
    {
        #### (dontcare)://youtu.be/<video id>
        $video_id = substr($url['path'], 1);
    }
    elseif (strcasecmp($url['host'], 'www.youtube.com') === 0)
    {
        if (isset($url['query']))
        {
            parse_str($url['query'], $url['query']);
            if (isset($url['query']['v']))
            {
                #### (dontcare)://www.youtube.com/(dontcare)?v=<video id>
                $video_id = $url['query']['v'];
            }
        }
        if ($video_id == false)
        {
            $url['path'] = explode('/', substr($url['path'], 1));
            if (in_array($url['path'][0], array('e', 'embed', 'v')))
            {
                #### (dontcare)://www.youtube.com/(whitelist)/<video id>
                $video_id = $url['path'][1];
            }
        }
    }
    return $video_id;
}
$urls = array(
    'http://youtu.be/dQw4w9WgXcQ',
    'http://www.youtube.com/?v=dQw4w9WgXcQ',
    'http://www.youtube.com/?v=dQw4w9WgXcQ&feature=player_embedded',
    'http://www.youtube.com/watch?v=dQw4w9WgXcQ',
    'http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=player_embedded',
    'http://www.youtube.com/v/dQw4w9WgXcQ',
    'http://www.youtube.com/e/dQw4w9WgXcQ',
    'http://www.youtube.com/embed/dQw4w9WgXcQ'
);
foreach ($urls as $url)
{
    echo sprintf('%s -> %s' . PHP_EOL, $url, getYouTubeVideoId($url));
}
Salman A
1

It can be as simple as: return substr(strstr($url, 'v='), 2, 11);
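
The one-liner above assumes that v= is present in the URL and that the ID is exactly 11 characters long. A slightly more defensive sketch of the same idea follows. It swaps substr/strstr for preg_match and assumes the ID is the run of word characters and hyphens directly after the v parameter. The function name youtube_id_after_v is hypothetical:

```php
<?php
/**
 * Hypothetical variant of the one-liner: take whatever follows "v=" up to the
 * next character that cannot be part of an id, instead of a fixed 11 characters.
 *
 * @param string $url
 * @return string|false Video id, or false when no v= parameter is present.
 */
function youtube_id_after_v($url) {
    // (?:^|[?&]) makes sure the parameter is named "v", not e.g. "rev".
    if (preg_match('/(?:^|[?&])v=([\w-]+)/', $url, $matches)) {
        return $matches[1];
    }
    return false;
}

var_dump(youtube_id_after_v('http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=player_embedded'));
// string(11) "dQw4w9WgXcQ"
var_dump(youtube_id_after_v('http://www.youtube.com/embed/dQw4w9WgXcQ'));
// bool(false)
```

Like the one-liner, this only covers URLs that carry the ID in a v query parameter; youtu.be and /embed/ links need one of the other approaches on this page.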

  • Relevant to this answer: [Are YouTube codes guaranteed to always be 11 characters?](http://webapps.stackexchange.com/q/13854/35021) –  Oct 21 '14 at 11:09
  • in the link it says yes –  Dec 15 '14 at 10:24
  • Yes, that is the general consensus at the link in my earlier comment, but unless Google itself commits to it in writing, we should not build apps on that assumption. Since your answer relies on it, I thought readers should be aware of all scenarios. –  Dec 15 '14 at 12:32
1

I know this is a very late answer, but I found this thread while searching for the topic, so I want to suggest a more elegant way of doing this using oEmbed:

echo get_embed('youtube', 'https://www.youtube.com/watch?v=IdxKPCv0bSs');

function get_embed($provider, $url, $max_width = '', $max_height = ''){
    $providers = array(
        'youtube' => 'http://www.youtube.com/oembed'
        /* you can add support for more providers here */
    );

    if(!isset($providers[$provider])){
        return 'Invalid provider!';
    }

    $movie_data_json = @file_get_contents(
        $providers[$provider] . '?url=' . urlencode($url) . 
        "&maxwidth={$max_width}&maxheight={$max_height}&format=json"
    );

    if(!$movie_data_json){
        $error = error_get_last();
        /* remove the PHP stuff from the error and show only the HTTP error message */
        $error_message = preg_replace('/.*: (.*)/', '$1', $error['message']);
        return $error_message;
    }else{
        $movie_data = json_decode($movie_data_json, true);
        return $movie_data['html'];
    }
}

oEmbed makes it possible to embed content from more sites by just adding their oEmbed API endpoint to the $providers array in the above code.

A. Genedy
1

Here is a simple solution that has worked for me.

The video ID is the longest word in any type of YouTube URL. It consists of word characters and "-" ([\w-]), is more than 8 characters long, and is surrounded by non-word characters. So you can search the URL for the regex below and take the first capture group of the first match as your answer. It has to be the first match because some YouTube parameters, such as enablejsapi, are also longer than 8 characters, but they always come after the video ID.

Regex: "\W([\w-]{9,})(\W|$)"

Here is the working Java code (it needs java.util.regex.Matcher and java.util.regex.Pattern):

String[] youtubeUrls = {
    "https://www.youtube.com/watch?v=UzRtrjyDwx0",
    "https://youtu.be/6butf1tEVKs?t=22s",
    "https://youtu.be/R46-XgqXkzE?t=2m52s",
    "http://youtu.be/dQw4w9WgXcQ",
    "http://www.youtube.com/?v=dQw4w9WgXcQ",
    "http://www.youtube.com/?v=dQw4w9WgXcQ&feature=player_embedded",
    "http://www.youtube.com/watch?v=dQw4w9WgXcQ",
    "http://www.youtube.com/watch?v=dQw4w9WgXcQ&feature=player_embedded",
    "http://www.youtube.com/v/dQw4w9WgXcQ",
    "http://www.youtube.com/e/dQw4w9WgXcQ",
    "http://www.youtube.com/embed/dQw4w9WgXcQ"
};

String pattern = "\\W([\\w-]{9,})(\\W|$)";
Pattern pattern2 = Pattern.compile(pattern);

for (int i=0; i<youtubeUrls.length; i++){
    Matcher matcher2 = pattern2.matcher(youtubeUrls[i]);
    if (matcher2.find()){
        System.out.println(matcher2.group(1));
    }
    else System.out.println("Not found");
}
dsbajna
1

As mentioned in a comment under the accepted answer, we use it like this, and it works just fine:

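function youtube_id_from_url($url) {

    $url = trim($url);
    $url = str_replace("#!/", "", $url);
    // Cut off trailing parameters so the end anchor of the pattern can match.
    // In watch URLs the id is part of the query string, so cut at the first "&";
    // otherwise (e.g. /embed/ID?autoplay=1) drop the whole query string.
    if (strpos($url, '/watch?v=') !== false) {
        $url = strtok($url, '&');
    } else {
        $url = strtok($url, '?');
    }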

    $pattern = 
        '%^# Match any youtube URL
        (?:https?://)?  # Optional scheme. Either http or https
        (?:www\.)?      # Optional www subdomain
        (?:             # Group host alternatives
          youtu\.be/    # Either youtu.be,
        | youtube\.com  # or youtube.com
          (?:           # Group path alternatives
            /embed/     # Either /embed/
          | /v/         # or /v/
          | /watch\?v=  # or /watch\?v=
          )             # End path alternatives.
        )               # End host alternatives.
        ([\w-]{10,12})  # Allow 10-12 for 11 char youtube id.
        $%x'
        ;
    $result = preg_match($pattern, $url, $matches);
    if ($result) {
        return $matches[1];
    }
    return false;
}
KJS
0

How about this one:

function getVideoId() {
    // Assumes this is a method of a class that stores the URL in $this->url.
    $query = parse_url($this->url, PHP_URL_QUERY);
    if (!is_string($query)) {
        // No query string (null) or a malformed URL (false).
        return false;
    }

    // Split the query string into name => value pairs so that "v" is found
    // wherever it appears among the parameters.
    parse_str($query, $params);

    if (isset($params['v']) && is_string($params['v']) && $params['v'] !== '') {
        return $params['v'];
    }
    return false;
}

No regex, and it supports multiple query parameters, e.g. https://www.youtube.com/watch?v=PEQxWg92Ux4&index=9&list=RDMMom0RGEnWIEk also works.

Khanh Tran
0

For Java developers

This works for me and also supports youtube-nocookie URLs:

    private static final Pattern youtubeId = Pattern.compile("^(?:https?\\:\\/\\/)?.*(?:youtu.be\\/|vi?\\/|vi?=|u\\/\\w\\/|embed\\/|(watch)?vi?=)([^#&?]*).*$");


    @VisibleForTesting
    String getVideoId(final String url) {
        final Matcher matcher = youtubeId.matcher(url);
        if(matcher.find()){
            return matcher.group(2);
        }
        return "";
    }

A parameterized test to check various YouTube URLs:

    @ParameterizedTest
    @MethodSource("youtubeTestUrls")
    void videoIdFromUrlTest(final String url, final String videoId) {

        final String matchedVidID = this.youtubeService.getVideoId(url);

        assertEquals(videoId, matchedVidID);
    }

    private static Stream<Arguments> youtubeTestUrls() {
        return Stream.of(
                Arguments.of("www.youtube-nocookie.com/embed/dQw4-9W_XcQ?rel=0", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/user/Scobleizer#p/u/1/dQw4-9W_XcQ", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/watch?v=dQw4-9W_XcQ&feature=channel", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/watch?v=dQw4-9W_XcQ&playnext_from=TL&videos=osPknwzXEas&feature=sub", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/ytscreeningroom?v=dQw4-9W_XcQ", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/user/SilkRoadTheatre#p/a/u/2/dQw4-9W_XcQ", "dQw4-9W_XcQ"),
                Arguments.of("http://youtu.be/dQw4-9W_XcQ", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/watch?v=dQw4-9W_XcQ&feature=youtu.be", "dQw4-9W_XcQ"),
                Arguments.of("http://youtu.be/dQw4-9W_XcQ", "dQw4-9W_XcQ"),
                Arguments.of("https://www.youtube.com/user/Scobleizer#p/u/1/dQw4-9W_XcQ?rel=0", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/watch?v=dQw4-9W_XcQ&playnext_from=TL&videos=dQw4-9W_XcQ&feature=sub", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/ytscreeningroom?v=dQw4-9W_XcQ", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/embed/dQw4-9W_XcQ?rel=0", "dQw4-9W_XcQ"),
                Arguments.of("https://www.youtube.com/watch?v=dQw4-9W_XcQ", "dQw4-9W_XcQ"),
                Arguments.of("http://youtube.com/v/dQw4-9W_XcQ?feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("http://youtube.com/vi/dQw4-9W_XcQ?feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("http://youtube.com/?v=dQw4-9W_XcQ&feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("http://www.youtube.com/watch?v=dQw4-9W_XcQ&feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("http://youtube.com/?vi=dQw4-9W_XcQ&feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("https://youtube.com/watch?v=dQw4-9W_XcQ&feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("http://youtube.com/watch?vi=dQw4-9W_XcQ&feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("http://youtu.be/dQw4-9W_XcQ?feature=youtube_gdata_player", "dQw4-9W_XcQ"),
                Arguments.of("https://www.youtube.com/watch?v=yYw2Q141thM&list=PLOwEeBApnYoUFioRitjwz-DREzFGOSgiE&index=2", "yYw2Q141thM"),
                Arguments.of("https://www.youtube.com/watch?", "")
        );
    }
BarbetNL