I am attempting to parse the video ID out of a YouTube URL using preg_match. I found a regular expression on this site that appears to work:

(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+

As shown in this pic:

[Screenshot: the pattern being tested in RegexBuddy]

My PHP is as follows, but it doesn't work (it gives an "Unknown modifier '['" error):

<?
 $subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1";

 preg_match("(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+", $subject, $matches);

 print "<pre>";
 print_r($matches);
 print "</pre>";

?>

Cheers

hakre
J.C
  • In your RegexBuddy, you have Java as the selected language. There is also a Use tab that you can click on that will give you properly escaped code to use for a number of different situations. – Benjam Jul 01 '11 at 03:25
  • See as well: [php regex - find all youtube video ids in string](http://stackoverflow.com/q/5830387/367456) – hakre Jun 02 '12 at 12:33
  • Related: http://stackoverflow.com/questions/5830387/how-to-find-all-youtube-video-ids-in-a-string-using-a-regex/5831191#5831191 – Timo Huovinen Mar 12 '14 at 11:34
  • Because the other question has a better answer, well explained. – Toto Feb 08 '20 at 17:27
  • @Toto it also fails to match in some cases, if you see latest comments - so not exactly the better answer – J.C Feb 08 '20 at 19:49

10 Answers

This regex grabs the ID from all of the various URL formats I could find. There may be more out there, but I couldn't find any reference to them. If you come across a URL that this doesn't match, please leave a comment with the URL, and I'll try to update the regex to match it.

if (preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/\s]{11})%i', $url, $match)) {
    $video_id = $match[1];
}

Here is a sample of the URLs this regex matches: (there can be more content after the given URL that will be ignored)

It also works on youtube-nocookie.com URLs with the same options as above.

It will also pull the ID from the URL in an embed code (both iframe and object tags)
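
As a rough illustration, the pattern can be wrapped in a small helper and run against a few URL shapes. The function name `extract_youtube_id` is made up for this sketch, and the `youtu.be` and `youtube-nocookie.com` sample URLs are constructed by hand around the ID from the question rather than copied from a real page. It assumes PHP 7.1 or later for the nullable return type.

```php
<?php
// Hypothetical wrapper around the regex from this answer.
// Returns the 11-character ID, or null when the pattern does not match.
function extract_youtube_id(string $url): ?string
{
    $pattern = '%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/\s]{11})%i';
    return preg_match($pattern, $url, $match) === 1 ? $match[1] : null;
}

$samples = [
    'http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1',     // from the question
    'http://www.youtube.com/v/z_AbfPXTKms&hl=en_GB&fs=1&', // from a comment below
    'http://youtu.be/z_AbfPXTKms',                         // assumed short-link form
    'https://www.youtube-nocookie.com/embed/z_AbfPXTKms',  // assumed embed form
];

foreach ($samples as $url) {
    echo extract_youtube_id($url), "\n"; // prints z_AbfPXTKms for each sample
}
```

Returning null instead of leaving `$video_id` unset makes the "no match" case explicit for the caller.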

Benjam
  • I am using the expression provided above and I always get the ending /iframe> in the video ID. – Jason Yost Jan 01 '12 at 11:24
  • Can you give a link to a pastebin example of what exactly you are doing? Or create a question here on SO and link to it here? – Benjam Jan 02 '12 at 05:06
  • Again... do you have a code sample? Are you using it correctly? I just tested it with your URL, and it returned an array and in `$match[1]` was `'9ofSV-ATEB0'`, which IS the id. – Benjam Jan 27 '12 at 04:00
  • @Benjam do you have a preg_match for vimeo that is like this one! This is a great regX +1. Thanks! – Juan Gonzales Apr 15 '12 at 17:03
  • I've got better results for **/?v=** and **iframe/object** variants when moving the username/... check (`[^/]+/.+/`) to the back: `%(?:youtube(?:-nocookie)?\.com/(?:(?:v|e(?:mbed)?)/|.*[?&]v=|[^/]+/.+/)|youtu\.be/)([^"&?/ ]{11})%i`. For other variants it remains the same. – Lode Feb 05 '13 at 12:05
  • Maybe use `[a-z0-9-_]{11}` instead of `[^"&?/ ]{11}`? – Modder Sep 28 '14 at 12:04
  • @Modder, because there are more characters than `[a-z0-9-_]` (which should be `[a-z0-9_-]`, btw) allowed in YouTube IDs. And because I don't have a definitive list of what those characters are, I instead looked for anything that I knew it couldn't be, namely `[^"&?/ ]`. I realize that even the blacklisted character list is incomplete, but, supplied with a valid YouTube URL, it works. – Benjam Sep 29 '14 at 15:11
  • @NimitzE. - Should work out of the box. It's only looking for the `youtube.com` portion, and doesn't care about `www.` or `m.`. – Benjam Jul 28 '15 at 16:33
  • @Benjam, I am new to these. Can you please explain the regular expression to me? What are %i, ? and the others for? – optimus prime Oct 10 '15 at 11:30
  • Regular Expressions are far too complex for a lesson in a comment field. I recommend reading http://www.regular-expressions.info/ – Benjam Oct 14 '15 at 18:08

Better use parse_url and parse_str to parse the URL and query string:

$subject = "http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1";
$url = parse_url($subject);
parse_str($url['query'], $query);
var_dump($query);
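
A minimal sketch of the further case differentiation mentioned in the comment below, assuming only three URL shapes matter: `watch?v=ID`, `/v/ID` or `/embed/ID`, and `youtu.be/ID`. The function name `youtube_id_from_url` and the host and path rules are assumptions made for illustration, not an exhaustive list of YouTube URL formats. It assumes PHP 7.1 or later.

```php
<?php
// Hypothetical helper built on parse_url/parse_str; the rules below are assumptions.
function youtube_id_from_url(string $url): ?string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        return null; // malformed URL, or no scheme/host present
    }
    $host = strtolower($parts['host']);
    $path = $parts['path'] ?? '';

    // Short links: the ID is the first path segment.
    if ($host === 'youtu.be') {
        $id = explode('/', ltrim($path, '/'))[0];
        return $id !== '' ? $id : null;
    }

    // Accept youtube.com and youtube-nocookie.com, with any subdomain.
    if (!preg_match('~(^|\.)youtube(-nocookie)?\.com$~', $host)) {
        return null;
    }

    // watch URLs: the ID is the v query parameter.
    if (isset($parts['query'])) {
        parse_str($parts['query'], $query);
        if (isset($query['v']) && is_string($query['v']) && $query['v'] !== '') {
            return $query['v'];
        }
    }

    // /v/ID and /embed/ID: the ID is in the path, possibly followed by &params.
    if (preg_match('~^/(?:v|embed)/([^/&?]+)~', $path, $m)) {
        return $m[1];
    }
    return null;
}

echo youtube_id_from_url('http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1'), "\n"; // z_AbfPXTKms
```

The trade-off against a single regex is more code, but each URL shape is handled by an explicit, readable branch.
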
Gumbo
  • @Webbo: `parse_url` returns an array of the URL parts, so the URL path is also in there. You need to do some further case differentiation to what type the URL is. – Gumbo May 29 '10 at 20:36
  • I would rather use a regex to do it all in one – J.C May 29 '10 at 20:40

I had to deal with this for a PHP class I wrote a few weeks ago and ended up with a regex that matches any kind of string: with or without a URL scheme, with or without the www subdomain, youtube.com URLs, youtu.be URLs, and any ordering of the query parameters. You can check it out at GitHub or simply copy and paste the code block below:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <eyecatchup@gmail.com>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */        
function parse_yturl($url) 
{
    $pattern = '#^(?:https?://)?(?:www\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/v/|/watch\?v=|/watch\?.+&v=))([\w-]{11})(?:.+)?$#x';
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}

To explain the regex, here's a split-up version:

/**
 *  Check if input string is a valid YouTube URL
 *  and try to extract the YouTube Video ID from it.
 *  @author  Stephan Schmitz <eyecatchup@gmail.com>
 *  @param   $url   string   The string that shall be checked.
 *  @return  mixed           Returns YouTube Video ID, or (boolean) false.
 */        
function parse_yturl($url) 
{
    $pattern = '#^(?:https?://)?';    # Optional URL scheme. Either http or https.
    $pattern .= '(?:www\.)?';         #  Optional www subdomain.
    $pattern .= '(?:';                #  Group host alternatives:
    $pattern .=   'youtu\.be/';       #    Either youtu.be,
    $pattern .=   '|youtube\.com';    #    or youtube.com
    $pattern .=   '(?:';              #    Group path alternatives:
    $pattern .=     '/embed/';        #      Either /embed/,
    $pattern .=     '|/v/';           #      or /v/,
    $pattern .=     '|/watch\?v=';    #      or /watch?v=,    
    $pattern .=     '|/watch\?.+&v='; #      or /watch?other_param&v=
    $pattern .=   ')';                #    End path alternatives.
    $pattern .= ')';                  #  End host alternatives.
    $pattern .= '([\w-]{11})';        # 11 characters (Length of Youtube video ids).
    $pattern .= '(?:.+)?$#x';         # Optional other ending URL parameters.
    preg_match($pattern, $url, $matches);
    return (isset($matches[1])) ? $matches[1] : false;
}
eyecatchUp
  • Please don't post your answer multiple times. Instead flag as duplicates or add a comment saying that there's an answer on another question if they're not exact duplicates but it's still relevant. – Flexo May 10 '12 at 07:51
  • @awoodland: No prob and thanks for pointing me to the possibility to flag questions as duplicates. – eyecatchUp May 10 '12 at 10:59
  • Added shorts url too here... <3 function parse_yturl($url) { $pattern = '#^(?:https?://)?(?:www\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/shorts/|/v/|/watch\?v=|/watch\?.+&v=))([\w-]{11})(?:.+)?$#x'; preg_match($pattern, $url, $matches); return (isset($matches[1])) ? $matches[1] : false; } – yjs Aug 27 '22 at 06:31

I refined the regex from the top-voted answer. It still grabs the ID from all of the various URL formats, but it is stricter about what counts as an ID.

if (preg_match('%(?:youtube(?:-nocookie)?\.com/(?:[\w\-?&!#=,;]+/[\w\-?&!#=/,;]+/|(?:v|e(?:mbed)?)/|[\w\-?&!#=,;]*[?&]v=)|youtu\.be/)([\w-]{11})(?:[^\w-]|\Z)%i', $url, $match)) {
    $video_id = $match[1];
}

It also correctly rejects invalid IDs that are longer than 11 characters, for example:

http://www.youtube.com/watch?v=0zM3nApSvMgDw3qlxF
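
A quick way to check this is to run both patterns against the overlong URL. This is only a sketch: `$loose` holds the pattern from the top-voted answer, `$strict` holds the pattern from this answer, and the variable names are placeholders.

```php
<?php
// Pattern from the top-voted answer: takes any 11 allowed characters after v=.
$loose  = '%(?:youtube(?:-nocookie)?\.com/(?:[^/]+/.+/|(?:v|e(?:mbed)?)/|.*[?&]v=)|youtu\.be/)([^"&?/\s]{11})%i';
// Pattern from this answer: the 11 characters must be followed by a non-ID character or the end.
$strict = '%(?:youtube(?:-nocookie)?\.com/(?:[\w\-?&!#=,;]+/[\w\-?&!#=/,;]+/|(?:v|e(?:mbed)?)/|[\w\-?&!#=,;]*[?&]v=)|youtu\.be/)([\w-]{11})(?:[^\w-]|\Z)%i';

$tooLong = 'http://www.youtube.com/watch?v=0zM3nApSvMgDw3qlxF';
$valid   = 'http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1';

var_dump(preg_match($loose, $tooLong, $m));  // int(1): captures the first 11 characters
var_dump(preg_match($strict, $tooLong, $m)); // int(0): the overlong ID is rejected
var_dump(preg_match($strict, $valid, $m));   // int(1): $m[1] is z_AbfPXTKms
```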

Modder

Use

 preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+#", $subject, $matches);
Dogbert
  • This works but I just tested it with the URL http://www.youtube.com/v/z_AbfPXTKms&hl=en_GB&fs=1& and it fails, could you modify it to work with that format too? – J.C May 29 '10 at 20:33
  • I am accepting this answer as it does answer my original question. I am now working on modifying it to work with the URL http://www.youtube.com/v/z_AbfPXTKms&hl=en_GB&fs=1& – J.C May 31 '10 at 14:28

This parses the start parameter for a BBCode tag (https://developers.google.com/youtube/player_parameters#start).

example: [yt]http://www.youtube.com/watch?v=G059ou-7wmo#t=58[/yt]

PHP regex:

'#\[yt\]https?://(?:[0-9A-Z-]+\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/v/|/watch\?v=|/ytscreeningroom\?v=|/feeds/api/videos/|/user\S*[^\w\-\s]|\S*[^\w\-\s]))([\w\-]{11})[?=\#&+%\w-]*(t=(\d+))?\[/yt\]#Uim'

replace:

'<iframe id="ytplayer" type="text/html" width="639" height="360" src="http://www.youtube.com/embed/$1?rel=0&vq=hd1080&start=$3" frameborder="0" allowfullscreen></iframe>'
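
A minimal sketch of wiring the pattern and the replacement together with `preg_replace`. The `$post` text and the variable names are made up for illustration. Note that the `#` inside the character class is escaped as `\#` here, because `#` is also the pattern delimiter and PHP ends the pattern at the first unescaped delimiter.

```php
<?php
// Pattern and replacement from this answer, with the inner # escaped.
$pattern = '#\[yt\]https?://(?:[0-9A-Z-]+\.)?(?:youtu\.be/|youtube\.com(?:/embed/|/v/|/watch\?v=|/ytscreeningroom\?v=|/feeds/api/videos/|/user\S*[^\w\-\s]|\S*[^\w\-\s]))([\w\-]{11})[?=\#&+%\w-]*(t=(\d+))?\[/yt\]#Uim';
$replacement = '<iframe id="ytplayer" type="text/html" width="639" height="360" src="http://www.youtube.com/embed/$1?rel=0&vq=hd1080&start=$3" frameborder="0" allowfullscreen></iframe>';

// Hypothetical forum post containing the example tag.
$post = 'Watch this: [yt]http://www.youtube.com/watch?v=G059ou-7wmo#t=58[/yt]';

// $1 is the video ID and $3 is the number of seconds from t=.
$html = preg_replace($pattern, $replacement, $post);
echo $html, "\n";
```

When the tag has no `t=` part, `$3` is empty and the resulting `start=` parameter is left without a value.
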
Alan Moore
Fixer

I didn't see anyone directly address the PHP error, so I'll try to explain.

The reason for the "Unknown modifier '['" error is that you forgot to wrap your regex in delimiters. PHP just takes the first character as a delimiter, so long as it's a non-alphanumeric, non-whitespace ASCII character. So in your regex:

preg_match("(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+", $subject, $matches);

PHP thinks you meant ( as an opening delimiter. It then finds what it thinks is your closing delimiter, the next ) and assumes what follows are pattern modifiers. However it finds that your first pattern modifier, the next character after the first ), is [. [ is obviously not a valid pattern modifier, which is why you get the error that you do.

The solution is to simply wrap your regex in delimiters and make sure any delimiters within the regex that you want to match literally are escaped. I like to use ~ as the delimiter, because you rarely need to match a literal ~ in a regex.
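
For instance, the call from the question could be written with ~ delimiters roughly like this (a sketch that reuses the question's pattern unchanged apart from the delimiters):

```php
<?php
$subject = 'http://www.youtube.com/watch?v=z_AbfPXTKms&NR=1';

// Same pattern as in the question, now wrapped in ~ delimiters.
// The / inside the pattern needs no escaping because it is not the delimiter.
$pattern = '~(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]/)[^&\n]+|(?<=v=)[^&\n]+~';

if (preg_match($pattern, $subject, $matches)) {
    echo $matches[0], "\n"; // z_AbfPXTKms
}
```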

m4olivei

You forgot to escape the slash character. So this one should do the job:

preg_match("#(?<=v=)[a-zA-Z0-9-]+(?=&)|(?<=[0-9]\/)[^&\n]+|(?<=v=)[^&\n]+#", $subject, $matches);
  • There is no need to escape the slash if you use a delimiter other than a slash, such as `#`, at the beginning and end of the regex. – Modder Oct 03 '14 at 11:34

This worked for me:

$yout_url='http://www.youtube.com/watch?v=yxYjeNZvICk&blabla=blabla';

$videoid = preg_replace("#[&\?].+$#", "", preg_replace("#http://(?:www\.)?youtu\.?be(?:\.com)?/(embed/|watch\?v=|\?v=|v/|e/|.+/|watch.*v=|)#i", "", $yout_url));
T.Todua

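Use the code below:

$url = ""; // URL of the YouTube video goes here
$videoId = getPatternFromUrl($url); // returns the video ID, or null if there is no v= parameter

function getPatternFromUrl($url)
{
    // Append '&' so the lazy match also terminates when v= is the last parameter.
    $url = $url . '&';
    $pattern = '/v=(.+?)&+/';
    if (preg_match($pattern, $url, $matches)) {
        return $matches[1];
    }
    return null; // no v= parameter found
}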
xkeshav