
The following code extracts YouTube video IDs from URLs in order to build a thumbnail image URL.

What is the reasoning behind the first regular expression and what is it doing exactly? It appears to be returning at least two results. Also, could the two be combined?

else if(url.match("youtube.com/")){

    var vid;
    var results;

    //http://www.youtube.com/watch?v=GItD10Joaa0
    results = url.match("[\\?&]v=([^&#]*)");

    vid = ( results === null ) ? url : results[1];

    return "http://img.youtube.com/vi/"+vid+"/2.jpg";
} else if( url.match("youtu.be/") ) {

    var vid;
    var results;

    // http://youtu.be/5uxd-521uus?hd=1
    // results = url.match("[^http://youtu.be/](.*)[^?hd=1]");
    // Corrected
    results = url.match("^http://youtu.be/(.*)(?=hd=1)");

    //alert(results[0]);
    vid = ( results === null ) ? url : results[0];

    return "http://img.youtube.com/vi/"+vid+"/2.jpg";
}
James P.

4 Answers

"[\\?&]v=([^&#]*)"

explained (after reduction from a JavaScript string to a regex):

[\?&]   # Match a ? or a & (the backslash is unnecessary here!)
v=      # Match the literal text "v="
(       # Capture the following into backreference no. 1:
 [^&#]* # Zero or more characters except & or #
)       # End of capturing group.
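The "at least two results" mentioned in the question come from how match() reports a non-global regex: index 0 holds the whole match and index 1 holds the first capturing group. A minimal sketch, assuming the sample URL from the question (the variable names are illustrative only):

```javascript
// Sample URL taken from the comment in the question's code.
const watchUrl = "http://www.youtube.com/watch?v=GItD10Joaa0";

// Regex literal equivalent of the string "[\\?&]v=([^&#]*)";
// the backslash is dropped because ? needs no escaping inside [...].
const watchMatch = watchUrl.match(/[?&]v=([^&#]*)/);

console.log(watchMatch[0]); // whole match: "?v=GItD10Joaa0"
console.log(watchMatch[1]); // capturing group 1: "GItD10Joaa0"
```

This is why the original code reads results[1] rather than results[0] to get the video ID.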

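The second regex, [^http://youtu.be/](.*)[^?hd=1], is very wrong: square brackets define a character class, so [^http://youtu.be/] matches a single character that is not one of h, t, p, :, /, y, o, u, ., b or e, rather than the literal text http://youtu.be/.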

It probably should read

"^http://youtu.be/(.*)(?=hd=1)"
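To see what this pattern actually captures, here is a small sketch using the sample URL from the question. Note that group 1 still ends with the "?" that precedes hd=1. The second pattern, which stops at the first ?, & or #, is an illustrative assumption and not part of the original code.

```javascript
// Sample URL taken from the comment in the question's code.
const shortUrl = "http://youtu.be/5uxd-521uus?hd=1";

// A string pattern passed to match() is compiled as if by new RegExp().
const lookahead = shortUrl.match("^http://youtu.be/(.*)(?=hd=1)");
console.log(lookahead[0]); // "http://youtu.be/5uxd-521uus?"
console.log(lookahead[1]); // "5uxd-521uus?"

// Hypothetical alternative: capture everything up to the first ?, & or #.
const shortMatch = shortUrl.match(/youtu\.be\/([^?&#]+)/);
console.log(shortMatch[1]); // "5uxd-521uus"
```

The alternative does not depend on hd=1 being present, so it would also handle a bare http://youtu.be/5uxd-521uus link.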
Tim Pietzcker
  • @alex: Of course, I just slapped my forehead and corrected it. – Tim Pietzcker Apr 19 '11 at 10:20
  • Wouldn't two backslashes match a literal backslash? I don't know why it is there in the regex though (I deleted original comment because SO made a mess of it :P). Do they expect `youtube.com\?v=abc` ? Weird. – alex Apr 19 '11 at 10:22
  • @alex: I don't think so - he's using a string, not a regex object to construct the regex; therefore the two backslashes are reduced to one by the string processor, and the regex engine gets only one. – Tim Pietzcker Apr 19 '11 at 10:24
  • Thanks for the replies. As you can see I haven't got the hang of regular expressions yet. Should the backslash be doubled Tim? – James P. Apr 19 '11 at 10:30
  • In your case, you don't need it at all. – Tim Pietzcker Apr 19 '11 at 10:32
  • No, no: The question mark must not be escaped. It is part of a [lookahead assertion](http://www.regular-expressions.info/lookaround.html). I suggest you look at a regex tutorial to get ahold of some of the basics. – Tim Pietzcker Apr 19 '11 at 11:19

If you are referring to...

results = url.match("[\\?&]v=([^&#]*)");

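Then it is matching a literal ? or & (the doubled backslash in the string literal reaches the regex engine as a single backslash, which merely escapes the ?), followed by the literal v=, followed by a capturing group that captures zero or more characters that are not & or #.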
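Whether a backslash is actually matched can be checked directly. The sketch below assumes nothing beyond standard JavaScript string and RegExp behaviour; the variable names are illustrative.

```javascript
// The JavaScript string literal "[\\?&]" holds five characters: [\?&]
const source = "[\\?&]";
console.log(source.length); // 5

// Inside a character class, \? is simply an escaped question mark.
const charClass = new RegExp(source);
console.log(charClass.test("?"));  // true
console.log(charClass.test("&"));  // true
console.log(charClass.test("\\")); // false: a lone backslash is not matched
```

So the pattern accepts only ? or & before v=, and a URL such as youtube.com\?v=abc would match on the ? and not on the backslash.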

alex

The first regex looks for "?v=GItD10Joaa0" when the URL contains "youtube.com/", and the second looks for "www.youtube.com/index?feature=youtu.be" when the URL is "http://www.youtube.com/index?feature=youtu.be".

So you can simply use the first regex to get IDs from the first kind of URL, and likewise the second regex for the second kind :)

AabinGunz

OK, I did some fishing around and came across this regex. It should suit the purpose described above.

youtu(?:\.be|be\.com)/(?:.*v(?:/|=)|(?:.*/)?)([a-zA-Z0-9-_]+)

From: C# regex to get video id from youtube and vimeo by url

And: http://forrst.com/posts/Automatic_YouTube_Thumbnails_with_PHP_and_Regex-uta
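As a rough sketch of how this single pattern could replace both branches of the original code, the snippet below wraps it in a helper. The names YOUTUBE_ID and youtubeThumbnail are hypothetical, the slashes are escaped so the pattern can be written as a regex literal, and the hyphen is moved to the end of the character class to keep it unambiguous; the thumbnail URL format is the one used in the question.

```javascript
// Pattern from above, written as a JavaScript regex literal.
const YOUTUBE_ID = /youtu(?:\.be|be\.com)\/(?:.*v(?:\/|=)|(?:.*\/)?)([a-zA-Z0-9_-]+)/;

// Hypothetical helper: returns the thumbnail URL, or null when no ID is found.
function youtubeThumbnail(url) {
  const match = url.match(YOUTUBE_ID);
  if (match === null) {
    return null;
  }
  // Group 1 holds the video ID for both youtube.com and youtu.be links.
  return "http://img.youtube.com/vi/" + match[1] + "/2.jpg";
}

console.log(youtubeThumbnail("http://www.youtube.com/watch?v=GItD10Joaa0"));
console.log(youtubeThumbnail("http://youtu.be/5uxd-521uus?hd=1"));
```

Returning null for an unrecognised URL is a design choice of this sketch; the original code falls back to using the whole URL as the ID instead.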

Community
James P.