Is there any regex formula for Youtube and Vevo videos?

Here are some sample codes: YouTube: CevxZvSJLk8, Vevo: USUV71402382

From what I can tell, YouTube codes can contain lowercase letters. How do I detect each kind of code properly using regex?

Just to make it clear, here are my questions:

  • What is the right regex pattern for a Youtube code?
  • What is the right regex pattern for Vevo code?

I'm trying to append the code to the end of the URL: Vevo: http://cache.vevo.com/assets/html/embed.html?video=USUV70904460 YouTube: http://www.youtube.com/embed/P1j-6vRykFs

The code will look like this:

function get_video_url($video_code) {
  if (preg_match_all('<Youtube regex code>',$video_code)) {
    return "http://www.youtube.com/embed/" . $video_code;
  } elseif (preg_match_all('<Vevo regex code>',$video_code)) {
    return "http://cache.vevo.com/assets/html/embed.html?video=" . $video_code;
  } else {
    return "Youtube or vevo link anyway.";
  }
}

I just need the regex, though.

Franz Noel
  • Where are you getting the codes from? Is it possible to just check the domain name? – Οurous Dec 04 '14 at 03:09
  • No. because the user will only post the video code alone. – Franz Noel Dec 04 '14 at 04:02
  • For Vevo, if that is the pattern, then it's [a-zA-Z]{4}\d{8} – argentum47 Dec 04 '14 at 04:33
  • I don't think that Vevo uses lowercase letters, though. I was thinking of something similar to a credit card number pattern, wherein you can predict "US" as the first 2 letters and the rest of the letters and numbers follow accordingly. – Franz Noel Dec 04 '14 at 05:32

1 Answer

From everything I can tell, Vevo codes are 12 characters long while YouTube codes are 11 characters long.

The oldest Vevo video I can find, pulled from their Facebook page, has a 12-character, all-caps code: http://www.vevo.com/watch/lady-gaga/Speechless-(Live-At-The-VEVO-Launch-Event)/USUV70904460

As do the newest.


Here, however, is PHP code if you prefer regex matching.

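  // Reject anything containing a character other than a letter, digit, underscore, or hyphen.
  if (preg_match("/[^\w-]+/",$video_code)) {
    return "Invalid video code";
  // Only uppercase letters and digits: treat it as a Vevo code.
  } else if (preg_match("/^[A-Z0-9]+$/",$video_code)) {
    return "http://cache.vevo.com/assets/html/embed.html?video=" . $video_code;
  // Everything else is treated as a YouTube code.
  } else {
    return "http://www.youtube.com/embed/" . $video_code;
  }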

After some research, I found these sources:

  1. How to validate youtube video ids?
  2. Regex vevo URL video-id

Source #2 provides a regex that would work essentially like this:

  if (preg_match("/^([A-Z]{2}[A-Z0-9]{3}\d{2}\d{5})$/",$video_code)) {
    return "http://cache.vevo.com/assets/html/embed.html?video=" . $video_code;
  } else if (preg_match("/^[\w-]{11}$/",$video_code)) {
    return "http://www.youtube.com/embed/" . $video_code;
  } else {
    return "Invalid video code";
  }
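
For reference, here is a minimal sketch of how the two patterns above could be dropped into the function from the question. It assumes the Vevo layout really is 2 letters, 3 alphanumerics, and 7 digits, and that YouTube codes are always 11 characters; neither service documents this, so treat both as assumptions. Returning `null` for an unrecognized code is my own choice, not something from the question.

```php
<?php
// Sketch only: the patterns are inferred from sample codes, not from official specs.
function get_video_url(string $video_code): ?string
{
    // Trim surrounding whitespace instead of allowing it in the pattern.
    $video_code = trim($video_code);

    // Vevo (assumed): 2 uppercase letters, 3 uppercase letters/digits, 7 digits.
    if (preg_match('/^[A-Z]{2}[A-Z0-9]{3}\d{7}$/', $video_code) === 1) {
        return "http://cache.vevo.com/assets/html/embed.html?video=" . $video_code;
    }

    // YouTube (assumed): exactly 11 letters, digits, underscores, or hyphens.
    if (preg_match('/^[A-Za-z0-9_-]{11}$/', $video_code) === 1) {
        return "http://www.youtube.com/embed/" . $video_code;
    }

    // Neither pattern matched.
    return null;
}
```

The Vevo check runs first because it is the stricter of the two; a 12-character Vevo code can never pass the 11-character YouTube check anyway.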
Regular Jo
  • Thanks. I've revised the invalid code because Youtube URLs have dashes: preg_match("/[^a-zA-Z0-9\s-]+/",$video_code). Thanks, @cfqueryparam. – Franz Noel Dec 04 '14 at 18:19
  • @FranzNoel Alright, I've updated my answer so that it does as well. May I ask why you also include a space in the code? Wouldn't it be better to trim surrounding whitespace rather than let it pass a check? `$video_code = trim($video_code);`? Also, I found a couple of helpful posts and assembled a better bit of code that, short of petitioning the services themselves, is probably the best you could get. – Regular Jo Dec 04 '14 at 19:42
  • I didn't know the regex refers to a space. I thought it refers to a dash. Thanks, appreciate it, and I think preg_match_all is more relevant than using preg_match because it checks the whole string of the video code. – Franz Noel Dec 04 '14 at 21:58
  • @FranzNoel Actually, as I understand it, `preg_match()` finds the first occurrence and `preg_match_all()` grabs every occurrence. (I am not a PHP programmer at all, but this is what I have read.) If you were parsing multiple video codes out of a variable, you would then use `_all()`. As to matching the whole string, mine does that because it begins with `^` and ends with `$`, the respective regex anchors for start and end of line. (`^` as the first character within a character class has a different meaning: it excludes the characters that follow it.) – Regular Jo Dec 04 '14 at 22:49
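
To illustrate the difference discussed in the last comment, here is a small sketch. The input string and variable names are made up for the example; the Vevo pattern is the assumed one from the answer.

```php
<?php
// Hypothetical input holding two codes at once.
$text = "USUV70904460 and USUV71402382";
$pattern = '/[A-Z]{2}[A-Z0-9]{3}\d{7}/';

// preg_match() stops after the first match and returns 1 or 0.
$first_count = preg_match($pattern, $text, $first);

// preg_match_all() collects every match and returns how many it found.
$all_count = preg_match_all($pattern, $text, $all);

// The ^ and $ anchors, not the choice of function, force the whole string to match.
$whole = preg_match('/^[A-Z]{2}[A-Z0-9]{3}\d{7}$/', $text);
```

With a single code as input, `preg_match()` with an anchored pattern is enough; `preg_match_all()` only matters when several codes have to be pulled out of one string.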