Extract YouTube ID with or without RegEx

Question

How can I get the YouTube video ID from a URL without resorting to a regular expression?

With the method below, the following URLs did not work:

http://www.youtube.com/e/dQw4w9WgXcQ

http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ

public static String extractYTId(String youtubeUrl) {
    String video_id = "";

    try {
        if(youtubeUrl != null && youtubeUrl.trim().length() > 0 && youtubeUrl.startsWith("http")) {
            String expression = "^.*((youtu.be" + "\\/)" + "|(v\\/)|(\\/u\\/w\\/)|(embed\\/)|(watch\\?))\\??v?=?([^#\\&\\?]*).*"; // var regExp = /^.*((youtu.be\/)|(v\/)|(\/u\/\w\/)|(embed\/)|(watch\?))\??v?=?([^#\&\?]*).*/;
            //String expression = "^.*(?:youtu.be\\/|v\\/|e\\/|u\\/\\w+\\/|embed\\/|v=)([^#\\&\\?]*).*";
            CharSequence input = youtubeUrl;
            Pattern pattern = Pattern.compile(expression, Pattern.CASE_INSENSITIVE);
            Matcher matcher = pattern.matcher(input);
            if(matcher.matches()) {
                String groupIndex1 = matcher.group(7);
                if(groupIndex1 != null && groupIndex1.length() == 11)
                    video_id = groupIndex1;
            }
        }
    } catch(Exception e) {
        Log.e("YoutubeActivity", "extractYTId " + e.getMessage());
    }

    return video_id;
}

The other links work fine:

http://www.youtube.com/v/0zM3nApSvMg?fs=1&hl=en_US&rel=0

http://www.youtube.com/embed/0zM3nApSvMg?rel=0

http://www.youtube.com/watch?v=0zM3nApSvMg&feature=feedrec_grec_index

http://www.youtube.com/watch?v=0zM3nApSvMg

http://youtu.be/0zM3nApSvMg

http://www.youtube.com/watch?v=0zM3nApSvMg#t=0m10s

http://youtu.be/dQw4w9WgXcQ

http://www.youtube.com/embed/dQw4w9WgXcQ

http://www.youtube.com/v/dQw4w9WgXcQ

http://www.youtube.com/watch?v=dQw4w9WgXcQ

http://www.youtube-nocookie.com/v/6L3ZvIMwZFM?version=3&hl=en_US&rel=0

score 4 · Answer 1 · edited Apr 06 '16 at 15:54

You can use the following regex:

^(?:(?:https?:\/\/)?(?:www\.)?)?(youtube(?:-nocookie)?\.com|youtu\.be)\/.*?(?:embed|e|v|watch\?.*?v=)?\/?([a-z0-9]+)

Regex breakdown:

^: Start-of-line anchor
(?:(?:https?:\/\/)?(?:www\.)?)?: Optional scheme and www. prefix
- (?:https?:\/\/)?: Match http:// or https://, optionally
- (?:www\.)?: Match www. zero or one time
(youtube(?:-nocookie)?\.com|youtu\.be)\/: Match (and capture in group 1) any one of
- youtube.com, youtube-nocookie.com or youtu.be, followed by /
.*?: Lazy match: consume as few characters as possible until the rest of the pattern can match
(?:embed|e|v|watch\?.*?v=)?\/?:
- (?:embed|e|v|watch\?.*?v=)?: Match embed, e, v, everything from watch? up to v=, or nothing
- \/?: Match / zero or one time
([a-z0-9]+): Match one or more alphanumeric characters and capture them in group 2 (the i flag in the demo makes this case-insensitive)

Live demo (using JavaScript):

var regex = /^(?:(?:https?:\/\/)?(?:www\.)?)?(youtube(?:-nocookie)?\.com|youtu\.be)\/.*?(?:embed|e|v|watch\?.*?v=)?\/?([a-z0-9]+)/i;

// An array of all the youtube URLs
var youtubeLinks = [
    'http://www.youtube.com/e/dQw4w9WgXcQ',
    'http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ',
    'http://www.youtube.com/v/0zM3nApSvMg?fs=1&hl=en_US&rel=0',
    'http://www.youtube.com/embed/0zM3nApSvMg?rel=0',
    'http://www.youtube.com/watch?v=0zM3nApSvMg&feature=feedrec_grec_index',
    'http://www.youtube.com/watch?v=0zM3nApSvMg',
    'http://youtu.be/0zM3nApSvMg',
    'http://www.youtube.com/watch?v=0zM3nApSvMg#t=0m10s',
    'http://youtu.be/dQw4w9WgXcQ',
    'http://www.youtube.com/embed/dQw4w9WgXcQ',
    'http://www.youtube.com/v/dQw4w9WgXcQ',
    'http://www.youtube.com/watch?v=dQw4w9WgXcQ',
    'http://www.youtube-nocookie.com/v/6L3ZvIMwZFM?version=3&hl=en_US&rel=0'
];

// An object to store the results
var youtubeIds = {};

// Iterate over the youtube URLs
youtubeLinks.forEach(function(url) {
    // Get the value of second captured group to extract youtube ID
    var id = "<span class='youtubeId'>" + (url.match(regex) || [0, 0, 'No ID present'])[2] + "</span>";

    // Add the URL and the extracted ID in the result object
    youtubeIds[url] = id;
});

// Log the object in the browser console
console.log(youtubeIds);

// To show the result on the page
document.getElementById('output').innerHTML = JSON.stringify(youtubeIds, null, 4);

.youtubeId {
    color: green;
    font-weight: bold;
}

<pre id="output"></pre>
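
For the Java code in the question, the same pattern can be used through java.util.regex. The sketch below is one possible port, not the answer author's code: the class name YouTubeIdRegex and the method extract are made up for illustration. It drops the JavaScript /.../i delimiters, doubles the backslashes for the string literal, and calls find() instead of matches(), because the pattern is not written to consume the whole URL.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and method are illustrative only.
final class YouTubeIdRegex {
    // Same pattern as above, written as a Java string literal. The "/" needs
    // no escaping in Java, and the i flag becomes Pattern.CASE_INSENSITIVE.
    private static final Pattern YOUTUBE_URL = Pattern.compile(
            "^(?:(?:https?://)?(?:www\\.)?)?(youtube(?:-nocookie)?\\.com|youtu\\.be)/"
                    + ".*?(?:embed|e|v|watch\\?.*?v=)?/?([a-z0-9]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the captured ID, or an empty string when the URL does not match. */
    static String extract(String url) {
        if (url == null) {
            return "";
        }
        Matcher matcher = YOUTUBE_URL.matcher(url.trim());
        // find() accepts a match that stops before the end of the input;
        // matches() would require the pattern to cover the entire URL.
        return matcher.find() ? matcher.group(2) : "";
    }

    public static void main(String[] args) {
        // Both URLs from the question that failed with the original method
        System.out.println(extract("http://www.youtube.com/e/dQw4w9WgXcQ"));
        System.out.println(extract("http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ"));
    }
}
```

One limitation to keep in mind: because the last group is [a-z0-9]+, an ID containing - or _ would be captured only up to that character.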

@Piraba I think you need to double the backslashes when adding the regex as a string. – Tushar, Feb 17 '16 at 08:19
I doubled the backslashes: `String expression = "/^(?:(?:https?:\\/\\/)?(?:www\\.)?)?(youtube(?:-nocookie)?\\.com|youtu\\.be)\\/.*?(?:embed|e|v|watch\\?.*?v=)?\\/?([a-z0-9]+)/i";`. It is still not working. – Piraba, Feb 17 '16 at 08:31
@Piraba You need to use `if(matcher.find())` instead of `if(matcher.matches())` and read group 2. Sample: `if (matcher.find()) { video_id = matcher.group(2); }` – Tunaki, Feb 17 '16 at 14:40
@Tushar That "flow illustration" at the bottom looks generated. How, and where? – T4NK3R, Feb 27 '16 at 19:44
@T4NK3R [regexper.com](http://regexper.com/#%2F%5E(%3F%3A(%3F%3Ahttps%3F%3A%5C%2F%5C%2F)%3F(%3F%3Awww%5C.)%3F)%3F(youtube(%3F%3A-nocookie)%3F%5C.com%7Cyoutu%5C.be)%5C%2F.*%3F(%3F%3Aembed%7Ce%7Cv%7Cwatch%5C%3F.*%3Fv%3D)%3F%5C%2F%3F(%5Ba-z0-9%5D%2B)%2Fi) is an example. The one above was generated by the Atom editor with the `regex-railroad-diagram` package. – Tushar, Feb 28 '16 at 05:24
The drawback of this answer is that if an unknown URL pattern arises, the method won't be able to extract the video ID. The potential impact may be addressed with the following method: http://stackoverflow.com/a/39742707/363573. – Stephan, Sep 28 '16 at 09:08
I found that this solution worked when other URL parameters such as time_continue were included in the string, which the other regexes in this thread didn't catch. – kylegill, Apr 03 '18 at 20:05

score 1 · Answer 2 · Kirill Gamazkov · edited Sep 11 '17 at 17:27

Your regex is designed for the youtu.be domain, so of course it doesn't work with the youtube.com one.

Construct a java.net.URL (https://docs.oracle.com/javase/7/docs/api/java/net/URL.html) from your URL string.
Use URL#getQuery() to get the query part.
See "Parse a URI String into Name-Value Collection" for ways to decode the query part into a name-value map, then get the value for the name 'v'.
If there is no query part (as in http://www.youtube.com/e/dQw4w9WgXcQ), use URL#getPath() (which gives you /e/dQw4w9WgXcQ) and parse the video ID from it, e.g. by skipping the first 3 characters: url.getPath().substring(3)
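
The steps above might look roughly like the following. This is a sketch under stated assumptions rather than the answer author's code: the class name YouTubeIdFromUrl and its methods are hypothetical, the query is split by hand instead of using a name-value parsing library, the path fallback takes the last path segment (so /e/, /v/, /embed/ and youtu.be links all go through the same branch, instead of substring(3)), and the 11-character check is borrowed from the question's code. Note that recent JDKs deprecate the URL constructors in favor of URI.create(...).toURL(); new URL(...) is kept here to follow the answer.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper class; its name and methods are illustrative only.
final class YouTubeIdFromUrl {
    // Assumption carried over from the question's code: an ID has 11 characters.
    private static final int ID_LENGTH = 11;

    /** Returns the video ID, or an empty string when none is found. */
    static String extract(String youtubeUrl) {
        if (youtubeUrl == null) {
            return "";
        }
        URL url;
        try {
            url = new URL(youtubeUrl.trim());
        } catch (MalformedURLException e) {
            return "";
        }

        // Query part first: look for a parameter named "v".
        String query = url.getQuery(); // null when the URL has no '?'
        if (query != null) {
            for (String pair : query.split("&")) {
                String[] nameValue = pair.split("=", 2);
                if (nameValue.length == 2 && nameValue[0].equals("v")) {
                    return checked(nameValue[1]);
                }
            }
        }

        // No "v" parameter: fall back to the last path segment,
        // e.g. "/e/dQw4w9WgXcQ" or "/embed/dQw4w9WgXcQ".
        String path = url.getPath();
        return checked(path.substring(path.lastIndexOf('/') + 1));
    }

    // Rejects candidates that do not have the expected ID length.
    private static String checked(String candidate) {
        return candidate.length() == ID_LENGTH ? candidate : "";
    }

    public static void main(String[] args) {
        System.out.println(extract("http://www.youtube.com/e/dQw4w9WgXcQ"));
        System.out.println(extract("http://www.youtube.com/watch?feature=player_embedded&v=dQw4w9WgXcQ"));
    }
}
```

Because the fragment (#t=0m10s) is parsed separately from the query, it never ends up in the extracted ID.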

Update. Why not regex? Because the standard JDK URL parser is much more robust. It is exercised by the whole Java community, while a regex-based reinvented wheel is tested only by your own code.

edited Sep 11 '17 at 17:27

answered Feb 16 '16 at 15:01


This is not an answer for his question. – XsiSecOfficial Feb 16 '16 at 15:06
He asked for a way to get video id from string (with or without regex), I've suggested one. Why don't you think this is an answer? – Kirill Gamazkov Feb 16 '16 at 15:13
Actually, this is a very elegant solution, since it recognizes that all the URLs can be handled by analyzing either a query parameter or a path parameter. Works perfectly for me. – Mardann Sep 11 '17 at 12:11

score 0 · Answer 3 · edited Apr 03 '18 at 21:17

I like to use this function for all YouTube video IDs. I pass in the URL and it returns only the ID. Check the fiddle below.

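var ytSrc = function (url) {
    // Group 7 captures the candidate ID that follows one of the known URL prefixes
    var regExp = /^.*((youtu.be\/)|(v\/)|(\/u\/\w\/)|(embed\/)|(watch\?))\??v?=?([^#\&\?]*).*/;
    var match = url.match(regExp);
    // Accept the candidate only if it has the 11 characters expected of a video ID
    if (match && match[7].length == 11) {
        return match[7];
    } else {
        alert("Invalid URL");
    }
};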

https://jsfiddle.net/keinchy/tL4thwd7/1/
