If anyone still needs this today: you can get a ton of information about a video from its player endpoint. YouTube's undocumented youtubei API has multiple libraries in languages like JavaScript, Python, and even Rust trying to tame it. (I'm writing a replacement for the broken Rust one.) If you don't want to use any of these, or there isn't one for your language, and this information is still valid, here's how to do it by hand:
Request
You can make a POST request to https://www.youtube.com/youtubei/v1/player?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8&prettyPrint=false (the key is the one that the YouTube web client uses) with the following HTTP headers:
Accept-Language: en-US,en;q=0.5 (you can obviously change the language)
Content-Type: application/json
X-Youtube-Client-Name: 1 (to pretend to be the web client)
X-Youtube-Client-Version: 2.20230607.06.00
Sec-Fetch-Mode: no-cors
Then set the User-Agent to something that looks like a browser (just grab it from your own browser). I don't know if they check it, but better safe than sorry.
In terms of the request JSON, it should look like this:
{
  "context": {
    "client": {
      "hl": "en",
      "clientName": "WEB",
      "clientVersion": "2.20230607.06.00"
    }
  },
  "videoId": "{video_id}",
  "params": "" // These are a little odd, you won't really have any of these so leave it blank
}
No, don't actually put that comment in there! That's for your education.
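To tie the URL, headers, and body together, here is a minimal Python sketch using only the standard library. The function names and the User-Agent string are placeholders of mine, not anything YouTube defines. The client fields sit under a "client" key inside "context", which is the layout I understand the youtubei API to expect; treat that as an assumption and adjust it if the endpoint rejects the request.

```python
import json
import urllib.request

PLAYER_URL = (
    "https://www.youtube.com/youtubei/v1/player"
    "?key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8&prettyPrint=false"
)
CLIENT_VERSION = "2.20230607.06.00"
# Placeholder browser-like User-Agent (an assumption); copy your own browser's.
USER_AGENT = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/114.0"


def build_player_request(video_id, language="en"):
    """Return (url, headers, body_bytes) for the player endpoint."""
    headers = {
        "Accept-Language": "en-US,en;q=0.5",
        "Content-Type": "application/json",
        "X-Youtube-Client-Name": "1",  # pretend to be the web client
        "X-Youtube-Client-Version": CLIENT_VERSION,
        "Sec-Fetch-Mode": "no-cors",
        "User-Agent": USER_AGENT,
    }
    body = {
        # Assumed nesting: client fields live under context.client.
        "context": {
            "client": {
                "hl": language,
                "clientName": "WEB",
                "clientVersion": CLIENT_VERSION,
            }
        },
        "videoId": video_id,
        "params": "",  # left blank, as described above
    }
    return PLAYER_URL, headers, json.dumps(body).encode("utf-8")


def fetch_player_response(video_id):
    """POST the request and return the parsed JSON response as a dict."""
    url, headers, data = build_player_request(video_id)
    request = urllib.request.Request(url, data=data, headers=headers, method="POST")
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

Calling fetch_player_response("c0td7Noukww") should hand back the response root discussed in the next section, provided the endpoint still behaves as it did in June 2023.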
Response
There's a ton of useful information in this response, but we're looking for captions. Let's call the root of the response "response". As of June 2023, the captions are at response ⇾ captions ⇾ playerCaptionsTracklistRenderer ⇾ captionTracks (if captions doesn't exist, it's because the video has no captions). captionTracks is an array of objects that look like this:
{
  "baseUrl": "https://www.youtube.com/api/timedtext?v=c0td7Noukww&caps=asr&opi=112496729&xoaf=5&hl=en&ip=0.0.0.0&ipbits=0&expire=1687045036&sparams=ip,ipbits,expire,v,caps,opi,xoaf&signature=35A403189649A24C75C8CE6CB6016B46D9385CC4.1F3E3B7FF4670E84747F5C24DE2B119B04BA9F47&key=yt8&kind=asr&lang=en",
  "name": {
    "simpleText": "English (auto-generated)"
  },
  "vssId": "a.en",
  "languageCode": "en",
  "kind": "asr",
  "isTranslatable": true
}
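Walking that path defensively might look like the sketch below. Both helper names are hypothetical. The preference for tracks without "kind": "asr" assumes that manually written captions simply lack that field, which I have not verified beyond the auto-generated example above.

```python
def get_caption_tracks(response):
    """Return the captionTracks list, or [] if the video has no captions."""
    return (
        response.get("captions", {})
        .get("playerCaptionsTracklistRenderer", {})
        .get("captionTracks", [])
    )


def pick_track(tracks, language_code="en"):
    """Pick a track for the language, preferring non-auto-generated ones.

    Assumption: auto-generated tracks carry "kind": "asr" and manual
    tracks do not. Returns None when no track matches the language.
    """
    matching = [t for t in tracks if t.get("languageCode") == language_code]
    manual = [t for t in matching if t.get("kind") != "asr"]
    if manual:
        return manual[0]
    return matching[0] if matching else None
```

With the example object above as the only track, pick_track(get_caption_tracks(response)) would return that auto-generated English track, and its "baseUrl" is what you request next.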
If you make a GET request to this baseUrl, you'll get the caption text back, HTML-encoded. By appending &fmt=vtt to the URL, you'll get WebVTT captions instead. That means timing data, so we can have real subtitles and even convert them to SRT for use in video players if we download the video.
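As a rough illustration of the WebVTT-to-SRT step, here is a small Python sketch. It is a simplified converter of my own, not a full WebVTT parser: it assumes cues are separated by blank lines, drops cue settings and inline tags, and skips blocks without a timing line. Downloading the URL is left out; a plain GET is enough.

```python
import re

# Matches "hh:mm:ss.mmm --> hh:mm:ss.mmm" (the hours part is optional in WebVTT).
TIMING = re.compile(
    r"(\d{2}:)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+(\d{2}:)?(\d{2}):(\d{2})\.(\d{3})"
)


def with_vtt_format(base_url):
    """Append the fmt=vtt parameter to a caption track's baseUrl."""
    return base_url + "&fmt=vtt"


def vtt_to_srt(vtt):
    """Convert WebVTT text to SRT text (simplified; see assumptions above)."""
    cues = []
    for block in re.split(r"\n\s*\n", vtt.replace("\r\n", "\n").strip()):
        lines = block.split("\n")
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue  # header, NOTE, or STYLE block
        m = TIMING.search(lines[idx])
        if not m:
            continue
        # SRT always writes hours and uses a comma before the milliseconds.
        start = f"{m.group(1) or '00:'}{m.group(2)}:{m.group(3)},{m.group(4)}"
        end = f"{m.group(5) or '00:'}{m.group(6)}:{m.group(7)},{m.group(8)}"
        # Drop inline tags such as <c> or <00:00:02.000>.
        text = re.sub(r"<[^>]+>", "", "\n".join(lines[idx + 1:]))
        if not text.strip():
            continue
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + "\n"
```

Auto-generated tracks tend to repeat lines across overlapping cues, so the SRT you get from this may need de-duplication before it looks clean in a player.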