
I'm trying to download captions from some YouTube videos using their NuGet package. Here's the code:

var request = _youtube.Search.List("snippet,id");
request.Q = "Bill Gates";
request.MaxResults = 50;
request.Type = "video";
var results = request.Execute();
foreach (var result in results.Items)
{
    var captionListRequest = _youtube.Captions.List("id,snippet", result.Id.VideoId);
    var captionListResponse = captionListRequest.Execute();
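    // Case-insensitive match on the language code; also safe if Language is null.
    var russianCaptions = captionListResponse.Items.FirstOrDefault(
        c => string.Equals(c.Snippet.Language, "ru", StringComparison.OrdinalIgnoreCase));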
    if (russianCaptions != null)
    {
        var downloadRequest = _youtube.Captions.Download(russianCaptions.Id);
        downloadRequest.Tfmt = CaptionsResource.DownloadRequest.TfmtEnum.Srt;
        var ms = new MemoryStream();
        downloadRequest.Download(ms);
    }
}

When the Download method is called, I get a strange Newtonsoft.Json exception:

    Newtonsoft.Json.JsonReaderException: 'Unexpected character encountered while parsing value: T. Path '', line 0, position 0.'
    at Newtonsoft.Json.JsonTextReader.ParseValue()

I've read other threads about caption download problems and changed my authorization workflow accordingly: first I tried using just an API key, then OAuth. Here's how it looks now:

var credential = GoogleWebAuthorizationBroker.AuthorizeAsync(
    new ClientSecrets
    {
        ClientId = "CLIENT_ID",
        ClientSecret = "CLIENT_SECRET"
    },
    new[] { YouTubeService.Scope.YoutubeForceSsl },
    "user",
    CancellationToken.None,
    new FileDataStore("Youtube.CaptionsCrawler")).Result;

_youtube = new YouTubeService(new BaseClientService.Initializer
{
    ApplicationName = "LKS Captions downloader",
    HttpClientInitializer = credential
});

So, is it even possible to do what I'm trying to achieve?

P.S. I dug into the YouTube NuGet package, and the actual response body (which Newtonsoft.Json is trying to deserialize as JSON, huh!) is a plain-text message: "The permissions associated with the request are not sufficient to download the caption track. The request might not be properly authorized, or the video order might not have enabled third-party contributions for this caption."

So, do I have to be the video owner to download captions? But if so, how do other programs like Google2SRT work?

Daniel Vygolov
  • Based on this [thread](https://stackoverflow.com/questions/32362006/jsonreaderexception-unexpected-character-encountered), note that `JObject.Parse()` expects the actual JSON content (a string), not a path. Also, converting a JSON string to a `JObject` and then back with `ToString()` adds no value here. You may also check these links: [1](https://stackoverflow.com/questions/23259173/unexpected-character-encountered-while-parsing-value) and [2](https://stackoverflow.com/questions/38263368/c-sharp-api-unexpected-character-encountered-while-parsing-value-s-path-l). – abielita Sep 17 '17 at 18:42
  • This Parse method is called INSIDE Google's NuGet package, so we can't do anything about it. But the real problem is that the captions are not downloaded properly. – Daniel Vygolov Sep 17 '17 at 18:56
  • my problem is getting the CC into one string. – Amir Hajiha Nov 19 '18 at 20:34

1 Answer


Found this post: How to get "transcript" in youtube-api v3

You can get them via GET request on: http://video.google.com/timedtext?lang={LANG}&v={VIDEOID}

Example: http://video.google.com/timedtext?lang=en&v=-osCkzoL53U

Note that the video must have manually added subtitles; this will not work for auto-generated captions.
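For what it's worth, here is a minimal C# sketch of the GET request described above. The class and method names (`TimedTextClient`, `BuildTimedTextUrl`, `FetchCaptionsAsync`) are made up for illustration, and the endpoint is the unofficial, undocumented one from this answer, so it may change or stop responding. Returning null for a failed or empty response is my assumption about how to signal "no captions", not documented behavior.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper, not part of the Google API client library.
public static class TimedTextClient
{
    private static readonly HttpClient Http = new HttpClient();

    // Builds the timedtext URL from the answer for a given video and language.
    public static string BuildTimedTextUrl(string videoId, string lang)
    {
        return "http://video.google.com/timedtext?lang=" + Uri.EscapeDataString(lang)
             + "&v=" + Uri.EscapeDataString(videoId);
    }

    // Returns the raw response body, or null when the request fails
    // or the body is empty (assumed here to mean no captions were found).
    public static async Task<string> FetchCaptionsAsync(string videoId, string lang)
    {
        using (var response = await Http.GetAsync(BuildTimedTextUrl(videoId, lang)))
        {
            if (!response.IsSuccessStatusCode)
            {
                return null;
            }

            var body = await response.Content.ReadAsStringAsync();
            return string.IsNullOrWhiteSpace(body) ? null : body;
        }
    }
}
```

In the loop from the question, this could be called as `var captions = await TimedTextClient.FetchCaptionsAsync(result.Id.VideoId, "ru");`, with a null check before using the result.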

Janis S.