I am using the YouTube Data API to download captions. The request succeeds, but the subtitles returned by the API are not in the correct format:

1
00:00:00,719 --> 00:00:06,000
{LONG TEXT CONTAINS ALL SCRIPT}

2
00:00:03,240 --> 00:00:09,120

3
00:00:06,000 --> 00:00:11,219
[REST OF THE TIME WITH EMPTY LINES]

However, the subtitles look correct when I download them from studio.youtube.com.

Is there a fix for that?

PS: Google says I must use Stack Overflow with the appropriate tags, but so far I have not received any answers or comments. Is this the correct approach?

EDIT:

Based on the API documentation, I am using the code below. It downloads the subtitles, but the data is not correctly formatted.

YouTubeService youtubeService = await GetYouTubeService();

// Get the list of available captions for the video
var captionListRequest = youtubeService.Captions.List("snippet, id", book.BSYouTubeId);

var captionListResponse = await captionListRequest.ExecuteAsync();

// Get the first caption track
var captionTrack = captionListResponse.Items.FirstOrDefault();
if (captionTrack == null)
{
    context.WriteLine("No caption track found.");
    return;
}

// Download the caption track
var captionDownloadRequest = youtubeService.Captions.Download(captionTrack.Id);
captionDownloadRequest.Tfmt = "srt";
captionDownloadRequest.Tlang = book.GetShortLanguageCode();
var captionStream = await captionDownloadRequest.ExecuteAsStreamAsync();

// Read the caption track into a string, disposing the reader afterwards
using var reader = new StreamReader(captionStream);
var captionString = await reader.ReadToEndAsync();
Onur Topal

1 Answer

As this seems to be a YouTube Data API v3 issue, I recommend using yt-dlp instead:

yt-dlp --write-sub --sub-lang all,-live_chat 'VIDEO_ID'
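
If only the subtitle file is needed, a variant along the following lines may be closer to what the question asks for. This is a rough sketch, not a tested recipe for every video: it assumes the yt-dlp options `--skip-download`, `--write-subs`, `--sub-langs` and `--convert-subs` (the conversion step needs ffmpeg to be installed), and `fetch_subs` and `count_empty_cues` are hypothetical helper names introduced here for illustration. The second helper counts cues that have a timing line but no text, which is the symptom shown in the question.

```shell
# Download only the subtitle track (no video) and convert it to SRT.
# $1 = video ID, $2 = subtitle language code (e.g. en); both are placeholders.
fetch_subs() {
  yt-dlp --skip-download --write-subs --sub-langs "$2" --convert-subs srt -- "$1"
}

# Hypothetical helper: print the number of SRT cues with a timing line but no text.
# Cues are read as blank-line-separated blocks; carriage returns are stripped first
# so that files with CRLF line endings are split the same way.
count_empty_cues() {
  tr -d '\r' < "$1" |
    awk 'BEGIN { RS = ""; FS = "\n" }
         $2 ~ /-->/ && NF == 2 { n++ }
         END { print n + 0 }'
}
```

For example, `fetch_subs 'VIDEO_ID' en` followed by `count_empty_cues` on the resulting `.srt` file should print 0 if every cue carries text; a non-zero count would suggest the same malformed output as the API download.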
Benjamin Loison