I need to create a translated audio version of a YouTube video, so I'm using YoutubeExplode
to download the audio as a stream:
var youtubeClient = new YoutubeClient();
var video = await youtubeClient.Videos.GetAsync(videoUrl);
var streamManifest = await youtubeClient.Videos.Streams.GetManifestAsync(video.Id);
var audioStreamInfo = streamManifest.GetAudioOnlyStreams().GetWithHighestBitrate();
var stream = await youtubeClient.Videos.Streams.GetAsync(audioStreamInfo);
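Before handing a downloaded stream to a recognizer, it can help to confirm which container it actually carries, since YouTube audio-only streams typically arrive as MP4 or WebM rather than WAV. The following is a minimal sketch that uses only the .NET base class library; `AudioSniffer` and `DetectContainer` are hypothetical names that are not part of YoutubeExplode or the Speech SDK, and the check covers only three common signatures.

```csharp
using System;

public static class AudioSniffer
{
    // Hypothetical helper: guesses the container format from the first
    // bytes of the audio data. Only three signatures are checked.
    public static string DetectContainer(ReadOnlySpan<byte> header)
    {
        // WAV: "RIFF" at offset 0 and "WAVE" at offset 8.
        if (header.Length >= 12 &&
            header[0] == (byte)'R' && header[1] == (byte)'I' &&
            header[2] == (byte)'F' && header[3] == (byte)'F' &&
            header[8] == (byte)'W' && header[9] == (byte)'A' &&
            header[10] == (byte)'V' && header[11] == (byte)'E')
        {
            return "wav";
        }

        // Matroska/WebM: EBML magic number 1A 45 DF A3 at offset 0.
        if (header.Length >= 4 &&
            header[0] == 0x1A && header[1] == 0x45 &&
            header[2] == 0xDF && header[3] == 0xA3)
        {
            return "webm";
        }

        // MP4 (ISO base media): "ftyp" at offset 4.
        if (header.Length >= 8 &&
            header[4] == (byte)'f' && header[5] == (byte)'t' &&
            header[6] == (byte)'y' && header[7] == (byte)'p')
        {
            return "mp4";
        }

        return "unknown";
    }
}
```

Passing the first dozen bytes of the byte array produced from the stream (for example `AudioSniffer.DetectContainer(bytes)`) would show whether the data is raw WAV or a compressed container.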
Then I created an Azure Cognitive Services Speech resource
to generate the translated audio. Here is my code:
var speechTranslateConfig = SpeechTranslationConfig.FromSubscription("key", "region");
var text = await SpeechToText(speechTranslateConfig, stream);
async Task<string> SpeechToText(SpeechTranslationConfig config, Stream stream)
{
    config.SpeechRecognitionLanguage = "en-US";
    config.AddTargetLanguage("ro");
    using var audioInputStream = AudioInputStream.CreatePushStream();
    using var audioConfig = AudioConfig.FromStreamInput(audioInputStream);
    using var recognizer = new TranslationRecognizer(config, audioConfig);
    var bytes = streamToByteArray(stream);
    audioInputStream.Write(bytes);
    var result = await recognizer.RecognizeOnceAsync();
    return result.Text;
}

private static byte[] streamToByteArray(Stream input)
{
    MemoryStream ms = new MemoryStream();
    input.CopyTo(ms);
    return ms.ToArray();
}
I'm using a Stream
because I don't want to save the original audio file, but the translation result is always an empty string.
When I instead save the original file and translate it from disk (rather than converting the stream to a byte array), everything works fine.
I can't figure out what I'm missing, because I followed the documentation.
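A few things may be worth checking here, sketched below under stated assumptions rather than as a confirmed fix. A push stream created without an explicit format is treated by the Speech SDK as 16 kHz, 16-bit, mono PCM, whereas a YouTube audio stream is usually a compressed MP4 or WebM container. The push stream is also never closed in the code above, so the recognizer is not told that the input has ended, and `result.Text` holds the recognized source-language text while translated text lives in `result.Translations`. The helper name `TranslateOnceAsync` is hypothetical, and compressed input in the Speech SDK requires GStreamer to be installed on the machine.

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;

public static class StreamTranslation
{
    // Hypothetical helper, not taken from the original post.
    public static async Task<string> TranslateOnceAsync(
        SpeechTranslationConfig config, Stream stream)
    {
        config.SpeechRecognitionLanguage = "en-US";
        config.AddTargetLanguage("ro");

        // Assumption: the input is a compressed container (MP4/WebM),
        // so declare it instead of relying on the default PCM format.
        // Compressed input needs GStreamer available at runtime.
        var format = AudioStreamFormat.GetCompressedFormat(
            AudioStreamContainerFormat.ANY);

        using var pushStream = AudioInputStream.CreatePushStream(format);
        using var audioConfig = AudioConfig.FromStreamInput(pushStream);
        using var recognizer = new TranslationRecognizer(config, audioConfig);

        // Copy the source stream into the push stream in chunks.
        var buffer = new byte[4096];
        int read;
        while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            pushStream.Write(buffer, read);
        }

        // Signal end of input so the recognizer does not wait for more data.
        pushStream.Close();

        var result = await recognizer.RecognizeOnceAsync();

        // The translation is keyed by target language, not stored in Text.
        if (result.Reason == ResultReason.TranslatedSpeech &&
            result.Translations.TryGetValue("ro", out var translated))
        {
            return translated;
        }

        return string.Empty;
    }
}
```

Note that `RecognizeOnceAsync` returns only the first recognized utterance, so translating a whole video would likely need continuous recognition instead.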