
I'm trying to generate timestamps using Azure Speech-to-Text in C#. I've tried the following resources:

How to get Word Level Timestamps using Azure Speech to Text and the Python SDK?

How to generate timestamps in speech recognition?

The second has been the most helpful, but I'm still getting errors. My code is:

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

namespace NEST
{
internal class NewBaseType
{
    static async Task Main(string[] args)

    {
        // Creates an instance of a speech config with specified subscription key and region.
        // Replace with your own subscription key and service region (e.g., "westus").
        var config = SpeechConfig.FromSubscription("subscriptionkey", "region");

        // Generates timestamps
        config.OutputFormat = OutputFormat.Detailed;
        config.RequestWordLevelTimestamps = true;   

        //calls the audio file
        using (var audioInput = AudioConfig.FromWavFileInput("C:/Users/MichaelSchwartz/source/repos/AI-102-Process-Speech-master/transcribe_speech_to_text/media/Zoom_audio.wav"))

        // Creates a speech recognizer from the WAV file input.
        using (var recognizer = new SpeechRecognizer(config, audioInput))
        {
            // Subscribes to events.
            recognizer.Recognizing += (s, e) =>
            {
                Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
            };

            recognizer.Recognized += (s, e) =>
            {
                var result = e.Result;
                Console.WriteLine($"Reason: {result.Reason.ToString()}");
                if (result.Reason == ResultReason.RecognizedSpeech)
                {
                    Console.WriteLine($"Final result: Text: {result.Text}.");
                }
            };

            recognizer.Canceled += (s, e) =>
            {
                Console.WriteLine($"\n    Canceled. Reason: {e.Reason.ToString()}, CanceledReason: {e.Reason}");
            };

            recognizer.SessionStarted += (s, e) =>
            {
                Console.WriteLine("\n    Session started event.");
            };

            recognizer.SessionStopped += (s, e) =>
            {
                Console.WriteLine("\n    Session stopped event.");
            };

            // Starts continuous recognition. 
            // Uses StopContinuousRecognitionAsync() to stop recognition.
            await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);

            do
            {
                Console.WriteLine("Press Enter to stop");
            } while (Console.ReadKey().Key != ConsoleKey.Enter);

            var json = result.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult);
            Console.WriteLine(json);

            // Stops recognition.
            await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
        }
    }
}

}

The errors returned are:

Cannot assign to 'RequestWordLevelTimestamps' because it is a 'method group' [NEST]

The name 'result' does not exist in the current context [NEST]

How do I resolve these errors?

1 Answer


You should use

config.RequestWordLevelTimestamps()

instead of

config.RequestWordLevelTimestamps = true;

RequestWordLevelTimestamps is a method, not a settable property, which is why the compiler reports a 'method group' error; see the SDK reference for the method.
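As a minimal sketch of both fixes (assuming the same SDK version and event-based setup as in your code): call RequestWordLevelTimestamps() as a method, and read the detailed JSON inside the Recognized handler, where the result is actually in scope, instead of referencing a result variable outside it.

config.OutputFormat = OutputFormat.Detailed;
config.RequestWordLevelTimestamps();   // method call, not a property assignment

recognizer.Recognized += (s, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        Console.WriteLine($"Final result: Text: {e.Result.Text}.");

        // The detailed JSON (including word-level timestamps) comes from the
        // result object, so read it here while it is still in scope.
        var json = e.Result.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult);
        Console.WriteLine(json);
    }
};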

Satya V
  • The second error indicates you haven't received any result in that scope yet, but I think it should be resolved once the code above is fixed. – Satya V Mar 31 '21 at 05:26
  • Thanks! The first error is resolved, but the second error remains. – Michael Schwartz Mar 31 '21 at 14:04
  • I was able to resolve the second error (maybe) by following the info found here: https://stackoverflow.com/questions/60229786/how-to-enable-word-level-confidence-for-ms-azure-speech-to-text-service. However, the code runs but doesn't produce timestamps. How do I get the transcript with timestamps? – Michael Schwartz Mar 31 '21 at 16:37
  • What is your updated code ? Also what are you currently seeing ? – Satya V Apr 01 '21 at 06:41
  • My updated code is here: https://stackoverflow.com/questions/66894188/how-do-i-get-timestamps-to-generate-in-azure-speech-to-text-model/66897949#66897949 I see: Session started event. Press Enter to stop The speech Translation API transcribes audio streams into text. Your application can display this text to the user or act upon it as command input. You can use this API either with an SDK client library, or a rest a rest API. Session stopped event. The code does not return timestamp info but I've included var json = result. The error is mine. Suggestions? – Michael Schwartz Apr 02 '21 at 14:37
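Regarding the later comments about actually getting the timestamps out: once the detailed JSON is captured inside the Recognized handler, the word-level timing should appear in its NBest[0].Words array, with Offset and Duration given in 100-nanosecond ticks. The following is only a sketch based on that assumed output shape (field names NBest, Words, Word, Offset, Duration), using System.Text.Json; it hasn't been verified against the exact SDK and service version in the question.

using System;
using System.Text.Json;

// Inside the Recognized handler, given the detailed JSON string captured above:
using (var doc = JsonDocument.Parse(json))
{
    if (doc.RootElement.TryGetProperty("NBest", out var nbest) && nbest.GetArrayLength() > 0)
    {
        foreach (var word in nbest[0].GetProperty("Words").EnumerateArray())
        {
            var text = word.GetProperty("Word").GetString();
            // Offset and Duration are reported in 100-nanosecond ticks.
            var start = TimeSpan.FromTicks(word.GetProperty("Offset").GetInt64());
            var length = TimeSpan.FromTicks(word.GetProperty("Duration").GetInt64());
            Console.WriteLine($"{text}: start={start}, duration={length}");
        }
    }
}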