Questions tagged [microsoft-speech-api]

The Microsoft Speech API (SAPI) provides a high-level interface between an application and speech engines. SAPI implements all the low-level details needed to control and manage the real-time operations of various speech engines.

The two basic types of SAPI engines are text-to-speech (TTS) systems and speech recognizers. TTS systems synthesize text strings and files into spoken audio using synthetic voices. Speech recognizers convert human spoken audio into readable text strings and files.

API for Text-to-Speech

Applications can control text-to-speech (TTS) using the ISpVoice Component Object Model (COM) interface. Once an application has created an ISpVoice object (see Text-to-Speech Tutorial), the application only needs to call ISpVoice::Speak to generate speech output from some text data.

The ISpVoice interface also provides several methods for changing voice and synthesis properties, such as the speaking rate (ISpVoice::SetRate), the output volume (ISpVoice::SetVolume), and the current speaking voice (ISpVoice::SetVoice).
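
As a rough illustration of the calls described above, here is a minimal Python sketch that drives the SAPI automation object (ProgID `SAPI.SpVoice`), which exposes the ISpVoice functionality to scripting languages. It is a sketch under assumptions, not taken from the source: it assumes Windows with the `pywin32` package installed, and the helper names `clamp` and `speak` are hypothetical. The ranges used (rate from -10 to 10, volume from 0 to 100) are the ones SAPI documents for SetRate and SetVolume.

```python
import sys

# Ranges documented for ISpVoice::SetRate and ISpVoice::SetVolume.
RATE_MIN, RATE_MAX = -10, 10
VOLUME_MIN, VOLUME_MAX = 0, 100


def clamp(value, low, high):
    """Hypothetical helper: force value into the inclusive range [low, high]."""
    return max(low, min(high, int(value)))


def speak(text, rate=0, volume=100):
    """Hypothetical helper: speak text via SAPI and return the (rate, volume) used.

    On non-Windows platforms nothing is spoken; the clamped settings are
    still returned so the function can be exercised anywhere.
    """
    rate = clamp(rate, RATE_MIN, RATE_MAX)
    volume = clamp(volume, VOLUME_MIN, VOLUME_MAX)
    if sys.platform != "win32":
        return rate, volume  # SAPI is only available on Windows

    import win32com.client  # assumption: pywin32 is installed

    voice = win32com.client.Dispatch("SAPI.SpVoice")
    voice.Rate = rate        # automation counterpart of ISpVoice::SetRate
    voice.Volume = volume    # automation counterpart of ISpVoice::SetVolume
    voice.Speak(text)        # automation counterpart of ISpVoice::Speak
    return rate, volume


if __name__ == "__main__":
    speak("This is a test", rate=2, volume=80)
```

Out-of-range settings are clamped before they reach the engine, so a call such as `speak("hello", rate=25)` would use the maximum rate of 10.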

API for Speech Recognition

Just as ISpVoice is the main interface for speech synthesis, ISpRecoContext is the main interface for speech recognition. Like ISpVoice, it is an ISpEventSource, which means that it is the speech application's vehicle for receiving notifications for the requested speech recognition events.

Source: http://msdn.microsoft.com/en-us/library/ee125077(v=vs.85).aspx

82 questions
10 votes, 1 answer

Difference among Microsoft Speech products/platforms

It seems Microsoft offers quite a few speech recognition products; I'd like to know the differences among all of them, please. There is Microsoft Speech API, or SAPI. But somehow Microsoft Cognitive Service Speech API has the same name. Ok now,…
7 votes, 1 answer

Batch transcription with Microsoft Azure (REST API)

I want to transcribe longer audio files (at least 5 minutes) using REST APIs from Microsoft. There are a lot of different products and names, e.g. Speech service API or Bing Speech API. None of the REST APIs I tried so far supports transcribing longer…
7 votes, 2 answers

Locking the computer disables speech recognition on windows 8.1

I work with SpeechRecognitionEngine from the namespace System.Speech in inproc-mode for doing some automation work. The speech recognition is started via RecognizeAsync. It works fine, however, when the computer gets locked, speech recognition…
6 votes, 1 answer

Error loading Microsoft Speech SDK v11

I have installed the x86 SDK and added the Microsoft.Speech.dll to my project. The project is set to x86. When trying to create an instance of SpeechSynthesizer I get: Retrieving the COM class factory for component with CLSID …
Anders
5 votes, 1 answer

Can't pip microsoft azure-cognitiveservices-speech?

Following the guide here to install the microsoft azure text to speech SDK: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/quickstart-python#install-the-speech-sdk It says to run pip install…
5 votes, 1 answer

How to get pronunciation phonemes corresponding to a word using C#?

I'll preface this by saying I'm very novice when it comes to C# programming. I'm working on an application for programmatically modifying the Windows Speech Dictionary using C# in conjunction with SAPI v5.4 (speechlib). Everything is working well so…
Exergist
5 votes, 3 answers

get working Microsoft Speech API with Angular

Hi, I'm trying to find a way to get Angular 5 working with the Microsoft Speech API. I used microsoft-speech-browser-sdk for JavaScript (https://github.com/Azure-Samples/SpeechToText-WebSockets-Javascript). I just import the SDK: import * as SDK from…
Adamo Figueroa
4 votes, 2 answers

Speech-to-text large audio files [Microsoft Speech API]

What is the best way to transcribe medium/large audio files, ~ 6-10 mins each file, using Microsoft Speech API? Something like batch audio files transcription? I have used the code provided in…
4 votes, 1 answer

How to use bing speech API in app?

I've never used the Bing Speech API before, so I have many questions about it. If I want to make an Android app using the Bing Speech API, should I subscribe to it on Azure? And should I sign up for LUIS? And I want to know the difference…
4 votes, 1 answer

Microsoft Speech Recognition: Alternate results with confidence score?

I'm new to working with the Microsoft.Speech recognizer (using Microsoft Speech Platform SDK Version 11) and I'm trying to have it output the n-best recognition matches from a simple grammar, along with the confidence score for each. According to…
3 votes, 1 answer

Calling SpeechAPI for text to speech on Azure

I have the following very basic TTS code running on my local server using System.Speech.Synthesis; ... SpeechSynthesizer reader = new SpeechSynthesizer(); reader.Speak("This is a test"); This code has a dependency on System.Speech for which I have…
MayoMan
2 votes, 2 answers

Azure Cognitive Services Speech to Text large/long audio files sample

Like to transcribe a couple of long (Dutch) audio files. They are interviews which are about 60-120 minutes per file in length. Got only 8 files which I need to do manually, so not necessarily part of some automated software. Got some Azure credits,…
2 votes, 1 answer

What is the difference between System.Speech.Synthesis and Microsoft.Speech.Synthesis?

I am currently developing a small program in C# implementing Text-To-Speech. However, I found out that there are two namespaces which can be used: System.Speech.Synthesis Microsoft.Speech.Synthesis I googled for the differences and found this post…
2 votes, 1 answer

Integrating Azure Bot with Azure Speech Services

Is there a way to integrate the Speech Services with a bot? I would like to know what the process is for integrating Speech Services with a bot. How is it possible to do the integration with the bot through the key that is…
2 votes, 1 answer

Python Microsoft Speech API Error: SPERR_NO_DRIVER from CmdLoadFromFile

This question may well go unanswered, but I would dearly like some help on the matter. I found a snippet of code for dealing with Microsoft's Speech API in Python, and then went and learned about W3C's "Speech Recognition Grammar Specification…
skeggse