C# - Free speech recognition Engine library (SDK)
System.Speech.Recognition gives very poor results. I want another SDK that gives good results and works with C# in Visual Studio.
I also want it to work OFFLINE, not online like the Google API.
Thanks
I have had quite good results in the past with PocketSphinx, or with Sphinx if you have more resources available. Check it here: https://cmusphinx.github.io/
When you choose to implement a speech recognition system, even if you build it from the ground up, you have to take the following aspects into consideration:
If the speech recognition engine is offline, the computational load is carried by your local machine. The advantage is that the whole system is independent of any infrastructure other than the machine it runs on and its operating system. The disadvantage is that a large language model will put a heavy load on your RAM, GPU, and/or CPU.
If the speech recognition engine is online, the computational load is carried by the remote machines that host the engine. The advantage is that the RAM, CPU, and/or GPU load on the client stays minimal, so both high-end and low-end devices can use the engine and any application that implements it. The disadvantage is that the system depends on the infrastructure of those remote machines, so any downtime of those servers makes the speech recognition inoperable on all devices.
From your question I can see that you are unhappy with the performance of the System.Speech.Recognition library, and you said that you want something offline. The offline speech recognition engines that are highly accurate consume a lot of resources, because they rely on a large language model to deliver that accuracy. For C# you have a couple of offline speech recognition engines: Vosk and Whisper.cpp (the C++ implementation of Whisper). Another option is to use the official Whisper implementation, which is in Python, run it in a Python script, and make the script communicate with the C# application. These are high-quality offline speech recognition engines, and accuracy increases with the size of the model used. But as I said earlier, higher accuracy requires more computational power, and for these "not to suck" you will need high-performance hardware that can run the larger models comfortably.
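As a rough illustration of the "Python script that communicates with the C# application" option, here is a minimal sketch of the Python side only. It assumes the `openai-whisper` package is installed and uses its `whisper.load_model` and `model.transcribe` calls. The line-based protocol (one audio file path per line on stdin, one JSON object per line on stdout) and the names `serve` and `load_whisper_transcriber` are my own assumptions for this example, not part of Whisper.

```python
import json
import sys
from typing import Callable, TextIO


def load_whisper_transcriber(model_name: str) -> Callable[[str], str]:
    """Load a Whisper model and return a function mapping an audio path to text."""
    # Imported lazily so the protocol code below can run without the package.
    import whisper  # assumption: installed with `pip install openai-whisper`

    model = whisper.load_model(model_name)
    return lambda path: model.transcribe(path)["text"].strip()


def serve(transcribe: Callable[[str], str], requests: TextIO, responses: TextIO) -> None:
    """Hypothetical protocol: read one audio path per line, answer with one JSON line."""
    for line in requests:
        path = line.strip()
        if not path:
            continue  # ignore blank lines
        try:
            reply = {"path": path, "text": transcribe(path)}
        except Exception as exc:
            # Report the failure to the caller instead of stopping the worker.
            reply = {"path": path, "error": str(exc)}
        responses.write(json.dumps(reply) + "\n")
        responses.flush()  # the calling process is blocked waiting for this line


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example invocation: python whisper_worker.py base
    serve(load_whisper_transcriber(sys.argv[1]), sys.stdin, sys.stdout)
```

On the C# side, one way to use this would be to start the script (the file name `whisper_worker.py` is hypothetical) with `System.Diagnostics.Process`, setting `UseShellExecute` to false and redirecting standard input and output, then write a file path followed by a newline and read back one line of JSON. Keeping the process alive between requests means the model is loaded only once, which matters because loading the larger models is slow.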
[ Vosk ]
Project's GitHub page: https://github.com/alphacep/vosk-api
Speech recognition engine models: https://alphacephei.com/vosk/models
[ Whisper.cpp ]
Project's GitHub page: https://github.com/ggerganov/whisper.cpp
Whisper.cpp C# Api GitHub page: https://github.com/Const-me/Whisper
[ Whisper ]
Project's GitHub page: https://github.com/openai/whisper
[ CONCLUSION ]
My recommendation for your implementation is to use an online speech recognition engine. If the application will run only on Windows, check this: https://stackoverflow.com/a/70041524/16587692. For an implementation of this, check my application: https://sourceforge.net/projects/eva-ai/. For the source code of my application, check this: https://github.com/CSharpTeoMan911/Eva.
If the application has to run on multiple platforms, check:
[ Whisper API ]
Whisper online speech recognition engine: https://platform.openai.com/docs/api-reference/introduction
[ Google Speech-To-Text API ]
https://codelabs.developers.google.com/codelabs/cloud-speech-text-csharp#0