What is the difference between System.Speech.Synthesis and Microsoft.Speech.Synthesis?

Question

I am currently developing a small program in C# implementing Text-To-Speech. However, I found out that there are two namespaces which can be used:

System.Speech.Synthesis
Microsoft.Speech.Synthesis

I googled for the differences and found this post, which is about speech recognition, so it doesn't really answer my question. I also switched between the two namespaces and saw no difference: the code below worked with all of its languages in both cases.

using System;
using System.Speech.Synthesis;
//using Microsoft.Speech.Synthesis;

namespace TTS_TEST
{
class Program
{

    static void Main(string[] args)
    {
          SpeechSynthesizer synth = new SpeechSynthesizer();

          int num;
          string userChoice;

          do
          {
             Console.WriteLine("1 - " + "Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)");
             Console.WriteLine("2 - " + "Microsoft Server Speech Text to Speech Voice (en-GB, Hazel)");
             Console.WriteLine("3 - " + "Microsoft Server Speech Text to Speech Voice (es-ES, Helena)");
             Console.WriteLine("4 - " + "Microsoft Server Speech Text to Speech Voice (fr-FR, Hortense)");
             Console.WriteLine("5 - " + "Exit");
             Console.Write("Enter the number of your choice: ");     //the user chooses a number
             userChoice = Console.ReadLine();

             if (!Int32.TryParse(userChoice, out num)) continue;

             Console.WriteLine("Choice = " + userChoice);

             if (userChoice == "1")    //Option 1 will use the voice en-US, ZiraPro
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (en-US, ZiraPro)");
             }

             if (userChoice == "2")   //Option 2 will use the voice en-GB, Hazel
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (en-GB, Hazel)");
             }

             if (userChoice == "3")   //Option 3 will use the voice es-ES, Helena
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (es-ES, Helena)");
             }

             if (userChoice == "4")   //Option 4 will use the voice fr-FR, Hortense
             {
                synth.SelectVoice("Microsoft Server Speech Text to Speech Voice (fr-FR, Hortense)");
             }

             if (userChoice == "5")   //Option 5 will exit application
             {
                Environment.Exit(0);
             }

             synth.SetOutputToDefaultAudioDevice();   //set the default audio output

             foreach (InstalledVoice voice in synth.GetInstalledVoices())   //list the installed voices details
             {
                VoiceInfo info = voice.VoiceInfo;

                Console.WriteLine(" Name:          " + info.Name);
                synth.Speak("Name: " + info.Name);
                Console.WriteLine(" Culture:       " + info.Culture);
                synth.Speak("Culture: " + info.Culture);
                Console.WriteLine(" Age:           " + info.Age);
                synth.Speak("Age: " + info.Age);
                Console.WriteLine(" Gender:        " + info.Gender);
                synth.Speak("Gender: " + info.Gender);
                Console.WriteLine(" Description:   " + info.Description);
                Console.WriteLine(" ID:            " + info.Id + "\n");
                synth.Speak("ID: " + info.Id);
             }

             Console.ReadKey();

          }
          while (true);
    }
  }
}

Could somebody explain the differences between the two of them to me?

Looking at the docs for `Microsoft.Speech.Synthesis` I see this comment: _We're no longer updating this content regularly. Check the Microsoft Product Lifecycle for information about how this product, service, technology, or API is supported._ So I can only assume that `Microsoft.Speech.Synthesis` is deprecated in favor of the (presumably) newer `System.Speech.Synthesis`. – Chris Dunaway, Feb 06 '19 at 15:42
Actually, they're *both* deprecated in favor of the [`Windows.Media.SpeechSynthesis`](https://learn.microsoft.com/en-us/uwp/api/Windows.Media.SpeechSynthesis) API. – Eric Brown, Apr 08 '19 at 18:40

score 2 · Accepted Answer · answered Apr 08 '19 at 18:46

The difference is pretty much as outlined in the linked answer: System.Speech.Synthesis uses the desktop TTS engines, while Microsoft.Speech.Synthesis uses the server TTS engines. From a programming perspective the two are nearly identical, but they differ considerably in licensing; the server TTS engines are separately licensed.
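
To illustrate how small the programming difference is, here is a minimal sketch that lists the voices each namespace exposes. It assumes the project references System.Speech.dll for the desktop engines, or Microsoft.Speech.dll from the Microsoft Speech Platform SDK for the server engines. The `USE_SERVER_TTS` compilation symbol and the `VoiceLister` class are names invented for this example.

```csharp
// USE_SERVER_TTS is a hypothetical symbol chosen for this sketch: define it in
// the project settings to compile against the server engines, or leave it
// undefined to compile against the desktop engines.
#if USE_SERVER_TTS
using Microsoft.Speech.Synthesis;   // server TTS engines (Microsoft.Speech.dll)
#else
using System.Speech.Synthesis;      // desktop TTS engines (System.Speech.dll)
#endif
using System;

class VoiceLister
{
    static void Main()
    {
        // Both namespaces provide a SpeechSynthesizer with the same basic members,
        // so the rest of the code does not change when the using directive does.
        using (var synth = new SpeechSynthesizer())
        {
            foreach (InstalledVoice voice in synth.GetInstalledVoices())
            {
                VoiceInfo info = voice.VoiceInfo;
                Console.WriteLine(info.Name + " (" + info.Culture + "), enabled: " + voice.Enabled);
            }
        }
    }
}
```

Which voices are printed depends on the engines installed on the machine, so building the sketch both ways is one way to check which set of voices a given setup actually provides.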

However, both System.Speech.Synthesis and Microsoft.Speech.Synthesis are deprecated APIs, and new development should be based on the Windows.Media.SpeechSynthesis API.
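
For comparison, a rough sketch of speaking one sentence through Windows.Media.SpeechSynthesis might look like the following. It assumes a project that can reference the Windows Runtime APIs (for example, a UWP app or a .NET project targeting a Windows 10 framework moniker) and a C# version that supports `async Main`. The `WinRtSpeechDemo` class name, the spoken text, and the "en-US" voice filter are illustrative choices, not part of the original answer.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;
using Windows.Media.Core;
using Windows.Media.Playback;
using Windows.Media.SpeechSynthesis;

class WinRtSpeechDemo
{
    static async Task Main()
    {
        using (var synth = new SpeechSynthesizer())
        {
            // List the voices this API can see.
            foreach (VoiceInformation v in SpeechSynthesizer.AllVoices)
            {
                Console.WriteLine(v.DisplayName + " (" + v.Language + ", " + v.Gender + ")");
            }

            // Assumption for this sketch: prefer an en-US voice when one is
            // installed, otherwise keep the synthesizer's default voice.
            VoiceInformation voice =
                SpeechSynthesizer.AllVoices.FirstOrDefault(v => v.Language == "en-US");
            if (voice != null)
            {
                synth.Voice = voice;
            }

            // This API produces an audio stream instead of speaking directly,
            // so playback needs a separate player object.
            SpeechSynthesisStream stream =
                await synth.SynthesizeTextToStreamAsync("Hello from the newer API.");

            var finished = new TaskCompletionSource<bool>();
            using (var player = new MediaPlayer())
            {
                player.MediaEnded += (sender, args) => finished.TrySetResult(true);
                player.MediaFailed += (sender, args) => finished.TrySetResult(false);
                player.Source = MediaSource.CreateFromStream(stream, stream.ContentType);
                player.Play();

                // Keep the process alive until playback ends or fails.
                await finished.Task;
            }
        }
    }
}
```

The main design difference from the two older namespaces is visible here: there is no blocking `Speak` call, so synthesis and playback are separate asynchronous steps.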

answered Apr 08 '19 at 18:46

Eric Brown

Thanks for your answer. I finally decided to use Microsoft SAPI. I know it is deprecated too, but it is easier to use than Windows.Media.SpeechSynthesis and there are more functions available. – georges619 May 03 '19 at 08:07
Can someone point me to the page that says these APIs are deprecated? This one still seems up to date - https://learn.microsoft.com/en-us/dotnet/api/system.speech.synthesis?view=netframework-4.8 – Michael Levy May 14 '19 at 14:28
And the product life cycle pages mention "Speech Server", but not "Microsoft Speech Platform". Is there any link to the ongoing support of Microsoft Speech Platform (microsoft.speech)? Is it tied to a specific version of Windows Server? – Michael Levy May 14 '19 at 14:35
I broke this off into a separate SO question - https://stackoverflow.com/questions/56133281/what-is-the-support-lifecycle-for-microsoft-speech-platform-v11 – Michael Levy May 14 '19 at 14:52
