17

I have an ASP.NET MVC application with a controller action that takes a string as input and responds with a WAV file of the synthesized speech. Here is a simplified example:

    public async Task<ActionResult> Speak(string text)
    {
        Task<FileContentResult> task = Task.Run(() =>
        {
            using (var synth = new System.Speech.Synthesis.SpeechSynthesizer())
            using (var stream = new MemoryStream())
            {
                synth.SetOutputToWaveStream(stream);
                synth.Speak(text);
                var bytes = stream.ToArray(); // ToArray copies only the bytes written; GetBuffer can include unused capacity
                return File(bytes, "audio/x-wav");
            }
        });
        return await task;
    }

The application (and this action method in particular) runs fine on Windows Server 2008 R2 servers, Windows Server 2012 (non-R2) servers, and my Windows 8.1 dev PC. It also runs fine on a standard Azure Windows Server 2012 R2 virtual machine. However, when I deploy it to three 2012 R2 servers (its eventual permanent home), the action method never produces an HTTP response: the IIS worker process maxes out one of the CPU cores indefinitely. There is nothing in the event viewer, and nothing jumps out at me when watching the server with Procmon. I've attached to the process with remote debugging, and the synth.Speak(text) call never returns. As soon as synth.Speak(text) is executed, I see the runaway w3wp.exe process in the server's task manager.

My first inclination was to believe some process was interfering with speech synthesis in general on the servers, but the Windows Narrator works correctly, and a simple console app like this also works correctly:

    static void Main(string[] args)
    {
        var synth = new System.Speech.Synthesis.SpeechSynthesizer();
        synth.Speak("hello");
    }

So obviously I can't blame the server's speech synthesis in general. So maybe there is a problem in my code, or something strange in IIS configuration? How can I make this controller action work correctly on these servers?

This is a simple way to test the action method (just have to get the url value right for the routing):

<div>
    <input type="text" id="txt" autofocus />
    <button type="button" id="btn">Speak</button>
</div>

<script>
    document.getElementById('btn').addEventListener('click', function () {
        var text = document.getElementById('txt').value;
        var url = window.location.href + '/speak?text=' + encodeURIComponent(text);
        var audio = document.createElement('audio');
        var canPlayWavFileInAudioElement = audio.canPlayType('audio/wav'); 
        var bgSound = document.createElement('bgsound');
        bgSound.src = url;
        var canPlayBgSoundElement = bgSound.getAttribute('src');

        if (canPlayWavFileInAudioElement) {
            // probably Firefox and Chrome
            audio.setAttribute('src', url);
            audio.setAttribute('autoplay', '');
            document.getElementsByTagName('body')[0].appendChild(audio);
        } else if (canPlayBgSoundElement) {
            // internet explorer
            document.getElementsByTagName('body')[0].appendChild(bgSound);
        } else {
            alert('This browser probably can\'t play a wav file');
        }
    });
</script>
hmqcnoesy
  • Have you tried making it a synchronous action method, without wrapping it in a task? There could be issues with thread pool in that code that ASP .NET is not aware of. – Dmitry S. Oct 01 '15 at 19:55
  • @DmitryS. I'm not sure I can make a test like that work... if `synth.Speak` isn't wrapped, I get a runtime exception: `InvalidOperationException` (An asynchronous operation cannot be started at this time). I like the line of thinking, but if it was a thread pool issue with ASP.NET, why would it work on so many other servers? – hmqcnoesy Oct 01 '15 at 20:26
  • The latest version of the synthesizer has the `SpeakAsync()` method. Instead of wrapping the code in `Task.Run()` you can just do `await SpeakAsync(text)`. https://msdn.microsoft.com/en-us/library/system.speech.synthesis.speechsynthesizer.speakasync%28v=vs.110%29.aspx – Dmitry S. Oct 01 '15 at 20:37
  • Maybe I'm missing something here - the `SpeakAsync()` method doesn't return a task so it can't be awaited. (It returns a `Prompt` object). – hmqcnoesy Oct 01 '15 at 20:56
  • I did not realize that. But the method is not blocking unlike the `Speak(text)` method. Just swap the method call and see if it prevents the action method call from getting stuck. – Dmitry S. Oct 01 '15 at 21:22
  • Trying `SpeakAsync(text)` also throws the `InvalidOperationException` when not wrapped in a task, on all servers. – hmqcnoesy Oct 02 '15 at 13:47
  • Did you try suggestion from here: http://peterluzc.blogspot.ru/2014/01/speechsynthesizer-throws-exception-in.html? I know this is about exception and not hang, but still might be related. – Evk Oct 02 '15 at 21:01
  • If you made a console application, why didn't you use the async code to test it? Async runs in a different thread. That thread is different (non-UI thread, not CoInitialized). – Thomas Weller Oct 06 '15 at 19:11
  • @Evk Your comment helped get me going in the right direction. I haven't solved this quite yet, but I'm getting close, and your comment was helpful. Please post it as the answer so I can award the bounty. – hmqcnoesy Oct 08 '15 at 20:02
  • I'm having the same issue. Did you ever find a solution for this? – josibu Feb 14 '21 at 06:54
  • @josibu Yes, see the accepted answer – hmqcnoesy Feb 16 '21 at 14:29

5 Answers

2

I found that I can reproduce the issue on other servers, including Azure VMs, so I ruled out the possibility of an issue with our particular environment.

Also, I found that the code works fine on 2012 R2 if the application pool runs under an identity that is an admin on the server and has previously logged into the server. After a very long process of ruling out permissions issues, I decided that something in the login process must enable the TTS API calls to work correctly. (Whatever it is, I wasn't able to find it digging through Procmon traces.) Fortunately, the ApplicationPoolIdentity can have similar login magic applied by opening "Advanced Settings" for the app pool in IIS and setting Load User Profile to True.

The identity that runs the app pool also needs permission to read the registry key HKU\.Default\Software\Microsoft\Speech. This can be granted to ApplicationPoolIdentity by using the local server for the location and IIS APPPOOL\.Net v4.5 for the username (where .Net v4.5 is the name of the application pool).

Once read permission to the reg key is granted, and the app pool is configured to load user profile, the above code works fine. Tested on Azure VMs and vanilla 2012 R2 from MSDN ISOs.
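
As a rough sketch of the Load User Profile setting described above, this is approximately what the corresponding entry looks like in applicationHost.config (normally under %windir%\System32\inetsrv\config). The pool name .Net v4.5 comes from this answer. Any other attributes your pool already has are omitted, so treat this as an illustration and not a file to paste in.

```xml
<system.applicationHost>
  <applicationPools>
    <!-- ".Net v4.5" is the app pool name used in this answer; substitute your own -->
    <add name=".Net v4.5">
      <!-- Equivalent to Advanced Settings > Load User Profile = True in IIS Manager -->
      <processModel identityType="ApplicationPoolIdentity" loadUserProfile="true" />
    </add>
  </applicationPools>
</system.applicationHost>
```

This fragment only covers the profile setting; the registry read permission still has to be granted separately.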

hmqcnoesy
  • Man, you are a hero! This definitely made my week. Thanks a lot! - By the way, `iisreset` was necessary for me to make it take effect. – josibu Feb 16 '21 at 17:24
1

I think the issue is the return type. IIS Express is letting you get away with it, but IIS is not:

Task<FileContentResult>

So if you try:

public async Task<FileContentResult> Speak(string text)
{
    Task<FileContentResult> task = Task.Run(() =>
    {
        using (var synth = new System.Speech.Synthesis.SpeechSynthesizer())
        using (var stream = new MemoryStream())
        {
            synth.SetOutputToWaveStream(stream);
            synth.Speak(text);
            var bytes = stream.ToArray(); // ToArray copies only the bytes written; GetBuffer can include unused capacity
            return File(bytes, "audio/x-wav");
        }
    });
    return await task;
}

I bet you also need to add the audio/wav MIME Type in IIS.

Mr. B
  • Unfortunately, no, this doesn't change the 2012 R2 behavior. – hmqcnoesy Aug 27 '15 at 15:02
  • No, the wav MIME types were already configured in IIS. But those are for static files anyway. – hmqcnoesy Aug 27 '15 at 16:45
  • So ISAPI filters need to be configured properly perhaps? – Mr. B Aug 27 '15 at 17:05
  • No, it's not the ISAPI filters. – hmqcnoesy Aug 27 '15 at 18:33
  • Oh, I just thought of something. Make sure when you build your Artifact to put on the 2012 R2 Server, make sure the System.Speech Reference is set to Copy Always, if it isn't already. Then (should have thought of this before) put a try catch around the body of the task and see if it throws, if you haven't already. – Mr. B Aug 27 '15 at 19:00
1

I have had this experience with Server 2012 R2 before (not with the synth API, granted, but the same issue). I fixed it by using "await task.ConfigureAwait(false)" on all my tasks. See if that works for you.

Good luck.

Matt Clark
1

At this blog (http://peterluzc.blogspot.ru/2014/01/speechsynthesizer-throws-exception-in.html) you can find a solution to a similar problem: an exception when using SpeechSynthesizer on a fresh Windows 8.1 installation. The problem in that case is a wrong permission entry on the CurrentUserLexicon registry key (which is used by SpeechSynthesizer). To resolve it, the blog post suggests removing the permission entry "ALL APPLICATION PACKAGES" from the Software\Microsoft\Speech\CurrentUserLexicon registry key.

Evk
-1

This is just off the top of my head and it hasn't been tested but you may be able to do something like this:

    public ActionResult Speak(string text)
    {
        byte[] bytes;
        using (var speech = new System.Speech.Synthesis.SpeechSynthesizer())
        using (var stream = new MemoryStream())
        {
            // Redirect output to the stream before speaking; otherwise the stream stays empty.
            speech.SetOutputToWaveStream(stream);
            speech.Speak(text);
            bytes = stream.ToArray();
        }
        return File(bytes, "audio/x-wav");
    }
Manraj