I am trying to make a simple POST request to synthesize speech from plain text using the AWS Polly REST API in browser JavaScript. I am not using the AWS JS SDK for external reasons. This is my request:

$.ajax({
    url: 'https://polly.us-west-2.amazonaws.com/v1/speech',
    type: 'POST',
    data: '{"OutputFormat":"mp3","Text":"Some text to listen","TextType":"text","VoiceId":"Joanna"}',
    dataType: 'text',
    beforeSend: function (xhr) {
        xhr.setRequestHeader('Authorization', '<String>');
        xhr.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded; charset=UTF-8');
    },
    success: function (result) {
        console.log(result);
    }
});

The request succeeds but when printing the result I receive the following:

ID3#TSSELavf57.56.101��`�ù�CNX�DDDGwws����'��wDDDB��D/��?����!+��������....... a bunch of random data.

I tried encoding it and manipulating it in various ways, but nothing worked. I went through all the AWS Polly documentation and most of the Stack Overflow posts, with no result. The docs say I should receive an AudioStream in a specific format, but my response is just an unreadable string.

Any ideas?

Thank you!

Here are the docs, in case they help you understand the problem better: http://docs.aws.amazon.com/polly/latest/dg/API_SynthesizeSpeech.html

Updated

The problem was fixed by changing the response type to blob. Instead of using jQuery's ajax, I made the POST request with a native JavaScript XMLHttpRequest:

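var xhr = new XMLHttpRequest();
xhr.addEventListener("readystatechange", function () {
   // once xhr.readyState is 4 (DONE), xhr.response holds the audio as a Blob
   // something
});
xhr.open('POST', 'https://polly.us-west-2.amazonaws.com/v1/speech');
xhr.setRequestHeader(...);
xhr.responseType = 'blob';
xhr.send(/* same JSON body as in the $.ajax call above */);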
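
Once the response arrives as a Blob, one way to play it is through an object URL. The following is only a minimal sketch, assuming a browser environment; `toMp3Blob` and `playMp3Blob` are hypothetical helper names and not part of the original code.

```javascript
// Wrap raw MP3 data (a Blob, ArrayBuffer, or typed array) in a Blob
// labelled with the MP3 MIME type so the browser knows how to decode it.
function toMp3Blob(data) {
  return new Blob([data], { type: 'audio/mpeg' });
}

// Browser-only: create a temporary URL for the Blob and play it.
function playMp3Blob(blob) {
  var url = URL.createObjectURL(blob);
  var audio = new Audio(url);
  // Free the object URL once playback has finished.
  audio.addEventListener('ended', function () {
    URL.revokeObjectURL(url);
  });
  // play() returns a Promise in modern browsers; it may reject if
  // autoplay is blocked, so the caller can attach a catch handler.
  return audio.play();
}

// Assumed usage inside a readystatechange handler:
//   if (xhr.readyState === 4 && xhr.status === 200) {
//     playMp3Blob(toMp3Blob(xhr.response));
//   }
```

The same object URL could also be assigned to the `src` of an existing audio element instead of creating a new `Audio` object.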
Simon

1 Answer

Further update: It seems you need to do some shuffling of the response data. This answer looks close: https://stackoverflow.com/a/23082623/1483006. I will update with an example once I get to my desk.

Updated: So, you're getting the correct result from the API. It's sending you an mp3 file of the requested speech. Your server application should then return this mp3 to the calling web browser with "Content-Type: audio/mpeg".
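
To illustrate why the response looks like garbage when it is read as text, here is a small sketch. The six bytes are made up to resemble the start of an MP3 (the `ID3` tag followed by non-ASCII bytes) and are not taken from a real Polly response. It also assumes the body is decoded as UTF-8. Decoding the bytes as text and encoding them back does not return the original data, which is why the audio has to stay binary.

```javascript
// Illustrative bytes: "ID3" followed by three bytes that are not valid UTF-8.
var original = new Uint8Array([0x49, 0x44, 0x33, 0xff, 0xfb, 0x90]);

// Treating the bytes as text replaces each invalid byte with U+FFFD.
var asText = new TextDecoder('utf-8').decode(original);

// Converting the text back to bytes does not restore the original data.
var roundTripped = new TextEncoder().encode(asText);

console.log(asText.slice(0, 3));                    // "ID3"
console.log(original.length, roundTripped.length);  // 6 12
```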

An example to wrap all of this up: in the client (web browser), create an audio element with its source URL pointing to your server application. In the handler for your application, retrieve the mp3 from Polly and send it back with the appropriate header ("Content-Type: audio/mpeg"). It should be pretty straightforward.
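
The server-side approach described above could be sketched roughly as follows, assuming a Node-style server. `synthesizeSpeech` is a hypothetical function standing in for whatever code calls Polly and resolves with the MP3 bytes; the `/speech` route and the `text` query parameter are likewise assumptions made for illustration.

```javascript
// Build a Node-style request handler that relays Polly audio to the browser.
// `synthesizeSpeech(text)` is hypothetical: it must return a Promise that
// resolves with the MP3 bytes (for example a Buffer or Uint8Array).
function createSpeechHandler(synthesizeSpeech) {
  return function handleSpeech(req, res) {
    // Read the text to speak from the query string, e.g. /speech?text=Hello
    var query = new URL(req.url, 'http://localhost').searchParams;
    var text = query.get('text') || '';

    return synthesizeSpeech(text).then(function (mp3Bytes) {
      // Tell the browser this is MP3 audio so an <audio> element can play it.
      res.writeHead(200, {
        'Content-Type': 'audio/mpeg',
        'Content-Length': mp3Bytes.length
      });
      res.end(mp3Bytes);
    }, function () {
      // Synthesis failed upstream: report a gateway error to the client.
      res.writeHead(502, { 'Content-Type': 'text/plain' });
      res.end('Speech synthesis failed');
    });
  };
}

// Assumed wiring with Node's built-in http module:
//   require('http').createServer(createSpeechHandler(synthesizeSpeech)).listen(3000);
// Assumed client markup:
//   <audio controls src="/speech?text=Some%20text%20to%20listen"></audio>
```

Passing `synthesizeSpeech` in as a parameter keeps the handler independent of how the Polly request is signed and sent.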

Old: Might I highly recommend letting a library do this for you: https://github.com/ejbeaty/ChattyKathy

At the very least, peruse it for ideas on how to implement this yourself.

John Jones
  • I am implementing it without a library because I want to sign the request on the server side and actually make it on the client using http://docs.aws.amazon.com/general/latest/gr/signing_aws_api_requests.html. That is why I want to convert the above string into a format which can be played. – Simon Apr 10 '17 at 22:35
  • The "string" isn't a string. It's the raw bytes of the mp3 audio. You need to find a way to send it to the client, and how that would be accomplished depends on how the client is making the request to you -- the request that triggers you to call Polly. – Michael - sqlbot Apr 10 '17 at 23:19
  • The above code is on the client. The server has nothing to do with Polly or with sending the request. The request to Polly is made on the client and the data is received there. But how do I play it in an HTML audio tag? – Simon Apr 11 '17 at 08:18