
I am sorry to trouble you when you are busy. Any hints or guesses are welcome, so I would appreciate whatever you can tell me. This is my first question here. I am a Japanese student.

I want to adapt watson-voice-bot, an open-source sample for IBM Cloud, to Japanese. I have tried many things over a long time, but the bot only speaks English. I will keep trying, but I would be very happy if someone could give me some advice.

The code uses the APIs of three services: Speech to Text, Text to Speech, and Watson Assistant. Its main function is a chatbot that captures the user's voice on the website and has the AI respond by voice.

https://www.youtube.com/watch?v=umf5egQPPRI

The one thing I am sure of is that switching away from English involves setting the service URL for the other language in the source code.

So I set the environment variables accordingly, but the bot still speaks English. The natural-language training data does, of course, include Japanese.

I tried various things in recorder.js, welcome.py and other files, such as changing values that looked suspicious, but then the chatbot stops working altogether.

In recorder.js in particular, I thought that values such as recordEng would lead to a solution, but I cannot work out a clear fix because I do not know enough JavaScript.

Part of recorder.js

  recorder.setupDownload = function(blob){

    if($('#isRecording').prop('value')=='recordEng')
    {
      window.postEnglishAudio(blob);
      var url = (window.URL || window.webkitURL).createObjectURL(blob);
      var link = document.getElementById("saveEnglish");
      link.href = url;
      link.download = 'EnglishRecording.wav';
      link.target = '_blank';
    }
    else {
      window.postHindiAudio(blob);
      var url = (window.URL || window.webkitURL).createObjectURL(blob);
      var link = document.getElementById("saveHindi");
      link.href = url;
      link.download = 'HindiRecording.wav';
      link.target = '_blank';

    }
    document.getElementById("isRecording").value="none";
    // var url = (window.URL || window.webkitURL).createObjectURL(blob);
    // var link = document.getElementById("save");
    // link.href = url;
    // link.download = filename || 'output.wav';

  }

Part of welcome.py

@app.route('/api/text-to-speech', methods=['POST'])
def getSpeechFromText():
    tts_kwargs = {
            'username': textToSpeechUser,
            'password': textToSpeechPassword,
            'iam_apikey': textToSpeechIAMKey,
            'url': textToSpeechUrl
    }

    inputText = request.form.get('text')
    ttsService = TextToSpeechV1(**tts_kwargs)

    def generate():
        audioOut = ttsService.synthesize(
            inputText,
            'audio/wav',
            'ja-JP_EmiVoice').get_result()

        data = audioOut.content

        yield data

    return Response(response=generate(), mimetype="audio/x-wav")
  • All three of the services you are using support Japanese, so if you are getting English then you have probably created an English Assistant / Skill and set English as the language for the STT and TTS services. To fix this, create your Assistant Skill in Japanese. It looks like you are calling `postEnglishAudio` in your JavaScript; check that this function sets up the correct language for STT. Your TTS synthesize call looks OK, so I suspect you are hearing English in a Japanese accent. – chughts Jul 30 '19 at 11:05

1 Answer

Please show your STT code, as the error is most likely that you are not setting the language model correctly in the STT Recognize method.
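As a rough sketch of what setting the language model can look like: the helper below is hypothetical (the names `transcribe_japanese`, `stt_service` and `audio_bytes` are not from the project) and assumes `stt_service` is an already configured `SpeechToTextV1` instance from the Python SDK. The point is the `model` keyword; when it is omitted the service uses an English model by default.

```python
# Japanese broadband model name used by Watson Speech to Text.
JAPANESE_MODEL = 'ja-JP_BroadbandModel'


def transcribe_japanese(stt_service, audio_bytes):
    """Send WAV audio to STT and return the first transcript, or '' if none.

    stt_service is assumed to be a configured SpeechToTextV1 instance;
    this helper is illustrative and not part of watson-voice-bot.
    """
    result = stt_service.recognize(
        audio=audio_bytes,
        content_type='audio/wav',
        model=JAPANESE_MODEL,  # without this, recognition defaults to English
    ).get_result()

    results = result.get('results', [])
    if not results:
        return ''
    return results[0]['alternatives'][0]['transcript']
```

Passing everything by keyword keeps the call independent of the parameter order, which has changed between SDK versions.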

In the TTS Synthesize call that you do show, it is possible to see what is going wrong. You are passing positional arguments, but take a look at the method signature in the Python SDK - https://github.com/watson-developer-cloud/python-sdk/blob/master/ibm_watson/text_to_speech_v1.py

 def synthesize(self,
                text,
                voice=None,
                customization_id=None,
                accept=None,
                **kwargs):

The order is text, then voice, so your call passes 'audio/wav' as the voice and 'ja-JP_EmiVoice' as the customization_id. You should specify the arguments as keywords, as shown in the API documentation - https://cloud.ibm.com/apidocs/text-to-speech?code=python#synthesize-audio-get

so in your case:

 audioOut = ttsService.synthesize(
          inputText,
          accept='audio/wav',
          voice='ja-JP_EmiVoice').get_result()
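To see the difference without calling the service, here is a small stand-in. The `synthesize` function below is not the SDK method; it is a hypothetical stub that only copies the parameter order from the signature quoted above and returns what each parameter received.

```python
def synthesize(text, voice=None, customization_id=None, accept=None, **kwargs):
    """Stand-in with the same parameter order as the SDK signature quoted above.

    It does not call Watson; it only reports how the arguments were bound.
    """
    return {
        'text': text,
        'voice': voice,
        'customization_id': customization_id,
        'accept': accept,
    }


# Positional call, in the order used in the question:
# 'audio/wav' lands in voice, and the voice name lands in customization_id.
positional = synthesize('こんにちは', 'audio/wav', 'ja-JP_EmiVoice')

# Keyword call: each value reaches the intended parameter.
keyword = synthesize('こんにちは', accept='audio/wav', voice='ja-JP_EmiVoice')

print(positional)
print(keyword)
```

With keywords, the order in which you write the arguments no longer matters.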

For a full example of how to use the Python SDK for Watson take a look at the API Documentation - https://cloud.ibm.com/apidocs/text-to-speech?code=python#synthesize-audio

Your code should look something like -


from flask import Response, request
from ibm_watson import TextToSpeechV1

@app.route('/api/text-to-speech', methods=['POST'])
def getSpeechFromText():
    inputText = request.form.get('text')

    # Authenticate with the IAM API key and the service URL.
    ttsService = TextToSpeechV1(
        iam_apikey=textToSpeechIAMKey,
        url=textToSpeechUrl
    )

    def generate():
        # Keyword arguments make sure the voice and the audio format
        # reach the intended parameters.
        audioOut = ttsService.synthesize(
            inputText,
            voice='ja-JP_EmiVoice',
            accept='audio/wav').get_result()

        yield audioOut.content

    return Response(response=generate(), mimetype="audio/x-wav")


chughts
  • Does "TTS code" mean tts_kwargs? I have set four environment variables in the Cloud Foundry application runtime: TEXTTOSPEECH_IAM_APIKEY = apikey, TEXTTOSPEECH_URL = https://stream.watsonplatform.net/text-to-speech/api/v1/voices/ja-JP_EmiVoice, SPEECHTOTEXT_IAM_APIKEY = apikey, SPEECHTOTEXT_URL = https://stream.watsonplatform.net/speech-to-text/api/v1/models/ja-JP_BroadbandModel. That's all there is to it. Thank you very much for your great advice. – what meaning Jul 31 '19 at 19:57