I downloaded the "IVR for beginners" tutorial and modified it slightly to provide voice responses in Russian (the Java file's encoding is UTF-8):

@Override
protected void doPost(HttpServletRequest servletRequest, HttpServletResponse servletResponse)
        throws IOException {
    VoiceResponse response = new VoiceResponse.Builder()
            .gather(new Gather.Builder()
                    .action("/menu/show")
                    .numDigits(1)
                    .build())
            .say(new Say.Builder("Привет")
                  .voice(Say.Voice.ALICE)
                  .language(Say.Language.RU_RU)
                  .build())
            .build();

    servletResponse.setContentType("text/xml");
    try {
        servletResponse.getWriter().write(response.toXml());
    } catch (TwiMLException e) {
        throw new RuntimeException(e);
    }
}

However, when I call my number, I hear silence. The console's call log shows question marks instead of Cyrillic characters.

I would appreciate help in solving this problem.

Nikolay Mamaev
  • Please "try" setting the charset (UTF-8) explicitly on the response, and maybe prefer "application/xml" to "text/xml" (which formerly implied ASCII, [see](http://www.grauw.nl/blog/entry/489)). And if you have a logger (and trust its UTF-8 capabilities :), please log the output of `response.toXml()`. – xerx593 Jan 20 '19 at 02:32
  • Thanks @xerx593 for the feedback. Tried application/xml, same result. `Logger.getLogger(Logger.GLOBAL_LOGGER_NAME).log(Level.WARNING, response.toXml())` prints the expected Russian string (Mac OS Terminal). But there's one thing I don't understand: I'm printing the byte values using the printBytes() method from [Byte Encodings and Strings](https://docs.oracle.com/javase/tutorial/i18n/text/string.html) and get the following: `0xd0 0x9f 0xd1 0x80 0xd0 0xb8 0xd0 0xb2 0xd0 0xb5 0xd1 0x82` (I expected `0x04 0x1F 0x04 0x40 0x04 0x38 0x04 0x32 0x04 0x35 0x04 0x42` for "Привет"). – Nikolay Mamaev Jan 20 '19 at 05:25
  • The bytes are OK! (What you expected would be UTF-16, see https://www.fileformat.info/info/unicode/char/041f/index.htm.) But did you call `servletResponse.setCharacterEncoding("UTF8");` (or an equivalent, see https://stackoverflow.com/a/1849080/592355)? – xerx593 Jan 20 '19 at 08:00
  • P.S. Reverted the response's content type back to `text/xml` and it still works fine, i.e. the problem was the response's encoding. – Nikolay Mamaev Jan 20 '19 at 08:35
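
To make the byte discussion in the comments concrete, here is a minimal sketch that prints the same string encoded as UTF-8 and as UTF-16BE. The class name `EncodingBytes` and the helper `hex` are hypothetical, made up for this illustration; the string is written with Unicode escapes that spell "Привет" so the result does not depend on the compiler's source encoding.

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes {
    // Hypothetical helper: formats each byte as 0xNN, separated by spaces.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+041F U+0440 U+0438 U+0432 U+0435 U+0442, the six letters of the greeting.
        String text = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // UTF-8 uses two bytes per Cyrillic letter (0xd0 0x9f for U+041F, and so on).
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_8)));

        // UTF-16BE stores the code point values directly (0x04 0x1f for U+041F).
        System.out.println(hex(text.getBytes(StandardCharsets.UTF_16BE)));
    }
}
```

The first line of output should match the bytes reported in the comment above, and the second the values that were expected, which suggests the string itself was intact and only the encoding differed.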

1 Answer

It appears you "just" had to call:

servletResponse.setCharacterEncoding("UTF-8");

..or:

servletResponse.setContentType("text/xml; charset=UTF-8");

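...since the servlet container otherwise falls back to ISO-8859-1 for the response writer, and ISO-8859-1 cannot represent Cyrillic characters, so they are written out as question marks. Either call has to happen before `getWriter()` is called. (I am not deep into TwiML or IVR, but this seems to be what breaks or fixes Cyrillic characters at a basic level.)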
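
The effect can be reproduced without a servlet container. The sketch below is only an approximation under stated assumptions: it uses a plain `OutputStreamWriter` from `java.io` to stand in for the response writer, and the class name `WriterCharsetDemo` and the helper `write` are hypothetical. The string is again "Привет" spelled with Unicode escapes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterCharsetDemo {
    // Hypothetical helper: writes text through a Writer bound to the given
    // charset and returns the raw bytes that would go over the wire.
    static byte[] write(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u041f\u0440\u0438\u0432\u0435\u0442";

        // ISO-8859-1 has no Cyrillic letters, so the encoder substitutes '?' for each one.
        byte[] latin1 = write(text, StandardCharsets.ISO_8859_1);
        System.out.println(new String(latin1, StandardCharsets.ISO_8859_1));

        // UTF-8 can encode every letter, so decoding the bytes restores the text.
        byte[] utf8 = write(text, StandardCharsets.UTF_8);
        System.out.println(new String(utf8, StandardCharsets.UTF_8).equals(text));
    }
}
```

The ISO-8859-1 case yields six `?` bytes, which is consistent with the question marks seen in the call log, while the UTF-8 case round-trips cleanly.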

xerx593