I'm trying to encode a string as UTF-16 Big Endian and then as Base64 in order to send it through an SMS API.

I haven't been able to do this in JavaScript.

This is the original JS string I want to send in the SMS:

const originalString = 'Teste 5% áàÁÀ éèÉÈ íìÍÌ óòÓÒ úùÚÙ çÇ ãà ?!,;';

Using btoa(originalString) I get VGVzdGUgNSUyNSDh4MHAIOnoycgg7ezNzCDz8tPSIPr52tkg58cg48MgPyEsOw==, which is not what I need. I used an online converter for that purpose, and this is the correct value:

AFQAZQBzAHQAZQAgADUAJQAgAOEA4ADBAMAAIADpAOgAyQDIACAA7QDsAM0AzAAgAPMA8gDTANIAIAD6APkA2gDZACAA5wDHACAA4wDDACAAPwAhACwAOw==

I tested sending an SMS with it and it works fine.

Eunito

2 Answers

To get the UTF-16 version of the string, we need to map each of its characters to its charCodeAt(0) value, which is its 16-bit UTF-16 code unit.
From there, we can build a Uint16Array which, on a little-endian platform (the common case), holds the same bytes as a UTF-16LE text file.
We then just need to swap the two bytes of every item in that Uint16Array to get the UTF-16BE version.

Then it's just a matter of encoding that to base64.

const originalString = 'Teste 5% áàÁÀ éèÉÈ íìÍÌ óòÓÒ úùÚÙ çÇ ãà ?!,;';
const expectedString = "AFQAZQBzAHQAZQAgADUAJQAgAOEA4ADBAMAAIADpAOgAyQDIACAA7QDsAM0AzAAgAPMA8gDTANIAIAD6APkA2gDZACAA5wDHACAA4wDDACAAPwAhACwAOw==";

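// split('') yields one UTF-16 code unit per element; charCodeAt(0) returns its 16-bit value.
const codeUnits = originalString.split('').map( char => char.charCodeAt(0) );
// Swap the two bytes of each code unit; the mask keeps the result within 16 bits.
const swapped = codeUnits.map( val => ((val>>8) | (val<<8)) & 0xFFFF );
// On a little-endian platform, this buffer now holds the UTF-16BE bytes.
const arr_BE = new Uint16Array( swapped );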

// ArrayBuffer to base64 borrowed from https://stackoverflow.com/a/42334410/3702797
const result = btoa(
    new Uint8Array(arr_BE.buffer)
      .reduce((data, byte) => data + String.fromCharCode(byte), '')
  );
console.log( 'same strings:', result === expectedString );
console.log( result );
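
The snippet above relies on the platform being little-endian, because typed arrays store their items in the platform's byte order. As a minimal sketch of a variant that does not depend on this, the bytes can be written through a DataView with an explicit big-endian flag. The helper name toBase64Utf16BE is hypothetical and not part of the original answer; btoa is assumed to be available as a global (browsers, or a recent Node).

```javascript
// Hypothetical helper: encode a string as UTF-16BE bytes, then as base64.
function toBase64Utf16BE(str) {
  // Two bytes per UTF-16 code unit.
  const view = new DataView(new ArrayBuffer(str.length * 2));
  for (let i = 0; i < str.length; i++) {
    // Third argument `false` requests big-endian, whatever the platform byte order is.
    view.setUint16(i * 2, str.charCodeAt(i), false);
  }
  // Build a binary string (one char per byte) so that btoa can encode it.
  let binary = '';
  for (const byte of new Uint8Array(view.buffer)) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

console.log(toBase64Utf16BE('Te')); // "AFQAZQ=="
```

Called with the originalString above, this should yield the same value as expectedString.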
Kaiido

This isn't easy, as the UTF-16BE encoding has little to no built-in support in JavaScript.

The challenge is converting the string into a buffer of bytes; once you have it in a buffer, converting it to base64 is easy. One way to do this is to use a library that adds support for UTF-16BE, such as iconv-lite.

Here is an example you can run in Node:

const iconv = require('iconv-lite');
const originalString = 'Teste 5% áàÁÀ éèÉÈ íìÍÌ óòÓÒ úùÚÙ çÇ ãà ?!,;';
const buffer = iconv.encode(originalString, 'utf16be');
console.log(buffer.toString('base64'));
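
If adding a dependency is not desirable, a rough alternative in Node is to encode the string as UTF-16LE with the built-in Buffer and then swap each pair of bytes with swap16(). This is only a sketch under the assumption that the code runs in Node (Buffer does not exist in browsers); the function name utf16beBase64 is made up for illustration.

```javascript
// Hypothetical helper, Node only: UTF-16LE bytes -> swap to UTF-16BE -> base64.
function utf16beBase64(str) {
  // 'utf16le' is a built-in Buffer encoding; swap16() swaps every pair of bytes in place.
  return Buffer.from(str, 'utf16le').swap16().toString('base64');
}

console.log(utf16beBase64('Te')); // "AFQAZQ=="
```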

You can see a demo of it here: https://repl.it/@RobBrander/SelfishForkedAlphatest

Also, here is a great explanation of base64 encoding of UTF16BE: https://crawshaw.io/blog/utf7

Rob Brander