btoa expects an input string representing binary data, which it then base64-encodes. However, if the input contains multibyte characters, it throws an error, because btoa only directly supports input characters within the Latin-1 range of Unicode (code points U+0000 through U+00FF).
const ok = "a";
console.log(ok.codePointAt(0).toString(16)); // 61: fits in one byte (within Latin-1)
const notOK = "✓";
console.log(notOK.codePointAt(0).toString(16)); // 2713: needs more than one byte (outside Latin-1)
console.log(btoa(ok)); // YQ==
console.log(btoa(notOK)); // throws InvalidCharacterError
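If you just need a working encoder, one common workaround is to convert the string to UTF-8 bytes first, so that every value handed to btoa fits in a single byte. Here is a minimal sketch; the helper name utf8ToBase64 is my own, not a built-in:

// Sketch: base64-encode arbitrary Unicode by going through UTF-8 bytes.
function utf8ToBase64(str) {
  const bytes = new TextEncoder().encode(str); // UTF-8 bytes as a Uint8Array
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte); // each character is now <= 0xFF
  }
  return btoa(binary);
}
console.log(utf8ToBase64("✓")); // 4pyT

This works because the intermediate string contains only characters in the Latin-1 range, which is exactly what btoa accepts.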
But why does btoa throw in the first place? Why couldn't it simply treat the input string as a sequence of bytes and encode each byte one by one, ignoring what the bytes mean?
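A quick experiment hints at the answer: a JavaScript string is not a sequence of bytes at all, but a sequence of 16-bit UTF-16 code units, so there is no single "byte" for btoa to read off a character like ✓. A small sketch:

// JS strings expose 16-bit code units, not bytes.
const check = "✓";
console.log(check.length); // 1: one code unit...
console.log(check.charCodeAt(0)); // 10003 (0x2713): ...which doesn't fit in a byte
const emoji = "😀"; // outside the Basic Multilingual Plane
console.log(emoji.length); // 2: two code units (a surrogate pair)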