
When JavaScript encodes one character as 2 or more bytes, why does the following:

Buffer.from("abc").byteLength

output 3? Shouldn't it be 6?

Amanda
  • I think the default encoding is UTF-8; if so, it takes a minimum of 8 bits (1 byte) per character. A link that might help you more: https://stackoverflow.com/questions/6338944/if-utf-8-is-an-8-bit-encoding-why-does-it-need-1-4-bytes "those magic U+ numbers, in memory using 8 bit bytes. In UTF-8, every code point from 0-127 is stored in a single byte. Only code points 128 and above are stored using 2, 3, in fact, up to 6 bytes." – Andam Dec 26 '19 at 07:47
  • If you mean the from() method of Array and the byteLength property of ArrayBuffer, then I think the explanation is that 3 entries are created from a 3-character string, since its length is 3 (3 code units). If the typed array is a Uint8Array, then 3 bytes are allocated for "abc". If instead you use a Uint16Array, then 6 bytes are allocated. Try out alert(new Uint8Array(Array.from("sss")).byteLength) vs alert(new Uint16Array(Array.from("sss")).byteLength) (see the sketch after these comments). – Jose_X Apr 02 '21 at 13:37
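A small Node.js sketch of roughly what Jose_X's comparison shows (the characters are mapped to their code units here so the typed arrays hold numbers; this detail is an assumption, not part of the original comment):

// Array.from("abc") yields one entry per UTF-16 code unit: "a", "b", "c".
// Mapping with charCodeAt gives the numeric codes [97, 98, 99].
const codes = Array.from("abc", c => c.charCodeAt(0));

console.log(new Uint8Array(codes).byteLength);  // 3 (1 byte per entry)
console.log(new Uint16Array(codes).byteLength); // 6 (2 bytes per entry)

In other words, byteLength here reflects the number of entries times the bytes per entry of the typed array, not how the string would be encoded.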

1 Answer


The default encoding of Node.js's Buffer class is 'utf-8'. ASCII characters take 1 byte each in UTF-8 encoding, as you can see here, so the result is 3. Note: UTF-8 encoding can take 1 to 4 bytes per character.
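A quick sketch illustrating this in Node.js; the non-ASCII characters and the explicit 'utf16le' encoding are just added examples, not part of the original answer:

// Each ASCII character fits in 1 byte of UTF-8, so "abc" occupies 3 bytes.
console.log(Buffer.from("abc").byteLength);   // 3

// Characters above U+007F need more bytes in UTF-8:
console.log(Buffer.from("é").byteLength);     // 2 (U+00E9, 2-byte sequence)
console.log(Buffer.from("€").byteLength);     // 3 (U+20AC, 3-byte sequence)
console.log(Buffer.from("😀").byteLength);    // 4 (U+1F600, 4-byte sequence)

// With an explicit UTF-16 encoding, "abc" does take 6 bytes:
console.log(Buffer.from("abc", "utf16le").byteLength); // 6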

TopW3