
I need to add compression to my project and decided to use the LZJB algorithm, which is fast and has a small implementation. I found this library: https://github.com/copy/jslzjb-k

But the API is not very nice, because to decompress the file you need the length of the input buffer (a Uint8Array cannot grow, so the output array has to be allocated up front). So I want to save the length of the input buffer in the first few bytes of the Uint8Array, so that I can extract that value and create the output Uint8Array based on it.

I want the function that converts an integer into a Uint8Array to be generic, perhaps by saving the number of bytes in the first byte, so you know how much data to extract to read the integer. I guess I need to extract those bytes and use some bit shifting to get the original number back, but I'm not exactly sure how to do this.

So how can I write a generic function that converts an integer into a Uint8Array that can be embedded into a bigger array, and then extract that number again?

jcubic

3 Answers


General answer

These functions allow any non-negative integer to be encoded into, and decoded from, any part of a Uint8Array. They use BigInt internally but also accept Number arguments. It is somewhat overkill, but I wanted to learn how to work with arbitrary-sized integers in JS.

// n can be a bigint or a number (a non-negative integer)
// bs is an optional Uint8Array of sufficient size
//   if unspecified, a large-enough Uint8Array will be allocated
// start (optional) is the offset
//   where the length-prefixed number will be written
// returns the resulting Uint8Array
function writePrefixedNum(n, bs, start) {
  start = start || 0;
  let len = start + 2; // start offset, length byte, and 1 byte min
  // one extra byte for every power of 256 that fits into n
  // (note <=, so that exact powers such as 256 get their extra byte)
  for (let i = 0x100n; i <= n; i <<= 8n, len++) /* increment length */;
  if (bs === undefined) {
    bs = new Uint8Array(len);
  } else if (bs.length < len) {
    throw new RangeError(`byte array too small; ${bs.length} < ${len}`);
  }
  let r = BigInt(n);
  for (let pos = start + 1; pos < len; pos++) {
    bs[pos] = Number(r & 0xffn); // least significant byte first
    r >>= 8n;
  }
  bs[start] = len - start - 1; // write byte-count to start byte
  return bs;
}

// bs must be a Uint8Array from where the number will be read
// start (optional, defaults to 0)
//    is where the length-prefixed number can be found
// returns a bigint, which can be coerced to int using Number()
function readPrefixedNum(bs, start) {
  start = start || 0;
  let size = bs[start]; // read byte-count from start byte
  let n = 0n;
  // the last byte read is bs[start + size], so that index must exist
  if (bs.length < start + size + 1) {
    throw new RangeError(`byte array too small; ${bs.length} < ${start + size + 1}`);
  }
  // bytes are stored least significant first, so read them back to front
  for (let pos = start + size; pos >= start + 1; pos--) {
    n <<= 8n;
    n |= BigInt(bs[pos]);
  }
  return n;
}

function test(n) {
  const offset = 2;
  let bs = writePrefixedNum(n, undefined, offset);
  console.log(bs);
  let result = readPrefixedNum(bs, offset);
  console.log(n, result, "correct?", n == result);
}

test(0)
test(0x1020304050607080n)
test(0x0807060504030201n)

Simple 4-byte answer

This answer encodes 4-byte integers to and from Uint8Arrays, most significant byte first (big-endian).

function intToArray(i) {
    return Uint8Array.of(
      (i&0xff000000)>>24,
      (i&0x00ff0000)>>16,
      (i&0x0000ff00)>> 8,
      (i&0x000000ff)>> 0);
}

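function arrayToInt(bs, start) {
    start = start || 0;
    const bytes = bs.subarray(start, start+4);
    let n = 0;
    for (const byte of bytes.values()) {
        n = (n<<8)|byte;
    }
    // bitwise operators yield a signed 32-bit value;
    // use `return n >>> 0` instead if an unsigned result is needed
    return n;
}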

for (let v of [123, 123<<8, 123<<16, 123<<24]) {
  let a = intToArray(v);
  let r = arrayToInt(a, 0);
  console.log(v, a, r);
}
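
For unsigned values, the built-in DataView is an alternative to manual shifting. The following is only a sketch, and the helper names `uint32ToBytes` and `bytesToUint32` are hypothetical. It assumes values in the range 0 to 2^32 - 1 and keeps the big-endian byte order used above.

```javascript
// Hypothetical helpers built on the standard DataView API.
function uint32ToBytes(value) {
  const bytes = new Uint8Array(4);
  // setUint32 writes big-endian by default; pass `true` as a third
  // argument for little-endian output.
  new DataView(bytes.buffer).setUint32(0, value);
  return bytes;
}

function bytesToUint32(bytes, start = 0) {
  // Respect byteOffset so this also works on a subarray of a larger buffer.
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return view.getUint32(start); // unsigned, so never negative
}

console.log(bytesToUint32(new Uint8Array([255, 255, 255, 255]))); // 4294967295
console.log(bytesToUint32(new Uint8Array([0, 0, 127, 255, 255, 255]), 2)); // 2147483647
```

Because `getUint32` returns an unsigned value, the all-ones input reads back as 4294967295 rather than -1.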
tucuxi
  • Why does `arrayToInt(new Uint8Array([255, 255, 255, 255]))` return `-1` and not the max value? – jcubic Oct 26 '21 at 12:49
  • @jcubic That's because of how negative values are represented in binary. Read about *two's complement*. `arrayToInt(new Uint8Array([127, 255, 255, 255]))` is the maximum 32-bit signed value. – FZs Oct 26 '21 at 13:52
  • `arrayToInt(new Uint8Array([127, 255, 255, 255]))` returns `0`. – jcubic Oct 26 '21 at 18:53
  • There are 2 arguments to arrayToInt (because you said that you wanted to extract numbers from larger byte arrays). If you specify the second argument (or use the new version that defaults it to 0), it works, because, aside from byte order, it does exactly the same as your function. – tucuxi Oct 27 '21 at 04:59
  • ok, sorry didn't notice the argument because when I tested your solution the first time I used `function arrayToInt(bs, start = 0) {` that's why I've commented that it returned -1 and you commented with the same code and didn't correct it with the second argument. – jcubic Oct 27 '21 at 07:43
  • In fact, I think I've removed the `bs.subarray` when tested the first time. – jcubic Oct 27 '21 at 07:55
  • FYI, if you're converting because you send binary data to a C# backend, the resulting `Uint8Array` needs to have `.reverse()` called on it otherwise C#'s `BitConverter.ToInt32` will not give you the correct output. – Reahreic May 30 '23 at 14:07

Here are working functions (based on Converting javascript Integer to byte array and back)


function numberToBytes(number) {
    // count the bytes needed (at least 1, so that 0 becomes [0]);
    // for a fixed size, use a constant 8 or 4 instead
    let len = 1;
    for (let rest = Math.floor(number / 256); rest > 0; rest = Math.floor(rest / 256)) {
        len++;
    }
    const byteArray = new Uint8Array(len);

    // least significant byte first
    for (let index = 0; index < byteArray.length; index++) {
        const byte = number & 0xff;
        byteArray[index] = byte;
        number = (number - byte) / 256;
    }

    return byteArray;
}

function bytesToNumber(byteArray) {
    let result = 0;
    for (let i = byteArray.length - 1; i >= 0; i--) {
        result = (result * 256) + byteArray[i];
    }

    return result;
}

Because len is computed from the number, the array has only the bytes needed. If you want a fixed size, you can use a constant 8 or 4 instead. In my case, I just saved the length of the bytes in the first byte.
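
The step of saving the byte count in the first byte is not shown in this answer. One way it might look is sketched below. The names `encodeLengthPrefixed` and `decodeLengthPrefixed` are hypothetical, and the code assumes non-negative safe integers (so the count is between 1 and 7) and a least-significant-byte-first layout.

```javascript
// Hypothetical helpers; a sketch for non-negative safe integers only.
function encodeLengthPrefixed(number) {
  if (!Number.isSafeInteger(number) || number < 0) {
    throw new RangeError("expected a non-negative safe integer");
  }
  const bytes = [];
  do {
    bytes.push(number % 256); // low byte first
    number = Math.floor(number / 256);
  } while (number > 0);
  // First byte holds the byte count, followed by the bytes themselves.
  return Uint8Array.of(bytes.length, ...bytes);
}

function decodeLengthPrefixed(array, start = 0) {
  const count = array[start];
  if (count === undefined || array.length < start + 1 + count) {
    throw new RangeError("truncated input");
  }
  let value = 0;
  for (let i = start + count; i > start; i--) {
    value = value * 256 + array[i];
  }
  // `next` is the index of the first byte after the encoded number.
  return { value, next: start + 1 + count };
}

// Prepend an original length to a (made-up) compressed payload.
const header = encodeLengthPrefixed(70000);
const payload = Uint8Array.of(9, 8, 7);
const packed = new Uint8Array(header.length + payload.length);
packed.set(header, 0);
packed.set(payload, header.length);

const { value, next } = decodeLengthPrefixed(packed);
console.log(value); // 70000
console.log(packed.subarray(next)); // the payload bytes 9, 8, 7
```

Returning `next` alongside the value lets the caller find where the compressed data starts without knowing the header size in advance.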

jcubic
  • You have hidden the extra complexity of having to store/read a 1st byte indicating the length of the resulting byte-array. Also note that the JS Number type cannot accurately represent numbers over ~ 2^53, so you will never need more than 7 bytes or less than 1. My answer is limited to the (common) case of numbers in the range 0-2^31, avoiding that complexity by assuming 4-byte integers. – tucuxi Oct 26 '21 at 14:38
  • @tucuxi I want this to be compression for files. If I use your approach, I will only be able to save files up to about 1.9GB; also, your code doesn't work for your example in the comment. – jcubic Oct 26 '21 at 18:53
  • @tucuxi I also prefer my solution because it actually works. – jcubic Oct 26 '21 at 18:55
  • please explain how your answer works beyond 2^53 - 1 (limit of the `number` argument that you are passing in). You were calling `arrayToInt` with only 1 argument. I have defaulted this argument to 0. It worked before (when passing in all arguments), now it also works even when you do not look at the code or read the examples. – tucuxi Oct 27 '21 at 05:02
  • @tucuxi I prefer my solution because it saves up to 7 bytes (so as not to lose precision), and if someone wants a constant number of bytes they can use a constant 8 or 4 for `len`. – jcubic Oct 27 '21 at 07:46
  • Added a general version that works for any byte count, at any offset, and actually writes and reads the prefixes too. – tucuxi Oct 27 '21 at 09:27
  • @tucuxi thanks. For my use case it would need to be heavily refactored to be usable, but it may be useful to future visitors. The general idea of how to implement this is there. – jcubic Oct 27 '21 at 09:47

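Posting this one-liner in case it is useful to anyone who is looking to work with numbers below 2^31 (the `>>` and `&` operators work on 32-bit signed integers, so larger inputs are encoded incorrectly). It encodes the number as an unsigned varint: 7 bits per byte, least significant group first, with the high bit set on every byte except the last. It strictly uses bitwise operations and needs nothing other than the input to be defined.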

export const encodeUvarint = (n: number): Uint8Array => n >= 0x80 
    ? Uint8Array.from([(n & 0x7f) | 0x80, ...encodeUvarint(n >> 7)]) 
    : Uint8Array.from([n & 0xff]);
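
A decoder is not shown above. The sketch below pairs one with an arithmetic-based encoder in plain JavaScript, so that values up to `Number.MAX_SAFE_INTEGER` round-trip. The names `encodeUvarintSafe` and `decodeUvarint` are hypothetical, and the decoder assumes well-formed input that encodes a safe integer.

```javascript
// Hypothetical helpers; division and modulo replace the 32-bit bitwise ops.
function encodeUvarintSafe(n) {
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new RangeError("expected a non-negative safe integer");
  }
  const out = [];
  while (n >= 0x80) {
    out.push((n % 0x80) | 0x80); // low 7 bits with the continuation bit set
    n = Math.floor(n / 0x80);
  }
  out.push(n); // final group, continuation bit clear
  return Uint8Array.from(out);
}

function decodeUvarint(bytes, start = 0) {
  let value = 0;
  let scale = 1; // 128^k for the k-th group
  for (let i = start; i < bytes.length; i++) {
    const byte = bytes[i];
    value += (byte & 0x7f) * scale;
    if ((byte & 0x80) === 0) {
      // `next` is the index of the first byte after the varint.
      return { value, next: i + 1 };
    }
    scale *= 0x80;
  }
  throw new RangeError("truncated varint");
}

const encoded = encodeUvarintSafe(2 ** 40);
console.log(decodeUvarint(encoded).value === 2 ** 40); // true
```

Since the continuation bit marks where the number ends, no separate length byte is needed when the varint is embedded in a larger array.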
Ahm23