6

How do you take a number representing a list of items, and divide it into chunks, where the number of chunks is a power-of-two, AND where each chunk also has a power-of-two number of items (where size goes up to a max power of two, so 1, 2, 4, 8, 16, 32, 32 being the max)? Is this even possible?

So for example, 8 items could be divided into 1 bucket (power of two bucket) with 8 items (power of two items):

[8]

9 items could be:

[8, 1]

That works because both numbers are powers of two, and the size of the array is 2 (also a power of two).

Let's try 11:

[8, 2, 1]

Nope that doesn't work. Because the size of the array is 3 which is not a power of two, even though it adds to 11. How about this?

[4, 4, 2, 1]

That works! It's 4 elements which is a power of two.

[2, 2, 2, 1, 1, 1, 1, 1]

That also works, since there are 8 buckets (8 is a power of two), with 1 or 2 items each (each a power of two). But [4, 4, 2, 1] is better because it's shorter.

I guess an even better one (after getting comments) would be this, though I didn't see it the first time around:

[8, 1, 1, 1]

That one is short, and also starts with the largest number.

So following this pattern, here are some other numbers:

13:

[8, 1, 1, 1, 1, 1] // no, 6 not power of 2
[8, 2, 1, 1, 1] // no, 5 not power of 2
[8, 2, 2, 1] // yes, 4 is power of 2
[8, 4, 1] // no, 3 not power of 2

14:

[8, 2, 2, 2]

15:

[8, 4, 2, 1]

16:

[16]

18:

[16, 2]

200:

[32, 32, 32, 32, 32, 32, 4, 4]

When the size of the first layer of buckets in the bucket tree grows to longer than 32, then it nests. So take the number 1234 for example. This can be represented by 38 32's followed by 16 followed by 4.

[32, 32, 32, 32, 32, 32, 32, 32,
 32, 32, 32, 32, 32, 32, 32, 32,
 32, 32, 32, 32, 32, 32, 32, 32,
 32, 32, 32, 32, 32, 32, 32, 32,
 32, 32, 32, 32, 32, 32, 16, 4]

But now the bucket size is 40 items long, which isn't a power of two AND it's greater than 32. So it should be nested! I can't quite visualize this in my head, so sorry if this isn't a perfect example, I think you can get the gist of what I mean though.

// the parent "x" is 2 in length
x = [a, b]
// child "a" is 32 in length
a = [32, 32, 32, 32, 32, 32, 32, 32,
     32, 32, 32, 32, 32, 32, 32, 32,
     32, 32, 32, 32, 32, 32, 32, 32,
     32, 32, 32, 32, 32, 32, 32, 32]
// child "b" is 8 in length
b = [32, 32, 32, 32, 32, 32, 16, 4]

Taken another layer up, say we have a very large number (I don't know what the minimum large number is) that requires another layer of nesting. What we can say about this layer is that, x will be 32 in length, but we will also have a y that is at least 1.

_x_ = [a, b, c, d, e, f, g, h,
       i, j, k, l, m, n, o, p,
       q, r, s, t, u, v, w, x,
       y, z, a2, b2, c2, d2, e2, f2]
_y_ = [a3]
a   = [32, 32, 32, ..., ?]
...
f2   = [32, ..., ?]

Then once we have _x_, _y_, _z_, ... 32 total of these, we build another layer on top.

What is an algorithm/equation that will take a number and divide it into this tree of buckets / item sizes that are all powers of two, up to a max (in this case, 32)?

A subgoal is to minimize the number of levels. There isn't a specific limit, but I am imagining no more than 1 million or very max 1 billion nodes in the entire runtime, so it seems like you'll only have 3 or 4 levels probably, I don't know exactly how to calculate it.

This is going to be used for array lookup. Essentially I am breaking apart a single, large, arbitrarily sized "contiguous" array into small contiguous chunks with size power-of-2 up to 32 in length. This balances lookup performance (i.e. fits within cpu cache), with memory fragmentation.

Update:

I think trying to incorporate this somehow to build up that nested tree I'm trying to describe will help. The last thing now missing is getting the nested arrays to be properly sized to power-of-two values...

The best I have been able to do so far is this:

console.log(spread(74))
console.log(spread(85))
console.log(spread(87))
console.log(spread(127))
console.log(spread(1279))
console.log(spread(12790))
console.log(spread(127901))

function spread(n) {
  if (n == 0) {
    return [0, 0, 0, 0, 0, 0]
  }
  let array = []
  let r = split(n)
  while (r[0] > 31) {
    array.push([32, 0, 0, 0, 0, 0])
    r[0] -= 32
  }
  array.push(r)
  let s = sum(r)
  if (!isPowerOf2(s)) {
    let x = pow2ceil(s)
    let i = 1
    while (i < 5) {
      if (r[i] > 1) {
        i++
        break
      }
      i++
    }
    if (i == 5) {
      i = 0
    }
    main:
    while (true) {
      while (r[i]) {
        r[i + 1] += 2
        r[i]--
        s += 1
        if (s == x) {
          break main
        }
      }
      i++
    }
  }

  if (array.length == 1) {
    return array[0]
  } else if (array.length < 33) {
    return array
  } else {
    return divide(array, 32)
  }
}

function sum(arr) {
  let i = 0
  arr.forEach(x => {
    i += x
  })
  return i
}

function split(n) {
  const r = [0, 0, 0, 0, 0, 0]
  let u = 32
  let i = 0
  while (u > 0) {
    while (n >= u) {
      r[i]++
      n -= u
    }
    i += 1
    u >>= 1
  }
  return r
}

function isPowerOf2(v) {
  return v && !(v & (v - 1))
}

function pow2floor(v) {
  var p = 1;
  while (v >>= 1) {
    p <<= 1;
  }
  return p;
}

function pow2ceil(v) {
  var p = 2
  while (v >>= 1) {
    p <<= 1
  }
  return p
}

function divide(data, size) {
  const result = []
  const upSize = data.length / size;

  for (let i = 0; i < data.length; i += size) {
    const chunk = data.slice(i, i + size);
    result.push(chunk)
  }

  if (result.length > size) {
    return divide(result, size)
  }

  return result;
}
גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
Lance
  • 75,200
  • 93
  • 289
  • 503
  • You lost me at "11 items would be 8,1,1"... And then again at each of the "... does not work". – Yunnosch Sep 18 '20 at 07:51
  • @Yunnosch - Read the comment after that, it's an anti-example. :-) – T.J. Crowder Sep 18 '20 at 07:52
  • Why not "11: 8,1,1,1" or "11: 8,2,1"? The rules here elude me. I propose to augment each example with detailed explanation. – Yunnosch Sep 18 '20 at 07:53
  • 1
    Sorry @Yunnosch I messed up, I fixed `11`. Okay I will provide better explanation. – Lance Sep 18 '20 at 07:54
  • Lance - It would probably help to say why something doesn't work, e.g. (the first one) *"**Nope** that doesn't work, it's three buckets and 3 isn't a power of 2."* Still, it's an interesting question, and one that hits me squarely in my insecurity around not having any significant university-level mathematics or computer science education. :-) – T.J. Crowder Sep 18 '20 at 07:55
  • 1
    @T.J.Crowder Seems it was more of a typo.... But I still do not get why the corrected example is an anti-example.... – Yunnosch Sep 18 '20 at 07:55
  • @Yunnosch - LOL, I read right past the minor detail where that added up to 10 to start with. [8, 2, 1] doesn't work because it's three buckets. But your "11: 8,1,1,1" works. – T.J. Crowder Sep 18 '20 at 07:57
  • Updated with better explanation, let me know if it still needs clarity. – Lance Sep 18 '20 at 07:58
  • 1
    I think I found the piece I missed "number of buckets power of two". My mistake. – Yunnosch Sep 18 '20 at 07:58
  • Without further restrictions or clarifications of the question, isn't there a trivial solution of nesting all children of count 2 and size 2 for even numbers; and for odd numbers, nesting all children of all count 2 and size 2 except for the last, where we have either one child of size 1, or one of size 2 and one of size 1? – גלעד ברקן Sep 18 '20 at 21:08
  • @גלעדברקן I am not sure what you mean, that sounds very advanced :) – Lance Sep 18 '20 at 22:18
  • 1 and 2 are powers of 2. Every number can be expressed as a sum of of 2's (plus one 1 if the number is odd). So if we allow "nesting," meaning a child of the tree can be a root of a subtree that's not restricted in size, a trivial solution just keeps adding children each of size 2 (except for one if the number is odd) until the number is fully expressed. – גלעד ברקן Sep 18 '20 at 22:34
  • @גלעדברקן mind showing an algorithm for that? Or is that what you already answered :) just catching up! – Lance Sep 19 '20 at 00:00
  • Should every path in the tree, from root to leaf, have the same length, or would you prefer to have some shorter paths? For example, if a tree is formed with an internal node that would be an array with just *one* element, we could replace that node with its child, making paths via that node one step shorter, while other paths would remain unaltered. – trincot Sep 19 '20 at 10:54
  • @trincot interesting, I am trying to imagine, not super advanced with this myself haha, I think whatever would be computationally easiest to construct and find by index with a pattern like https://stackoverflow.com/questions/63967051/how-to-get-the-number-of-arrays-generated-by-dividing-an-array-into-a-tree-with – Lance Sep 19 '20 at 11:04
  • I meant this https://stackoverflow.com/questions/63966270/how-to-find-the-path-to-a-node-in-a-tree-of-arrays-given-its-flattened-array-in – Lance Sep 19 '20 at 11:15
  • Also, given N = 33, which would be preferable [[1], [32]] or [[1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [1], [2]]? – גלעד ברקן Sep 19 '20 at 11:34
  • [[1], [32]] would be preferrable – Lance Sep 19 '20 at 11:38

3 Answers3

3

It's always possible.

Start with the normal binary representation.

You get a number of summands that are all powers of 2.

So the problem is the number of summands is most of the times not a power of two.

You can always get an extra summand by splitting a power of 2 in 2 summand (also powers of 2). Only exception is 1.

So the question is there a case where not enough summand > 1 exists?

Answer: No

Worst case is we have n summand where n is a (power of 2)-1. E.g. 3, 7,15, ... Is we have 3 summand the smallest possible case is 1+2+4. We need 4 summand, so we must create 1 extra summand by splitting one of the summands >1 into two. e.g 1+1+1+4.

For bigger values the highest summand is always >= ceeling(value/2) and has at most ceeling(sqrt(value))+1 summands in binary representation.

ceeling(value/2) grows much faster than sqrt(value).

So we have with increasing values always plenty of summands to split to reach the power of 2 summands goal.


Example: value= 63
Binary representation: 32+16+8+4+2+1 (6 summands)
Highest summand is 32 (at least value/2) (which can be split in any number of summands (all powers of 2) up to 32 summands.

We need at most ceeling(sqrt(63))+1 = 8 summands to reach a power of 2 summands.

So we need 2 extra summands for our 32+16+8+4+2+1

Take any summand >1 and split it in two summands (powers of 2) e.g 32 = 16+16
=> 16+16+16+8+4+2+1 (7 summands)
do it again (because we needed 2 extra summands). Take any summand >1 e.g. 4 and split ist 2+2=4
=> 16+16+16+8+2+2+2+1 (8 summands)

MrSmith42
  • 9,961
  • 6
  • 38
  • 49
  • 3
    This doesn't answer all of the question. Can you make this into an algorithm for constructing the tree? I understand you base points, but am struggling to convert this into a function. How do you get the binary representation such that it is capped at 32? etc. – Lance Sep 18 '20 at 11:14
  • @LancePollard if I understand your question correctly, those are just the bits. E.g. something like `int power = 1; while (n > 0) {if (n % 2 == 1) {numbers.add(power);} n /= 2; power *= 2;}` – maraca Sep 18 '20 at 13:23
2

Here is a possible algorithm:

Check the lowest 5 bits of the input number n and generate the corresponding powers of 2 in an array. So for instance for n = 13 we get [1, 4, 8]

Divide n by 32 ignoring the above-mentioned bits (floor).

Add to the above array as many values of 32 as n modulo 32. So for example for input = 77 we get [1, 4, 8, 32, 32]

Most of the times this array will have a length that does not exceed 32, but it could go up to 36: [1, 2, 4, 8, 16, 32, ..., 32]. If that happens, extract 16 values from the end of the array and store them in a "carry": this carry will be processed later. So not considering this potential carry, we ensure we end up with an array that is not longer than 32.

Then perform a split in halves of the greatest value in the array (thereby growing the array with one unit) until the array's length is a power of 2. For instance, for the case of 77 we'll have a few of such iterations in order to get [1, 4, 8, 8, 8, 16, 16, 16]

Divide n again by 32 ignoring the remainder (floor).

Consider again n modulo 32. If this is non-zero we have found summands of 32*32, so we create a new array [32, ..., 32] for each of those, and combine that with the previously established array into a new tree. So for instance for 1037, we could get

[
  [1,4,4,4],
  [32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32,32]
]

If there is room to add a potential carry (i.e. the top level array does not have a length of 32), then do so.

If the length of the array is not yet a power of 2, apply a similar algorithm as previously mentioned, although now a split in half concerns arrays instead of primitive values.

Repeat this division by 32 to identify even higher nested summands, so these are complete 32*32*32 trees, then in the next iteration, these are complete 324 trees, etc, until all of n is accounted for.

At the end, check if the carry is still there and could not yet be added somewhere: if this is the case add an extra level to the tree (at the top) to combine the achieved result with this carry, so they are siblings in an array of 2.

Implementation

Here is an interactive snippet: type a number and the resulting tree will be displayed in real time. Note that the nested tree is really created in this implementation and will consume quite some memory, so to keep the response times acceptable in JavaScript, I have limited the allowed input to numbers with 7 digits, but in theory the only limit is memory and floating point precision (which in this script is guaranteed up to 253).

// Utility functions
const sum = arr => arr.reduce((a, b) => a+b, 0);
const powersOf2andZero = [0,1,2,4,8,16,32];
const clone = tree => JSON.parse(JSON.stringify(tree));

function createTree(n) {
    let tree = [];
    let carry = null;
    // Isolate 5 least significant bits
    for (let pow of [1, 2, 4, 8, 16]) if (n & pow) tree.push(pow);
    n = Math.floor(n / 32);
    for (let i = n % 32; i > 0; i--) tree.push(32);
    // This array could have more than 32 values, up to 36.
    //   If that happens: remove 16 occurrences of 32, and consider this as carry-over for later treatment.
    if (tree.length > 32) carry = tree.splice(-16); // pop off 16 x 32s.
    // Make the array length a power of 2 by splitting the greatest value (repeatedly)
    let j = tree.length;
    while (!powersOf2andZero.includes(tree.length)) {
        if (j >= tree.length) j = tree.indexOf(tree[tree.length - 1]); // first occurrence of greatest
        // Split greatest value into 2; keep list sorted
        tree.splice(j, 1, tree[j] / 2, tree[j] / 2); // delete, and insert twice the half at same spot
        j += 2;
    }
    // Isolate and count factors of 32, 32², 32³, ...etc. 
    //   Add a superiour level in the tree for each power of 32 that is found:
    n = Math.floor(n / 32);
    let template = 32;
    while (n) {
        if (tree.length > 1) tree = [tree]; // nest
        if (n % 32 < 31 && carry !== null) { // we have room to dump the carry here
            tree.push(carry);
            carry = null;
        }
        template = Array(32).fill(template); // nest the template tree, "multiplying" by 32.
        for (let i = n % 32; i > 0; i--) tree.push(clone(template));
        if (tree.length === 1 && typeof tree[0] !== "number") tree = tree[0]; // Eliminate useless array wrapper
        // Turn this top level into a length that is a power of 2 by splitting the longest array (repeatedly)
        let j = tree.length;
        while (!powersOf2andZero.includes(tree.length)) {
            if (j >= tree.length) j = tree.findIndex(elem => elem.length === tree[tree.length - 1].length);
            // Split longest array into 2; keep list sorted
            let size = tree[j].length / 2;
            tree.splice(j, 1, tree[j].slice(0, size), tree[j].slice(size)); // delete, and insert twice the half
            j += 2;
        }
        n = Math.floor(n / 32);
    }
    // Is the carry still there? Then we cannot avoid adding a level for it
    if (carry) return [tree, carry];
    return tree;
}

// I/O handling
let input = document.querySelector("input");
let output = document.querySelector("pre");

(input.oninput = function () {
    let n = +input.value;
    if (isNaN(n) || n % 1 !== 0 || n < 1 || n > 9999999) {
        output.textContent = "Please enter an integer between 1 and 9999999";
    } else {
        let tree = createTree(n);
        output.textContent = pretty(tree);
    }
})();

function pretty(tree) {
    return JSON.stringify(tree, null, 2)
           .replace(/\[\s*\d+\s*(,\s*\d+\s*)*\]/g, m => m.replace(/\s+/g, ""))
           .replace(/\b\d\b/g, " $&");
}
pre { font-size: 8px }
n: <input type="number" value="927">
<pre></pre>
trincot
  • 317,000
  • 35
  • 244
  • 286
  • Is this method minimising the number of levels? – גלעד ברקן Sep 19 '20 at 21:07
  • I went through the algorithm and feel quite confident that it produces the minimum number of levels. – trincot Sep 19 '20 at 21:41
  • Cool, very nice! – גלעד ברקן Sep 19 '20 at 21:45
  • @trincot this is great, however would you mind explaining the reasoning behind some of your decisions? I haven't looked at this in a little while but not sure why the complicated decisions were made or how you figured this out. – Lance Dec 31 '20 at 04:12
  • Wow, you come back 3 months later? I have to read again what this is about... Can you be specific at what is unclear to you? I see I wrote quite a bit on how the algorithm works already.... – trincot Dec 31 '20 at 15:00
  • It's unclear to me why you're doing each step. It states _what_ is being done, i.e. "check the lowest 5 bits and generate...", but not _why_. I don't understand the why's in all these algorithm actions haha. I am using this for a memory system but I struggle on this last piece, the array tree functionality :) – Lance Dec 31 '20 at 15:54
  • The 5 bits correspond to the number modulo 32 (32 = 0b100000). The reason why is that your question states that 32 is the maximum array size. Let me know if there is another step that needs clarification. – trincot Dec 31 '20 at 15:56
  • Why do we need the carry-over functionality? "This array could have more than 32 values", but why? "Make the array length a power of 2", why is that? – Lance Dec 31 '20 at 15:58
  • So the first steps of the algorithm could lead in very few cases to an array that has more than 32 values, which would violate the rule that it cannot be longer than 32. I gave an example of such overflow in my answer. And so we extract 16 values from it, so we are certain we are not crossing the 32 limit. The part that we set aside is the "carry". It is kept aside until we find a good spot to insert it back in the data structure that is being built. "Make the array length a power of 2": well that is what your question stated as a rule, right? So we must do that... The "why" is you :) – trincot Dec 31 '20 at 16:19
0

(Note, the following answers the restriction on part size and the restriction on the number of parts being a power of 2. I missed the part about the number of parts also being restricted, indicating nesting. I'll try to get to that next.)

A simple proof that's also a method is that our minimal number of parts, MIN, is M = floor(N / max_power_of_2) plus the number of set bits in the binary representation of N - M*max_power_of_2; and the maximal number of parts, MAX, is N, where each part is 1.

Each time we divide one of the powers of 2, P, in the power-of-two representation of M (which starts as some count of max_power_of_2) or N-M*max_power_of_2, we have one count less of P, and two more of P/2, another power of 2. This action adds just one part to the partition, meaning we can represent any number of parts between MIN and MAX.

Greedy JavaScript code:

function f(n, maxP){
  const maxPowerOf2 = 1 << maxP;
  const m = ~~(n / maxPowerOf2);
  const A = new Array(maxP + 1).fill(0);
  A[maxP] = m;
  
  let nm = n - m * maxPowerOf2;
  let p = 0;
  let bitCount = 0;
  
  while (nm){
    if (nm & 1){
      bitCount += 1;
      A[p] = 1;
    }
    nm >>= 1;
    p += 1;
  }
  
  const min = m + bitCount;
  
  let target = 1;
  
  while (target < min)
    target *= 2;
    
  if (target > n)
    return -1;
  if (target == min)
    return A.map((c, p) => [1 << Number(p), c]);
  if (target == n)
    return [n];
    
  // Target is between MIN and MAX
  target = target - min;

  let i = m ? maxP : p;

  while (target && i > 0){
    if (!A[i]){
      i -= 1;
      continue;
    }

    const max = Math.min(target, A[i]);
    
    A[i] -= max;
    A[i-1] += 2*max;
    target -= max;
    i -= 1;
  }
  
  return target ? -1 : A.map((c, p) => [1 << Number(p), c]);
}

var ns = [74, 85, 87, 127, 1279, 12790, 127901, 63];
var maxP = 5;

for (let n of ns){
  let result = f(n, maxP);
  let [_n, numParts] = result.reduce(([_n, numParts], [p, c]) => [_n + p * c, numParts + c], [0, 0]);

  console.log(n, maxP);
  console.log(JSON.stringify(result));
  console.log(JSON.stringify([_n, numParts]));
  console.log('');
}
גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
  • I am not as sophisticated as this answer suggests, I am not quite sure what I'm looking at yet. Mind trying to explain it in perhaps more concrete terms, something less purely mathematical? This seems promising but it will take me a while to digest properly. – Lance Sep 19 '20 at 00:05
  • Also I notice this at first glance: `[[1,0],[2,1],[4,1],[8,0],[16,221],[32,289]]`. There can't be any numbers greater than 32, hence the need for the "tree" to be built up through nesting, rather than just a flat array like this. – Lance Sep 19 '20 at 00:07
  • @LancePollard I'll be happy to build the nesting! But first we need more clear restrictions for the nesting. If we are allowed to nest arbitrarily, as long as the count of parts in each level is a power of 2 that's less than or equal to some max, why can't we just trivially have 2 items of size 2 in every level until we are done? (And add one item of size 1 at the very last level if the target number is odd.) – גלעד ברקן Sep 19 '20 at 00:39
  • @LancePollard for example, 7: `{ps: [2, 2], children: [{ps: [2, 1], children: []}]}` – גלעד ברקן Sep 19 '20 at 00:43
  • The reason not to have 2 items of size 2 in every level is because this is going to be used for array lookup. Essentially I am breaking apart a single, large, arbitrarily sized "contiguous" array into small contiguous chunks with size power-of-2 up to 32 in length. This balances lookup performance (i.e. fits within cpu cache), with memory fragmentation. If it was just 2x2 like you are saying, then it would take a lot more steps to lookup a value (I think?) than if each node consists of up to 32 items, or potentially even 64 or 128 nodes (I'm still debating). – Lance Sep 19 '20 at 01:36
  • @LancePollard is an additional goal then to minimise the number of levels? Or is there a specific limit on the number of levels (of nesting)? – גלעד ברקן Sep 19 '20 at 02:33
  • yes that would be a subgoal, to minimize the number of levels. There isn't a specific limit, but I am imagining no more than 1million or very max 1billion nodes in the entire runtime, so it seems like you'll only have 3 or 4 levels probably, I don't know exactly how to calculate it. – Lance Sep 19 '20 at 04:31
  • I don't really understand how your snippet shows the number of levels needed. I find it hard to interpret the output and turn it into a tree. – trincot Sep 19 '20 at 20:52