1

You can see here (Find all possible subset combos in an array?) , that one solution presented is

var sets = (function(input, size){
    var results = [], result, mask, i, total = Math.pow(2, input.length);
    for(mask = size; mask < total; mask++){
        result = [];
        i = input.length - 1; 
        do{
            if( (mask & (1 << i)) !== 0){
            result.push(input[i]);
        }
        }while(i--);
        if( result.length >= size){
        results.push(result);
        }
    }   

return results; 
})(['a','b','c','d','e','f'], 2);
console.log(sets);

Firstly I would like to know exactly what size, and mask are doing. I.e why do we need a second param for our function. Also I did soe research on bitwise operators but am still not sure what and why we are doing

mask & (1 << i)) !== 0

What is this implying?

Community
  • 1
  • 1
Muntasir Alam
  • 1,777
  • 2
  • 17
  • 26
  • Why haven't you asked this in the referred question, say as a comment to the answer? – VisioN Jul 14 '16 at 13:17
  • Its from 2013, and due to large text I will have to write to understand, a new question is much more appropriate. – Muntasir Alam Jul 14 '16 at 13:21
  • `mask & (1 << i)` checks if the i-the bit of mask (interpreted as a signed 32-bit integer) is set to 1. – le_m Jul 14 '16 at 13:44

1 Answers1

3

The size parameter is needed because the original question was to produce subsets of a minimal size (of 2). The parameter allows you to specify that minimal size. This is nice, because you can then use the same code to get all subsets with a minimal size of 3.

The variable mask takes all integer values that can be made with input.length bits. So, represented in binary, mask takes these values:

000000
000001
000010
000011
000100
... etc...
111100
111101
111110
111111

Every 1 bit signals that the corresponding element from the input array should be part of the subset. So the above values of mask represent the following subsets:

[]
['a']
['b']
['a','b']
['c']
... etc ...
['c','d','e','f']
['a','c','d','e','f']
['b','c','d','e','f']
['a','b','c','d','e','f']

It is a bit confusing because we write binary representations with the least-significant bit on the right, while arrays are shown with the 0th element on the left.

From those sub arrays, only those with minimal size elements are retained in the results. So in the case of size = 2, the first 3 are not retained: the first one that stays is ['a','b'].

For that reason the algorithm was a bit improved, and mask does not start from 0, because it already is sure not to find a large enough subset with mask smaller or equal to size. So that is why the for loop starts with that value. But it is not that important. It would also work starting with zero:

for (mask = 0; mask < total; mask++)

Then concerning the if: it tests whether a certain bit is set in the binary representation of mask: for that a single bit (1) is shifted to the left (so it becomes something like 10000 in binary representation) and then it is AND-ed (with &) with mask. For instance, if mask is at a certain moment 101001 and the shifted bit is 10000 then this & operation works like this:

mask: 101001
bit:   10000
------------
AND:  000000

Another example:

mask: 101001
bit:    1000
------------
AND:  001000

So, the if condition tests whether the bit at position i in mask (starting at the right of the binary representation) is set or not. If it is set, the value mask & (1 << i)) will be equal to (1 << i) (like for instance 1000), if it is not set this expression evaluates to 0. So by doing the !== 0 test you effectively test that the bit is set. Only in that case the corresponding element from the input array is added to the subset.

Example run

Let's say we call the function with these arguments:

(['a','b','c'], 2);

Then total = 8, and mask runs from 2 to 7. Put in binary terms, mask takes the following binary values:

010
011
100
101
110
111

Note that every bit ("column") in the above table corresponds to an element of the array, and its value (0 or 1) decides whether that element should be in the sub array.

Other values of mask are not interesting: smaller values (like binary 001 or 000) represent sub-arrays that have too few elements -- we need at least two. Higher values (like 1000, 1001, ...) have too many bits: the left-most bit does not correspond to any element in the input array, since we only have 3 elements in the array. So that means we have all the interesting values for mask in the above table.

The next thing that happens in the code, is finding the 1-bits in a particular value of mask. We use the variable i to denote the 0-based number of the bit (its position starting from the right of the binary representation). This i will start at 2 and decrease until 0 (so 2, 1, and 0: each pointing to one of the three bits in mask):

i = input.length - 1; 
do { ... } while (i--);

For a given bit-number i, the bit's value in mask is extracted with:

mask & (1 << i)

So let's do all of the above with mask = 2 (i.e. 010 in binary). These are the values that are calculated:

result = []
i = 2
(1 << i) == 4 // 100 in binary: a 1-bit shifted twice to the left
mask & (1 << i) == 0 // because mask does not have that bit set 

The next iteration for i:

i = 1
(1 << i) == 2 // 010 in binary: a 1-bit shifted once to the left
mask & (1 << i) == 2 // because mask has that bit set 
result = ['b'] // because 'b' is the value at index i.

The last iteration for i:

i = 0
(1 << i) == 1 // 001 in binary: a 1-bit not shifted at all
mask & (1 << i) == 0 // because mask does not have that bit set 

Now, the length of result is checked, but for this particular value of mask it is too small:

result.length == 1 // see above, so nothing is added to the *results* (with 's')

All has been done for mask == 2, so now the above is repeated for mask == 3 (binary 011):

result = []
i = 2
(1 << i) == 4 // 100 in binary: a 1-bit shifted twice to the left
mask & (1 << i) == 0 // because mask does not have that bit set 

The next iteration for i:

i = 1
(1 << i) == 2 // 010 in binary: a 1-bit shifted once to the left
mask & (1 << i) == 2 // because mask has that bit set 
result = ['b'] // because 'b' is the value at index i.

The last iteration for i (different result here):

i = 0
(1 << i) == 1 // 001 in binary: a 1-bit not shifted at all
mask & (1 << i) == 1 // because mask has that bit set as well (different!)
result = ['b','a'] // because 'a' is the value at index i.

Now, the length of result is checked again, and this time it is big enough:

result.length == 2 // see above
results = [['b','a']]

...etc.

Note that results (plural) is an array of arrays, so eventually it will grow to:

results = [['b','a'], ['c','a'], ['c','b'], ['c','b','a']]

The sub arrays have their elements in reversed order because the loop of i iterates backward. The sub arrays would be ordered normally if i would increment from 0 onwards (which is possible without any issue).

Limitations

Because bit operations like << and & require their arguments to be 32 bit integers, the code at hand will only work for input arrays that have at the most 32 elements.

trincot
  • 317,000
  • 35
  • 244
  • 286
  • Is there a reason total = 2^(input.length), im guessing this is due to binary as well – Muntasir Alam Jul 14 '16 at 14:34
  • Perhaps you could run me through a test case. Say we call (['a','b','c'],2). Now total = 2^(3). So total is 8. Now mask = 2, so we keep looping until mask is 8. i becomes 7... er now what after reading your answer im still a tad bit confused – Muntasir Alam Jul 14 '16 at 14:39
  • I extended my answer. I hope this is detailed enough to get the grips on it. – trincot Jul 14 '16 at 15:12
  • Thanks, could you expand on more on mask & (1 << i). Are these kind of like logic gates? I remember taking a class last semester like that. It says "a< – Muntasir Alam Jul 14 '16 at 17:44
  • `1<<5` is the number 1 (binary 000000000001) shifted 5 times to the left, adding zeroes to the right: binary 000000100000. As you see, 5 zeroes were added to the right. This is equal to *2^5*. With *i*: `1< – trincot Jul 14 '16 at 17:51
  • ahh, and the and sign is doing..? – Muntasir Alam Jul 14 '16 at 18:15
  • It performs a logical AND on each bit position. I explained that with two examples in my answer. The purpose in this algorithm is to extract a particular bit from *mask* and forget about all other bits. – trincot Jul 14 '16 at 18:43
  • What does it mean "because mask has that bit set " – Muntasir Alam Jul 14 '16 at 18:55
  • It means that in the binary representation of the *mask* value, the bit -- at the position given by *i* -- is 1. – trincot Jul 14 '16 at 19:01
  • I have given several examples in my answer. Try to solve this: what is the result of `01010 & 01000` (binary)? – trincot Jul 14 '16 at 19:03
  • & means multiply no? so 10 * 8 = 80 -> 1010000 – Muntasir Alam Jul 14 '16 at 19:08
  • It is not multiplication. As I wrote in my answer it is the **AND** operation. It is a [bit-wise operator](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Operators/Bitwise_Operators). – trincot Jul 14 '16 at 19:16
  • Oh we learned in school that an AND operation is like A times B. I just remember the truth table and how anything anded with 0 is 0.ok I'll check it out once I'm home – Muntasir Alam Jul 14 '16 at 20:32
  • You can take it as multiplication, but then you must perform the multiplication bit-by-bit: so for `0101 & 0100` it would be 0*0, 1*1, 0*0, 1*0, resulting in `0100`. – trincot Jul 14 '16 at 20:50
  • A programmer's mind gets moulded into this binary bit thinking after some time :) – trincot Jul 15 '16 at 12:48
  • The reason we are thinking of bits in this questions is because we started using &, and << right? – Muntasir Alam Jul 15 '16 at 12:49
  • Not really, it starts when you hear the question *"find subsets"*. That is a binary operation. Imagine you have a bag full of toys, and you are asked to create a subset of it in another (empty) bag: you will take each toy out the first bag and decide (yes or no: binary!) whether to put it in the other bag. Since these are binary choices, die-hard programmers immediately think of using binary representations and bit operations like `&` and `<<`. – trincot Jul 15 '16 at 13:15
  • Would you mind telling me what >-1 does, in the first answer here http://stackoverflow.com/questions/1789945/how-can-i-check-if-one-string-contains-another-substring – Muntasir Alam Jul 15 '16 at 17:15
  • That is something completely different. Check the manual on [indexOf](https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf): *"...returns the first index at which a given element can be found in the array, or -1 if it is not present."*. So `>-1` is a way to check if the element was found. – trincot Jul 15 '16 at 17:17
  • Why not just !=-1? – Muntasir Alam Jul 15 '16 at 17:18
  • Because it is one character less? But why bother about that? There is seldom only one way to do things. – trincot Jul 15 '16 at 17:24
  • I have a new question related to bit operators, perhaps you could help me out? http://stackoverflow.com/questions/38516811/matching-names-with-corresponding-32-bit-unsigned-integers/38516930?noredirect=1#comment64432019_38516930 – Muntasir Alam Jul 22 '16 at 15:33
  • Apologies they just started responding after I asked you – Muntasir Alam Jul 22 '16 at 17:54
  • I got the answer for that problem! Do you think my solution is super inefficient? Just wondering because it looks a bit ugly – Muntasir Alam Jul 23 '16 at 02:24
  • Sorry to bother you again, but I don't think I will get any more responses on this question I posted and was wondering if you could chip in your two cents http://stackoverflow.com/questions/38542069/find-separation-values-from-a-starting-node – Muntasir Alam Jul 23 '16 at 19:22
  • Do you have time later today to talk about this subset question? I tried coding it on my own today and I got stuck on one specific part, that I think would be better to talk about in chat – Muntasir Alam Jul 25 '16 at 19:27
  • Sure, remind me what was the subset question? Oh wait, you mean *this* question here. Ready for chat: https://chat.stackoverflow.com/rooms/118131/discussion-between-cresjoy-and-trincot – trincot Jul 25 '16 at 19:31
  • I likely wont be home for another 2 hours, Ill let you know – Muntasir Alam Jul 25 '16 at 19:42
  • Actually I'm not busy now I have a break ill be in chat – Muntasir Alam Jul 25 '16 at 19:52
  • Hi, do you have time now? – Muntasir Alam Jul 29 '16 at 15:34
  • Ahh Apologies, I just saw this now. I'll catch you later i guess! – Muntasir Alam Jul 29 '16 at 17:21
  • Hi trincot, I'm free for most of the day today, just let me know when. Thanks – Muntasir Alam Jul 30 '16 at 16:26
  • Hi, got time today? – Muntasir Alam Aug 02 '16 at 13:24
  • Hi Trincot, could you give me your thoughts on this question when you are free.http://stackoverflow.com/questions/38637882/finding-order-between-many-events. The given solution does not work for many of the test cases, and is a bit hard to understand – Muntasir Alam Aug 03 '16 at 12:27
  • Hi, did you miss my comment, or are you busy? Take your time ,but just wondering – Muntasir Alam Aug 04 '16 at 17:30
  • Hey trincot , do you have some free time later? I wanted to discuss this question http://stackoverflow.com/questions/24414700/amazon-water-collected-between-towers – Muntasir Alam Aug 26 '16 at 19:51
  • No worries, I have to go out and do some stuff now for probably three hours just message me whenever. :) – Muntasir Alam Aug 26 '16 at 20:03
  • If your not on then, Ill bug you when you wake up dont worry :) – Muntasir Alam Aug 26 '16 at 20:08
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/121978/discussion-between-muntasir-alam-and-trincot). – Muntasir Alam Aug 27 '16 at 00:00
  • Hi, I have really quick quick question when you have a minute or two :) – Muntasir Alam Sep 01 '16 at 02:40
  • Ok im in chat now – Muntasir Alam Sep 01 '16 at 12:15
  • Hey trincot, got some time? I wanted to ask about this question http://stackoverflow.com/questions/39360926/python-match-the-struct-of-a-list – Muntasir Alam Sep 07 '16 at 04:15
  • Seems the old chat is frozen, how do I make a new one? – Muntasir Alam Sep 08 '16 at 12:35
  • http://chat.stackoverflow.com/rooms/122904/muntasir, I made this chat not sure if you can join it though – Muntasir Alam Sep 08 '16 at 12:41
  • Hi, I'm back in chat – Muntasir Alam Sep 08 '16 at 15:34
  • Couldn't make it. I am available now for 2 hours or so. – trincot Sep 08 '16 at 17:46
  • Here again in chat – Muntasir Alam Sep 08 '16 at 21:22
  • Looks like our timezone is really messing things up haha. Do you have like a MSN or email-chat type thing your usually on? – Muntasir Alam Sep 09 '16 at 02:32
  • I do have some of these, but am not connecting to these during office hours. I am trying to keep an eye on StackOverflow though, even when at the office, but it is no guarantee I can reply quickly. I am in timezone GMT+1 (currently GMT+2 because of daylight saving). – trincot Sep 09 '16 at 07:59
  • Ok just woke up and am in chat – Muntasir Alam Sep 09 '16 at 12:03
  • Going to be home in an hour and on for 3 hrs ish till I fall asleep xd – Muntasir Alam Sep 10 '16 at 00:12
  • That was my night :-) I'm up again. – trincot Sep 10 '16 at 07:03
  • Hi there, got time today? – Muntasir Alam Sep 18 '16 at 14:59
  • Great, i'm in chat – Muntasir Alam Sep 18 '16 at 16:23
  • Sorry, I had unexpected visitors. Maybe now? – trincot Sep 18 '16 at 18:32
  • Hey wondered if you had some time, whenever you are awake? Had a question about this answer specifically http://stackoverflow.com/a/14944957/6757245 – Muntasir Alam Oct 16 '16 at 00:22
  • I have time in the coming hours. Let me know when you are awake :-) – trincot Oct 16 '16 at 10:49