2

I have two arrays: users and data. Both arrays contain a series of identifiers unique to a user, and both arrays are exactly the same:

var users = [1,2,3,4]
var data = [1,2,3,4]

I would like to randomly sort these values into pairs, but I would like to avoid matching equivalent values in each pair. For example, [1,2] would be acceptable, but [1,1] would not.

I have considered using a while loop containing logic to sort the arrays as follows:

var users = [1,2,3,4]
var data = [1,2,3,4]
var pairs = []
var l = users.length

while(pairs.length < l) {
    var a = Math.floor(Math.random() * users.length)
    var b = Math.floor(Math.random() * users.length)

    // I appreciate this bit wouldn't quite work, but hopefully you get the idea

    if(users[a] !== data[b]) {
        var u = users.splice[a, 1]
        var d = data.splice[b, 1]
        pairs[pairs.length] = [u, d]
    }
}

Most of the time this should work, but it's not completely infallible. If, at the end, the only two remaining values are the same, the loop will continue indefinitely.

Edit:

At the end, I would like to have a series of nested arrays, with each value in the nested array being unique. For example:

pairs =[[1, 4], [2, 3], [3, 2], [4, 1]]

Ideally I would like the order/combinations in these nested arrays to be random every time.

Tom Davies
  • 151
  • 1
  • 8
  • For starters, you wouldn't use a while loop – baao Apr 10 '20 at 10:05
  • Anyway, your code has some issues. Have you checked that they actually works? – yqlim Apr 10 '20 at 10:06
  • What is the exact desired output? Is 1,2 considered equal 2,1? – baao Apr 10 '20 at 10:06
  • No - this is just a hypothetical at the moment, because I know that it definitely wouldn't work – Tom Davies Apr 10 '20 at 10:07
  • @baao hopefully clarified it for you in the question. Thanks you! – Tom Davies Apr 10 '20 at 10:11
  • 1
    You could sort both, reverse one and then zip them. It has a higher guarantee of success and you can fudge the data more easily so it doesn't match. However, it's not going to be *random*. For a random matching algorithm, you might need to do a step-through each pair and then backtrack on a matching set, potentially backhracking multiple steps and taking new decisions. So, some form of a depth first search could find a solution. It's probably going to involve more work. If you know properties of your arrays, you might get a slightly better algorithm with a guaranteed success. – VLAZ Apr 10 '20 at 10:12
  • [This ME answer](https://math.stackexchange.com/a/2202120/26180) provides n algorithm to generate all permutations without fixed points. Order can be randomized by randomizing the selection from the respective candidate set. – collapsar Apr 10 '20 at 10:13
  • There is also the brute force way which is to generate every possible pairs and discard if the result contains a matching one. Depending on your datasets, this can also work. I've done something similar before and for small enough inputs you don't care about the enormous amount of generated results. If you do, you can generate them sequentially and discard in order to not keep all in memory. – VLAZ Apr 10 '20 at 10:14
  • @VLAZ I'm beginning to think that Brute force might be the way to go. Maybe a slightly faster variation might be to work out what the remaining values are when we're down to the final pair? Eg `if(users.length === 1 && users[0] === data[0]) { put everything back to the start again } – Tom Davies Apr 10 '20 at 11:12

5 Answers5

1

I believe the following works and is reliable.

function make_pairs() {

    //set arrays - they can be the same, or different, and order is unimportant
    let a = [1,2,3,4],
        b = [1,2,3,4];

    //shuffle the b array
    b.sort(() => Math.floor(Math.random() * 2));

    //map a values to b values - each time we choose a b value,
    //remove it from the pool
    let pairs = a.map(val => {
            let val2_index = b[0] !== val ? 0 : 1;
            val2 = b[val2_index];
            if (!val2) val2 = val;
            b.splice(val2_index, 1);
            return [val, val2];
        });

    //sometimes we'll get a clash with the final pair
    //swap the final pair's b value with the first pair's b value if so
    let final_pair = pairs[pairs.length-1];
    if (final_pair[0] === final_pair[1]) {
        let tmp = final_pair[1];
        final_pair[1] = pairs[0][1];
        pairs[0][1] = tmp;
    }

    return pair;
}

I did testing on 1000+ iterations and didn't get any:

  • clashes (a and b values ending up the same)
  • repeats (the same b value used multiple times)

This can be verified by extending the function by putting the below before the return:

//check for problems
let b_vals = [], problem;
pairs.forEach(pair => {
    b_vals.push(pair[1]);
    if (pair[0] == pair[1]) problem = 'clash!';
});
if (new Set(b_vals).size !== b_vals.length) problem = 'dups!';
if (problem) console.error(problem);
console.log(pairs.join('|'));

(Since Set demands unique values, we can use that and compare it against the number of b values to see if the same b value was used twice.)

Fiddle

Full explanation of approach (as blog post)

Mitya
  • 33,629
  • 9
  • 60
  • 107
  • `b.sort(() => Math.floor(Math.random() * 2));` is a bad sorting algorithm. It only produces `0` and `1`, never `-1`. Also, what's worse is that *it doesn't shuffle well* even if it produced three values. The problem is that the comparison value is generated *every comparison* and it's thus not constant. This means that when the sorting algorithm decides to optimise the number comparisons, you'd get inconsistent results. `1, 2 -> 0` and `2, 3 -> 0`, therefore it can infer `1 = 3` and never compare them. – VLAZ Apr 10 '20 at 11:04
  • Fair enough. I mean, the shuffle isn't the crux of my answer so I'll leave it as-is and let the OP use [whichever](https://stackoverflow.com/questions/2450954/how-to-randomize-shuffle-a-javascript-array) random sort approach they choose. – Mitya Apr 10 '20 at 11:08
  • The crucial point to this is that however you sort them, if the final two values are the same then you can just swap them for others, given that each identifier is unique to the user. If there is no duplication in all the other pairings, then it's not possible to cause further duplication by swapping a matched pair – Tom Davies Apr 10 '20 at 11:48
0

Shuffle the arrays first. Then sort them so that a[i] !== b[i].

  data.sort((a, b) => (users[ data.indexOf(a) ] !== a) - (users[ data.indexOf(b) ] !== b));

You might want to ensure that the arrays are different (e.g. sort, stringify, compare) before doing this.

You might also want to implement your own data.sort(...) function, that implements a sorting algorithm that guarantees to work in all cases (see discussion below).

Jonas Wilms
  • 132,000
  • 20
  • 149
  • 151
  • 1
    This *could* work but it does depend on the content of the arrays. However, I fear that sort might not actually ensure the inequality of the arrays - if `a` and `b` need to change positions, that will mandate the *entire rest of the array* to also be re-evaluated yet, I do not believe `sort` has any such guarantee. So, given `[1, 2, 3, 4]` and `[5, 3, 3, 6]` assuming the algorithm goes in sequence `5, 1` would be left, `2, 3` would be left, `3, 3` would then mean a swap is needed, so in the second array you'd swap index `1` and `2` but still have the same conflict. – VLAZ Apr 10 '20 at 10:26
  • @VLAZ yes, indeed. My compare function is not stable according to the spec, so the behaviour is implementation specific. The example you've given does work (in latest Chrome) though. Maybe one should manually implement a sorting algorithm that has this guarantee. – Jonas Wilms Apr 10 '20 at 11:44
  • Yes, the example was fictitious. It *might* work or not - depends on the implementation. However, different implementations might have different problems. Using a custom made sorting would solve that, as it would ensure you're in control. I think a simple bubble sort can probably deal with this. But then you'd have `O(n^2)` sorting. Again, depending on the dataset, that might work. – VLAZ Apr 10 '20 at 11:47
0

Here is a somewhat simple brute force algorithm.

First, generate all possible combinations of pairs between the two arrays. This so, with arrA and arrB both of length of four, the possible pairings are

------- 1 -------
(arrA[0], arrB[0])
(arrA[1], arrB[1])
(arrA[2], arrB[2])
(arrA[3], arrB[3])

------- 2 -------
(arrA[0], arrB[1])
(arrA[1], arrB[2])
(arrA[2], arrB[3])
(arrA[3], arrB[0])

------- 3 -------
(arrA[0], arrB[2])
(arrA[1], arrB[3])
(arrA[2], arrB[0])
(arrA[3], arrB[1])

------- 4 -------
(arrA[0], arrB[3])
(arrA[1], arrB[0])
(arrA[2], arrB[1])
(arrA[3], arrB[2])

These are the unique ones. A valid possible pairing is also

(arrA[3], arrB[3])
(arrA[0], arrB[0])
(arrA[2], arrB[2])
(arrA[1], arrB[1])

but it contains the same elements as the first combination we had, so we don't need it.

We can now eliminate any sequence where the members of the pair match. We'd be left with only the sequences that produce distinct pairs.

Finally, we just need to pick one of those and shuffle it - that will ensure that we could get one of the sequences we ignored first. That way, we only need to hold length amount of sequences otherwise if we generate every possible sequence, we get up to length * (length!) (! is factorial).

As for how to generate all unique sequences, we can do a couple of tricks. First, we generate the sequence where the index of arrA matches the index of arrB

arrA = [a, b, c, d] ---
        |  |  |  |    |
        |  |  |  |    |---> (a, w) (b, x) (c, y) (d, z)
        v  v  v  v    |
arrB = [w, x, y, z] ---

Then we rotate arrB by one element:

     -------------
     |            |
      > [w, x, y, z]<
         |           |
          ------------

and repeat:

arrA = [a, b, c, d] ---
        |  |  |  |    |
        |  |  |  |    |---> (a, x) (b, y) (c, z) (d, w)
        v  v  v  v    |
arrB = [x, y, z, w] ---

We do that length amount of times and we'd get the unique pairings.

As for the second trick, we don't need to actually change arrB - instead we can just keep an offset that increments after each sequence is generated. The first time we generate things the offset is 0 since we produce (arrA[0], arrB[0]) (arrA[1], arrB[1]) - equal indexes. The next time we need to produce (arrA[0], arrB[1]) (arrA[1], arrB[2]) so take the next index in arrB. We can clamp it within bounds using the remainder operator %: x % number is guaranteed to produce an integer in the range [0, number) - between 0 (inclusive) and number (exclusive). That gives us results similar to rotation but at the cost of a single numeric variable:

var offset = 2;
var length = 4;

for (var index = 0; index < length; index++) {
  var fakeRotatedIndex = (index + offset) % length;
  
  console.log(fakeRotatedIndex);
}

With all of this together we get a relatively cheap brute force algorithm that we can use.

The final step is shuffling and I'd use the solution from this answer with comments removed for brevity. It's a simple and effective Fisher-Yates shuffle implementation.

function crissCrossZip(arrA, arrB) {
  var all = [];
  
  //assuming equal length
  var totalLength = arrA.length;
  
  //produce all valid sequences
  for (var offset = 0; offset < totalLength; offset++) {
    var zip = arrA.map((x, index) => [x, arrB[(index + offset) % totalLength]]);

    //ignore sequence if there are any matches
    if (zip.every(([a, b]) => a !== b))
      all.push(zip);
  }

  //pick random sequence
  var chosenSequence = all[Math.floor(Math.random() * all.length)];

  shuffle(chosenSequence);
  
  return chosenSequence;
}

function shuffle(array) {
  var currentIndex = array.length, temporaryValue, randomIndex;

  while (0 !== currentIndex) {
    randomIndex = Math.floor(Math.random() * currentIndex);
    currentIndex -= 1;

    temporaryValue = array[currentIndex];
    array[currentIndex] = array[randomIndex];
    array[randomIndex] = temporaryValue;
  }

  return array;
}


//// usage ////

var users = [1,2,3,4]
var data = [1,2,3,4]

var result = crissCrossZip(users, data);

console.log(result.map(x => JSON.stringify(x)))

Handling arrays with different amounts of elements

This can easily be modified to handle arrays of different length, if needed. This time we'd have to do the same but ensure that arrA is the *shorterarray andarrBis the longer one. We'd also have to usemaxLength` instead of the common length in order to exhaust all unique valid combinations:

arrA = [a, b, c, d] ------
        |  |  |  |       |
        |  |  |  |       |---> (a, v) (b, w) (c, x) (d, y)
        v  v  v  v       |
arrB = [v, w, x, y, z] ---

// "rotate" 

arrA = [a, b, c, d] ------
        |  |  |  |       |
        |  |  |  |       |---> (a, w) (b, x) (c, y) (d, z)
        v  v  v  v       |
arrB = [w, x, y, z, v] ---

// etc

Also, if we swapped arrA and arrB at the start (because arrA was longer) we need to swap the pairs at the end. Other than that, we have everything we need.

function crissCrossZip(arrA, arrB) {
  var all = [];
  
  var [shorter, longer] = [arrA, arrB].sort((a, b) => a.length - b.length);
  
  var adjust = x => x;
  if (shorter !== arrA) 
    adjust = x => x.map(y => y.reverse());
    
  var maxLength = longer.length;
  
  //produce all valid sequences
  for (var offset = 0; offset < maxLength; offset++) {
    var zip = shorter.map((x, index) => [x, longer[(index + offset) % maxLength]]);

    //ignore sequence if there are any matches
    if (zip.every(([a, b]) => a !== b))
      all.push(zip);
  }
  
  //pick random sequence
  var chosenSequence = all[Math.floor(Math.random() * all.length)];

  shuffle(chosenSequence);
  
  return adjust(chosenSequence);
}

function shuffle(array) {
  var currentIndex = array.length, temporaryValue, randomIndex;

  while (0 !== currentIndex) {
    randomIndex = Math.floor(Math.random() * currentIndex);
    currentIndex -= 1;

    temporaryValue = array[currentIndex];
    array[currentIndex] = array[randomIndex];
    array[randomIndex] = temporaryValue;
  }

  return array;
}

document.getElementById("click_me").addEventListener("click", function() {
  var users = extractValues("users");
  var data = extractValues("data");

  var result = crissCrossZip(users, data);
  
  var div = document.createElement("div");
  div.classList.add("result");
  div.textContent = result.map(x => `(${x.join(", ")})`).join(", ");
  
  document.getElementById("results").prepend(div);
})

function extractValues(id) {
  return document.getElementById(id)
    .value
    .split(",")
    .map(x => x.trim());
}
body {
  background-color: white;
}

#results {
  background-color: #ddd;
}

.result:nth-child(1) {
  background-color: #ff0;
  padding-left: 0;
}
<p>Enter comma separated values:</p>
<label for="users">users:</label> <input type="text" id="users" value="1, 2, 3, 4, 5"/>
<br/>
<label for="data">data:</label> <input type="text" id="data" value="1, 2, 3, 4"/>
<br/>
<button id="click_me">Produce pairs</button>

<div>Results (latest first):</div>
<div id="results"></div>

Optimisation paths

Only generate pairs without duplicates

The first optimisation is very cheap and relatively minor but still something that helps. When generating the zip array we can completely discard it if any of the members are duplicated, we don't need to wait until after it is generated.

//produce all valid sequences
for (var offset = 0; offset < maxLength; offset++) {
  var keep = true;

  for (var index = 0; index < shorter.length; index++) {
    var x = shorter[index];
    var pair = [x, longer[(index + offset) % maxLength]];

     if (pair[0] === pair[1]) {
       keep = false;
       break;
     }
  }

  //ignore sequence if there are any matches
  if (keep)
    all.push(zip);
}

Only keep the array you want to return

instead of holding up to maxLength amount of arrays and picking later, we can pick first and only generate up to the choice. We'd have to shift the choice as we go along, if we discard any sub-results. So, the sequence is this:

  1. Pick which sequence would be produced. We know what the possible amount of sequences is - it's maxLength, so we need a pick between 0 and maxLength. Let's call that choiceIndex.
  2. Start generating the arrays, keep the index of what the current array woud be. This happens to be offset in the loop already.
    • only keep two arrays - the last good one and the current.
    • if we get to offset === choiceIndex and the current array is not discarded - stop. No need to generate more. Just return the current array.
    • if we get to offset === choiceIndex and the current array is discarded - return the last good one.
    • if we get to offset === choiceIndex we neither have a last good array nor is the current array - continue until a good array is generated.

This way we only ever keep up to two arrays of pairs in memory, as opposed to all unique ones.

function crissCrossZip(arrA, arrB) {
  var [shorter, longer] = [arrA, arrB].sort((a, b) => a.length - b.length);
  
  var adjust = x => x;
  if (shorter !== arrA) 
    adjust = x => x.map(y => y.reverse());
    
  var maxLength = longer.length;
  
  var chosenIndex = Math.floor(Math.random() * maxLength);

  //produce valid sequences
  var lastGoodArray, currentArray;
  for (var offset = 0; offset < maxLength; offset++) {
    var keep = true;
    var currentArray = [];

    for (var index = 0; index < shorter.length; index++) {
      var x = shorter[index];
      var pair = [x, longer[(index + offset) % maxLength]];

       if (pair[0] === pair[1]) {
         keep = false;
         break;
       }
       currentArray.push(pair);
    }

    if (keep)
      lastGoodArray = currentArray;
      
    if (offset >= chosenIndex && lastGoodArray)
      break;
  }
  
  if (!lastGoodArray)
    return;
    
  var chosenSequence = lastGoodArray;

  shuffle(chosenSequence);
  
  return adjust(chosenSequence);
}

function shuffle(array) {
  var currentIndex = array.length, temporaryValue, randomIndex;

  while (0 !== currentIndex) {
    randomIndex = Math.floor(Math.random() * currentIndex);
    currentIndex -= 1;

    temporaryValue = array[currentIndex];
    array[currentIndex] = array[randomIndex];
    array[randomIndex] = temporaryValue;
  }

  return array;
}

document.getElementById("click_me").addEventListener("click", function() {
  var users = extractValues("users");
  var data = extractValues("data");

  var result = crissCrossZip(users, data);
  
  var div = document.createElement("div");
  div.classList.add("result");
  div.textContent = result.map(x => `(${x.join(", ")})`).join(", ");
  
  document.getElementById("results").prepend(div);
})

function extractValues(id) {
  return document.getElementById(id)
    .value
    .split(",")
    .map(x => x.trim());
}
body {
  background-color: white;
}

#results {
  background-color: #ddd;
}

.result:nth-child(1) {
  background-color: #ff0;
  padding-left: 0;
}
<p>Enter comma separated values:</p>
<label for="users">users:</label> <input type="text" id="users" value="1, 2, 3, 4, 5"/>
<br/>
<label for="data">data:</label> <input type="text" id="data" value="1, 2, 3, 4"/>
<br/>
<button id="click_me">Produce pairs</button>

<div>Results (latest first):</div>
<div id="results"></div>
VLAZ
  • 26,331
  • 9
  • 49
  • 67
-1

Assuming both arrays are the same, just loop through the array and swap the current position for a random one selected from the next one until the end.

I'm pretty sure it's more efficent than the rest of the answers. Something like:

function randomInteger(min, max) {
 return Math.floor(Math.random() * (max - min + 1)) + min;
}
for (let i = 0; i < data.length - 1; i++) {
  let aux = data[i]
  let rand = randomInteger(i+1, data.length)
  data[i] = data[rand]
  data[rand] = aux
}
E_net4
  • 27,810
  • 13
  • 101
  • 139
jorbuedo
  • 2,041
  • 1
  • 8
  • 20
-2

Use flatMap

 var pairs = users.flatMap(u => data.map(d => [u, d]));

Ok then:

 var distincts = pairs.filter(s => s[0] !== s[1]);

Another edit: You might shuffle afterwards.

Last edit:

var pairs = users.flatMap(u => data.map(d => [u, d])).filter(s => s[0] !== s[1]);
atmin
  • 370
  • 1
  • 7
  • This is in no way guaranteed to produce distinct pairs. In fact, with the given input it's guaranteed to produce pairs of matching values. Moreover, it's producing pairs that shouldn't be there, as it does all possible combinations, doesn't to a mis-matched zip. – VLAZ Apr 10 '20 at 11:20
  • `distincts` is still just producing *valid* pairings but doesn't in any way ensure that is a valid zip result. – VLAZ Apr 10 '20 at 11:27
  • The last edit *still* produces too many pairs. OP wants a *zip* of the two arrays - new array where each element is a pair coming from each. The result will *always* be equal to the length of the arrays, assuming equal length. So, with two arrays `a` and `b` with four elements each, the result must be a new array with four pairs. A zip is basically `a.map((x, i) => [x, b[i]])` but the challenge here is making sure the pairs are distinct and that the result is randomised. So you might end up producing `[ [a[3], b[0]], [a[0], b[1]], [a[2], b[3]], [a[1], b[2]] ]` assuming the pairs are distinct. – VLAZ Apr 10 '20 at 12:08
  • Ouh! Got it now. Sorry. – atmin Apr 10 '20 at 15:47