So I want to preface this answer by a disclaimer: I am not a mathematician; my understanding of linear algebra, matrix algebra and statistic is adequate but by no means extensive. There may be ways to achieve the same results with fewer lines of code or more efficiently. However, I believe that the verbosity level of this answer will allow more people to understand it, and follow the logic step by step. Now that that's out of the way, let's jump into it.
Calculate the Weight of Each Pair
The problem with the current approach is that the logic for finding the best possible match is happening as you loop through the results of the query. What that means is that you may be left with elements that can't be matched together when you get to the last iteration. To fix this, we're going to need to split the process a bit. The first step is to get all of the possible pairs for the given array elements.
$elements = [
1,2,3,4,5,6
];
$possiblePairs = [];
for($i = 1; $i <= $elementCount; $i++) {
for($j = $i + 1; $j <= $elementCount; $j++) {
$possiblePairs[] = [
'values' => [$elements[$i - 1], $elements[$j - 1]],
'score' => rand(1, 100)
];
}
}
As you can see, for each possible pair, I am also attaching a score
element, a random integer in this case. This score represents how strongly this pair matches. The value of this score
doesn't actually matter: it could be values in a specific range (ex: 0-100) or every value defined by some functions of your own. The important thing here is that pairs that have highest match potential should have a higher score, and pairs with little to no potential should have a low score. If some pairs absolutely cannot go together, simply set the score
to zero.
The next step is to then use than score
value to sort the array so the the pairs that are more strongly matched are on top, and the weaker (or impossible) pairs at the bottom. I used PHP 7.0's spaceship operator to get the job done, but you can find other ways to achieve this sort here.
usort($possiblePairs, function($a, $b) {
return $b['score'] <=> $a['score'];
});
Now we're finally equipped to build our final output. This step is actually fairly easy: loop through the possible pairs, check if the values haven't already been used, then push them to the output array.
$used = []; // I used a temporary array to store the processed items for simplicity
foreach($possiblePairs as $key => $pair) {
if($pair['score'] !== 0 && // additional safety if two elements cannot go together
!in_array($pair['values'][0], $used) &&
!in_array($pair['values'][1], $used))
{
$output[] = $pair['values']; // push the values to the return of the function
array_push($used, $pair['values'][0], $pair['values'][1]); // add the element to $used so they get ignored on the next iteration
}
}
var_dump($output);
// output
array(3) {
[0]=> array(2) {
[0]=> int(2)
[1]=> int(4)
}
[1]=> array(2) {
[0]=> int(1)
[1]=> int(3)
}
[2]=> array(2) {
[0]=> int(5)
[1]=> int(6)
}
}
And there it is! The strongest pair is chosen first, and then it goes down in priority. Play around with weighting algorithms, see what works for your specific needs. As a final word: if I were writing this in the context of a business application, I would probably add some error reporting if there happens to be un-matcheable pairs (after all, it is statistically possible) to make it easier to spot the cause of the problem and decide if the weighting needs tweaking.
You can try the example here
Edit to add:
In order to prevent the scenario where an element gets stored in a pair when another element can only be paired with that first element (which results in the pair never being created) you can try this little patch. I started by recreating your requirements in a getScore()
function.
function getScore($elem1, $elem2) {
$score;
if($elem1 > $elem2) {
$tmp = $elem1;
$elem1 = $elem2;
$elem2 = $tmp;
}
if($elem1 === 1 && $elem2 === 2) {
$score = 100;
}
elseif(($elem1 === 3 && $elem2 !== 2) || ($elem1 !== 2 && $elem2 === 3)) {
$score = 0;
}
else {
$score = rand(0, 100);
}
return $score;
}
Then, I modified the $possiblePairs
array creation to do two additional things.
- if the score is 0, don't append to the array and
keep track of how many matches were found for each element. By doing this, any element that only has one possible match will have an associated value in $nbMatches
of 1
.
$nbMatches = [
1 => 0,
2 => 0,
3 => 0,
4 => 0
]
$possiblePairs = [];
for($i = 1; $i <= $elementCount; $i++) {
for($j = $i + 1; $j <= $elementCount; $j++) {
$score = getScore($elements[$i - 1], $elements[$j - 1]);
if($score > 0) {
$possiblePairs[] = [
'values' => [$elements[$i - 1], $elements[$j - 1]],
'score' => $score
];
$nbMatches[$elements[$i - 1]]++;
$nbMatches[$elements[$j - 1]]++;
}
}
}
Then I added another loop that will bump up those elements so that they end up on top of the list to be processed before the rest.
foreach($nbMatches as $elem => $intMatches) {
if($intMatches === 1) {
foreach($possiblePairs as $key => $pair) {
if(in_array($elem, $pair['values'])) {
$possiblePairs[$key]['score'] = 101;
}
}
}
}
usort($possiblePairs, function($a, $b) {
return $b['score'] <=> $a['score'];
});
The output is then:
array(2) {
[0]=> array(2) {
[0]=> int(2)
[1]=> int(3)
}
[1]=> array(2) {
[0]=> int(1)
[1]=> int(4)
}
}
I just want to stress: this is only a temporary patch. It will not protect you in cases where, for instance, element 1
can only match element 2
and element 3
can also only match element 2
. However, we are dealing with a very small sample size, and the more elements you have, the less likely these edge case will be likely to occur. In my opinion this fix is not necessary unless you are only working with 4 elements. working with 6 elements yields 15 possible combinations, 10 elements yields 45 possible combinations. Already those cases will be unlikely to happen.
Also, if you find that those edge cases still happen, it may be a good idea to go back and tweak the matching algorithm to be more flexible, or take into account more parameters.
You can try the updated version here