2

I have an array of numbers (int or float) and I need to find a value by combining array values. Once the smallest possible combination is found the function returns the array values. Therefore I start with sample-size=1 and keep incrementing it.

Here's a simplified example of the given data:

$values = [10, 20, 30, 40, 50];
$lookingFor = 80;

Valid outcomes:

[30, 50] // return this
[10, 20, 50], [10, 30, 40] // just to demonstrate the possible combinations

Permutations solve this problem and I've tried many different implementations (for example: Permutations - all possible sets of numbers, Get all permutations of a PHP array?, https://github.com/drupol/phpermutations). My favourite is this one with a parameter for permutation-size using the Generator pattern: https://stackoverflow.com/a/43307800

What's my problem? Performance! My arrays have 5 - 150 numbers and sometimes the sum of 30 array numbers is needed to find the searched value. Sometimes the value can't be found, which means I needed to try all possible combinations. Basically with permutation-size > 5 the task becomes too time consuming.

An alternative, yet not precise way is to sort the array, take the first X and last X numbers and compare with the searched value. Like this:

sort($values, SORT_NUMERIC);
$countValues = count($values);

if ($sampleSize > $countValues)
{
    $sampleSize = $countValues;
}

$minValues = array_slice($values, 0, $sampleSize);
$maxValues = array_slice($values, $countValues - $sampleSize, $sampleSize);

$possibleMin = array_sum($minValues);
$possibleMax = array_sum($maxValues);

if ($possibleMin === $lookingFor)
{
    return $minValues;
}

if ($possibleMax === $lookingFor)
{
    return $maxValues;
}

return [];

Hopefully somebody has dealt with a similar problem and can guide me in the right direction. Thank you!

  • In regards to *"An alternative, yet not precise way is to sort the array*", is this the code you are actually using? And if not, could you post your own code attempt? – GetSet Mar 05 '22 at 22:32
  • It's the code I'm currently using. – weaverslave Mar 05 '22 at 22:37
  • Can you use a database? There would be some major perfomance boost using index on values. – Kviz Majster Mar 05 '22 at 22:40
  • Yes it's possible to use a database. How would that work? Please give me a hint – weaverslave Mar 05 '22 at 22:47
  • Its an interesting problem but seems like any answer will defeat the purpose of you coming up with an algorithm yourself to solve it. It will also be opinionated because many of such algorithms already exist, as you discovered in your research. – GetSet Mar 05 '22 at 22:47
  • @GetSet yes I've tried to solve it without success so far. I appreciate your input though – weaverslave Mar 05 '22 at 22:53
  • Look into "combinations" (as opposed to permutations) since the former involves less possibilities and thus will have a performance boost during the "recursion", though true recursion is not necessary to solve this. E.g. `[30, 50]` is the same combination as `[50, 30]` but different permutations. – GetSet Mar 05 '22 at 23:02
  • I mean, what if one could precalculate all combinations into a temp table/verbatim table using any well known algoritm and then just simply select desired values using SQL. Searching even among millions of values using database index is lightning fast in compare with arrays. – Kviz Majster Mar 05 '22 at 23:03
  • The problem is @KvizMajster that would take many many many years to generate – GetSet Mar 05 '22 at 23:05
  • @GetSet I'm looking into combinations now. Will try to make it work. I'll let you guys know – weaverslave Mar 05 '22 at 23:12
  • @KvizMajster I think it's not an option in this case. You would need to precalculate all combinations basically from 1 to 99999999 which will take forever and then there is still the task of finding the best combination – weaverslave Mar 05 '22 at 23:16

1 Answers1

0
  • you must use combination instead of permutations {ex: P(15) = 130767436800 vs C(15) = 32768}

  • if array_sum < target_number then no solution exists

  • if in_array(target_number, numbers) solution found with 1 element

  • sort lowest to highest

  • start with C(n,2) where 2 represents 1st 2nd then 1st 3rd etc (static one is 1st element)

  • if above loop found no solution continue with 2nd 3rd then 2nd 4th, etc)

  • if C(n,2) had no solution then jump to C(n,3)s but this time 2 static numbers and 1 dynamic one

  • if loop ended with no solution then there exists no solution

lastly, I would adjust this question and ask in statistics branch of stack exchange (crossvalidated) since mean, median and cumulative distribution of the sums of the numbers may hint to decrease the number of iterations significantly and this is their profession.

Ersin
  • 124
  • 3