0

I have an array of numbers that represent number of times some processing needs to be done some entities:

array = [20, 30, 10, 7, 8, 5]

and a number that represents the total number of times processing was actually done:

amount_processed = 80

I'd like to build a hash whose keys are the numbers in the array and whose values represent the number of times that was successfully processed of the 80. For instance:

hash = {}
index = 0
until amount_processed <= 0 || index == array.count
  mapped_amount = [array[index],amount_processed].min
  hash[array[index]] = mapped_amount
  amount_processed -= mapped_amount
  index += 1
end

The output in this case would be:

{20 => 20, 30 => 30, 10 => 10, 7 => 7, 8 => 8, 5 => 5}

If the amount_processed = 65, I would get:

{20 => 20, 30 => 30, 10 => 10, 7 => 5}

I want to map the amount_processed such that it always preferences a mapping where all of the given keys are used up. For example, for an amount_processed = 65, the output should be:

{20 => 20, 30 => 30, 10 => 10, 5 => 5} # skipped 7 and 8 entirely

Where there are different possible outputs, either is valid, I'm indifferent. I.e., if amount_processed = 60, EITHER of the 2 below would be valid

{20 => 20, 30 => 30, 10 => 10}
{30 => 30, 10 => 10, 7 => 7, 8 => 8, 5 => 5}

My code to successfully achieve the above outcomes is below

hash = {}
index = 0
size = array.size
until amount_processed <= 0
    if index == size * 2
        hash["notes"] = "attempting to go beyond 2nd iteration, i.e., there is still amount_processed left but nothing more in the array to map it to"
        return hash
    elsif index >= size
    # we've already looped through everything to find something that fully matches, and still have leftover amounts to be mapped, so now start over, and just go in order to fill in whatever's available
    pseudo_index = index - size # allows us to map to original array once index is on second iteration
    # was that particular one skipped or not
    if hash[array[pseudo_index]].present?
        # it wasn't skipped in earlier go around, so skip it NOW
    else
        mapped_amount = [array[pseudo_index],amount_processed].min
        hash[array[pseudo_index]] = mapped_amount
        amount_processed -= mapped_amount
      end
  else
    if amount_processed < array[index]
      # we don't want a partial map, so just don't, unless it's last resort
    else
      mapped_amount = [array[index],amount_processed].min
      hash[array[index]] = mapped_amount
      amount_processed -= mapped_amount
    end
  end
  index += 1
end
return hash

Will my code always work, for any array/amount_processed combo to create a hash that first matches as many arrays that are "fully complete" as and then creates matches that are not fully complete?

james
  • 3,989
  • 8
  • 47
  • 102
  • You say, "...such that it always preferences a mapping where all of a given key is "used up." What do you mean by "preferences"? If there is no mapping where all keys used are "used up", then what? You really have two questions. The first is to determine if there is such a "used up" mapping, and if so, what it is. The second is what to do if there is no "used up" mapping. I suggest you limit yourself to the first question, but wait, that was answered [here](https://stackoverflow.com/questions/4632322/finding-all-possible-combinations-of-numbers-to-reach-a-given-sum). – Cary Swoveland Apr 23 '18 at 05:24

4 Answers4

1

Simple Answer: Your code will failure on some cases


Take amount_processed = 62 as an example. A proper solution is {20, 30, 7, 5}, but your code will return {20, 30, 10}

The cause is your original code part will pick the array value if it is smaller then the amount_processed variables and results in picking unnecessary array value.


Suggestion: Rethink your logic for the sub-set sum problem instead of reuse the original greedy algorithm

hk6279
  • 1,881
  • 2
  • 22
  • 32
  • oops you're right it won't work for 62, I forgot to check presence. have modified the code. i think it's what will work for my circumstances now, honestly i'm not great at algorithms so it's hard for me to develop something else – james Apr 23 '18 at 19:29
0

As @iGian pointed out in his answer its possible to find a solution by going trough all the combinations, but sometimes its possible to try to guesstimate the size of the resulting optimal array, so I created the following code which tries to do that in a very naive way:

@amount_processed = 62
@iterations = 0

def found_solution given_array
  puts "Found a solution"
  puts given_array.inspect
  puts given_array.sum == @amount_processed ? "Exact solution" : "Aproximate solution"
  puts "After #{@iterations} iterations"
  exit 0
end

hash = {}
solution = []
array = [20, 30, 10, 7, 8, 5]

found_solution(array) if array.sum <= @amount_processed
found_solution([array.min]) if array.min >= @amount_processed

min_size_array = 0
array.size.times do |i|
  @iterations += 1
  min_size_array = i
  sum_of_maxs = array.max(min_size_array).sum
  found_solution(array.max(min_size_array)) if sum_of_maxs == @amount_processed
  break if array.max(min_size_array).sum > @amount_processed
end

max_value_array = array.max
max_size_array = min_size_array
(min_size_array..array.size).each do |i|
  @iterations += 1
  sum_of_mins = array.min(i).sum
  found_solution(array.min(i)) if sum_of_mins == @amount_processed
  break if sum_of_mins > @amount_processed
  max_size_array = i
end

max_value_in_array = array.max
# Run through all the combinations within the given sizes until a solution is found, including iterations for 
# non exact solutions, which can at most be as much as 'max_value_in_array' away from the amount_processed
(0..max_value_in_array).each do |difference_to_solution|
  (min_size_array..max_size_array).each do |size|
    array.combination(size).each do |current_combination|
      @iterations += 1
      if current_combination.sum == (@amount_processed - difference_to_solution)
        found_solution(current_combination) unless difference_to_solution > 0
        # There is a difference, so we need to find the min_value that is not in the current_combination
        # and add it to the approximate solution
        duplicate_array = array.dup
        current_combination.each do |x|
          duplicate_array.slice!(duplicate_array.index(x))
        end
        min_value_not_in_current_combination = duplicate_array.min
        found_solution(current_combination + [min_value_not_in_current_combination])
      end
    end
  end
end
elyalvarado
  • 1,226
  • 8
  • 13
  • this guess is really interesting I will dig deeper. I tried some values and it breaks with `@amount_processed = 50`. I did not check out why yet. – iGian Apr 23 '18 at 06:18
  • Hey @iGian, that's right it breaks in some cases. I modified the script and tested with all values. It also now includes an additional value for non exact solution that minimizes the remaining value in the sum vs the amount processed – elyalvarado Apr 23 '18 at 06:24
  • wow! ok to be totally honest this solution goes above my head. I think I found a workable solution and have updated my answer. will need some time to sit down and review your code in detail, because it's definitely much more of a solution than the hack that i put together! – james Apr 23 '18 at 19:30
  • 1
    @james it only really make sense to start removing options from the combination when the array is big and the `amount_processed` requires also a large number of constituents. But in general, your problem is an NP problem which has no known algorithmic optimal solution, so the best you can do is run combinatorial code and try to see if a solution was found. – elyalvarado Apr 23 '18 at 20:03
  • what does NP stand for? thanks for all your insights! – james Apr 23 '18 at 20:08
  • 1
    NP stands for "nondeterministic polynomial time". You can check the [wikipedia article for P vs NP](https://en.wikipedia.org/wiki/P_versus_NP_problem) which will give you a better (and more correct) explanation that anything I can try to explain you in a comment. If you like my answer please up-vote. – elyalvarado Apr 23 '18 at 22:20
0

Your problem has some similarities to a Knapsack problem, and as others have stated it may not have an optimal algorithmic solution. Your "fit" function has some extra requirements that seem to make it even more challenging. Processing the array elements in varying orders may produce more optimal solutions so I think you kind of have to try them all and pick the one that best meets your criteria.

My solution tries all the permutations to brute force the best solution. It takes advantage of Ruby's hash identity (e.g. { 1=>2, 2=>3 } is equal to { 2=>3, 1=>2 }) to remove duplicate candidate solutions. I tried to align the fit function to your requirements.

def solve(*args)
  best_fit(compute_solutions(*args))
end

def best_fit(solutions)
  # prefer unique solutions with no partial values
  preferred_solutions, alternative_solutions =
    solutions.uniq.partition { |solution| solution.all? { |k, v| k == v } }

  if preferred_solutions.empty?
    # prefer unique solutions with "fullest" partial values
    preferred_solutions = alternative_solutions
      .group_by { |solution| solution.detect { |k, v| v < k and break k } }
      .min_by(&:first)
      .last
  end

  # prefer shortest solution matching above criteria
  preferred_solutions
    .group_by(&:length)
    .min_by(&:first)
    .last
    .first
end

def compute_solutions(array, *args)
  return enum_for(__callee__, array, *args).to_a unless block_given?

  array.permutation do |permuted_array|
    yield compute_solution(permuted_array, *args)
  end
end

def compute_solution(array, amount_processed)
  return enum_for(__callee__, array, amount_processed).to_h unless block_given?

  array.each do |amount|
    break if amount_processed.zero?
    value = [amount, amount_processed].min
    yield [amount, value]
    amount_processed -= value
  end
end

p solve([20, 30, 10, 7, 8, 5], 80) == { 20=>20, 30=>30, 10=>10, 7=>7, 8=>8, 5=>5 } # => true
p solve([20, 30, 10, 7, 8, 5], 79) == { 20=>20, 30=>30, 10=>10, 7=>7, 8=>8, 5=>4 } # => true
p solve([20, 30, 10, 7, 8, 5], 65) == { 20=>20, 30=>30, 10=>10, 5=>5 } # => true
p solve([20, 30, 10, 7, 8, 5], 62) == { 20=>20, 30=>30, 7=>7, 5=>5 } # => true
p solve([20, 30, 10, 7, 8, 5], 60) == { 20=>20, 30=>30, 10=>10 } # => true
mwp
  • 8,217
  • 20
  • 26
  • Very smart! I played with your code and I had to stopped the execution trying `p solve([20, 30, 10, 7, 8, 5, 20, 30, 10, 7, 8, 5], 100)`, as the array grows it takes more time. Of course. – iGian Apr 24 '18 at 12:08
  • @iGian Wow, yeah, it takes forever just to generate all the permutations. – mwp Apr 24 '18 at 18:42
-1

Maybe this is the logic you are looking for? It goes trough all combinations if there is no match and returns an empty array

array = [20, 30, 10, 7, 8, 5]
amount_processed = 65

go = true
(array.size + 1).times do |n|
    combinations = array.combination(n).to_a
    combinations.each do |combination|
        sum = combination.inject(0){|sum,x| sum + x }
        if sum == amount_processed
            then
                p combination
                go = false
        end
        break if go == false
    end
    break if go == false
end
iGian
  • 11,023
  • 3
  • 21
  • 36
  • thanks! this is a great solution and taught me about `combination` method on array. however it doesn't work for a situation where amount_processed does NOT add up fully to any combination of the array and a partial amount must be used, also I'd like to find a solution that avoids going through every possible combination, since i worry about efficiency for much larger arrays – james Apr 23 '18 at 18:58
  • Edited, for an error breaking loops. I don't know how big can be your array, but, you can always choose to stop earlier, for example when `(sum - amount_processed).abs <= delta`. My code is just a suggestion, not a full solution. – iGian Apr 24 '18 at 12:16