Scipy Optimize Minimize For Combinatorics - Integer only solutions

Question

I have the below code that tries to assign the best product (a or b) to each factory (1,2,3,4 or 5).
How do I set constraints/bounds so that it only uses the integer values 0 or 1? e.g I have set x0 = [1,0,1,1,0] and the solution for this sample data is [0,0,0,0,0] Also, Is this even this correct approach to solve this problem? I have also tried pulp, but it does not seem to support the dynamic emission charges, as these are calculated in aggregate not at a factory level.

import numpy as np
from scipy.optimize import minimize

data = {
    '1': (1.2,),
    '2': (1.0,),
    '3': (1.7,),
    '4': (1.8,),
    '5': (1.6,)
}

unit_cost_b = 8
limit_b = 100
echarge_b = 0.004

unit_cost_a = 5
limit_a = 60
echarge_a = 0.007


def objective(x):
    x_a = [data[str(i+1)][0] for i in range(len(data)) if x[i] == 0]
    x_b = [data[str(i+1)][0] for i in range(len(data)) if x[i] == 1]

    total_unit_cost = (len(x_a) * unit_cost_a) + (len(x_b) * unit_cost_b)
    total_usage_a = np.sum(x_a)  
    total_usage_b = np.sum(x_b)
    
    emission_cost_a = max(total_usage_a - (len(x_a) * limit_a), 0) * echarge_a * 31
    emission_cost_b = max(total_usage_b - (len(x_b) * limit_b), 0) * echarge_b * 31

    return total_unit_cost + emission_cost_a + emission_cost_b

bounds = [(0, 1)] * len(data) 
eq_cons = {'type': 'eq', 'fun': lambda x: np.sum(x) - 5}


x0 = [1,0,1,1,0]
print(x0)
result = minimize(objective, x0, method='SLSQP', bounds=bounds, constraints=[eq_cons])

You should also look into using `gekko`. I've had some success with that package's optimization routines. The [developer](https://stackoverflow.com/users/2366941/john-hedengren) is also very responsive on SO. — jared, Jun 18 '23 at 23:44
Also look into using [`scipy.optimize.milp`](https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.milp.html). — jared, Jun 18 '23 at 23:48
Short answer: you are not setting this up correctly. If you desire to use `scipy` you need to provide a matrix formulation and follow the dox on using the `milp` package or set up the linprog differently. Here are some refs: https://stackoverflow.com/questions/71329868/how-to-define-conditional-constraints-in-optimization-utilizing-scipy-minimize https://stackoverflow.com/a/73502509/10789207 — AirSquid, Jun 19 '23 at 00:03
I think this should snap right in with `pulp` and binary indicator variables. Realize that you cannot use conditional statements and `max` formulations. Those are non-linear expressions — AirSquid, Jun 19 '23 at 00:04
@AirSquid, I tried to formulate this in pulp (I know it is incorrect): `problem += (unit_cost_a * x) + (unit_cost_b * y)` `problem += max(sum(data[id][0] for id in data),0) - ((x * limit_a ) * echarge_a * 31) ` `problem += max(sum(data[id][0] for id in data),0) - ((y * limit_b ) * echarge_b* 31) ` `problem += x + y == len(data)` It seems pulp will minimise the 2nd constraint, and then check it satisfies the 3rd, but won't minimise them both and take the optimal, reordering 2nd and 3rd you get either: `-18.507*y + 7.6023000000000005` or `-11.997*x + 7.6023000000000005` — SomeGuy30145, Jun 19 '23 at 01:01
Yeah, it appears you are still rolling `max` and `if` statements into that which isn't going to work as those are non-linear. — AirSquid, Jun 19 '23 at 02:13
It's not real clear what the coefficients in the `data` dictionary are and it seems that both the emissions costs will be zero based on the data... ? Perhaps this is just a shell of a larger project? — AirSquid, Jun 19 '23 at 02:17
@airsquid, this is just the toy model, the main dataset is much much larger How might you handle the non-linear constraints in Pulp? — SomeGuy30145, Jun 19 '23 at 02:21
2 q's: what does the data in data represent? Do you have more than 2 products? — AirSquid, Jun 19 '23 at 02:24
They're machines, just machine 1 to 5 in this case. The number is emissions estimate, I can allocate each machine to one of 2 'products' which are really just pricing schemes. — SomeGuy30145, Jun 19 '23 at 02:27
Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/254139/discussion-between-airsquid-and-someguy30145). — AirSquid, Jun 19 '23 at 02:40

score 2 · Answer 1 · answered Jun 19 '23 at 02:47

This is a pulp implementation below. You could do this in scipy using the scipy.optimize.MILP pkg, but I'm pretty sure you would need to develop the problem in matrix format, which is doable, but packages such as pulp or pyomo do it for you and I think they are much more approachable.

Realize, you cannot use if or max() in problem formulation as those things are nonlinear and the value of the variables is not known when the problem is handed over to a solver. And the bulk of what pulp is doing is making the problem to hand over....

On your model, the emissions data is odd and seems to always be zero because of the large values of limit, but perhaps it makes sense in larger context or you can adjust from the below.

CODE:

# factory picker

import pulp

data = {
    'f1': (1.2,),
    'f2': (1.0,),
    'f3': (1.7,),
    'f4': (1.8,),
    'f5': (1.6,)
}

# unit_cost_b = 8
# limit_b = 100
# echarge_b = 0.004

# unit_cost_a = 5
# limit_a = 60
# echarge_a = 0.007

unit_cost = {'a': 5, 'b': 8}
limit = {'a': 60, 'b': 100}
echarge = {'a': 0.007, 'b': 0.004}

# conveniences....
factories = list(data.keys())
products = ['a', 'b']
fp = [(f, p) for f in factories for p in products]

### PROB SETUP
prob = pulp.LpProblem('factory_assignments', pulp.LpMinimize)

### VARS

# make[factory, product] == 1 if making that product, else 0...
make = pulp.LpVariable.dicts('make', fp, cat=pulp.LpBinary)

emission = pulp.LpVariable.dicts('emission', products, lowBound=0)

### OBJECTIVE

# a convenience expression...
tot_unit_cost = sum(unit_cost[p] * make[f, p] for f, p in fp)

prob += tot_unit_cost + sum(emission[p] for p in products)


### CONSTRAINTS

# constrain the emission to GTE the expression.  It already has a lowBound=0, so that
# handles the max() expression in a linear expression
for p in products:
    prob += emission[p] >= sum(make[f, p] * (data[f][0] - limit[p]) for f in factories) * echarge[p] * 31

# each factory must produce *something*  (and only 1 thing)  (this is implied in the orig problem)
for f in factories:
    prob += sum(make[f, p] for p in products) == 1

### QA it...
print(prob)

### SOLVE
soln = prob.solve()

for f, p in fp:
    if make[f, p].varValue:   # True if the variable is 1
        print(f'make {p} in factory {f}')

print(f'tot unit cost: {pulp.value(tot_unit_cost)}')

for p in products:
    print(f'emission cost for {p} is: {emission[p].varValue}')

OUTPUT (truncated):

Result - Optimal solution found

Objective value:                25.00000000
Enumerated nodes:               0
Total iterations:               0
Time (CPU seconds):             0.00
Time (Wallclock seconds):       0.00

Option for printingOptions changed from normal to all
Total time (CPU seconds):       0.00   (Wallclock seconds):       0.00

make a in factory f1
make a in factory f2
make a in factory f3
make a in factory f4
make a in factory f5
tot unit cost: 25.0
emission cost for a is: 0.0
emission cost for b is: 0.0

[Finished in 100ms]

Your solution is great - thank you. Although when I extend it out to 1000s, or even 100s of factories, the number of solutions makes it unworkable - I've been investigating different solvers but that doesn't seem to help - suggestions? — SomeGuy30145, Jul 03 '23 at 03:38
I'm not sure what you mean by "the number of solutions". Can you clarify a bit? — AirSquid, Jul 03 '23 at 04:47
e.g if I have 41 factories, there are 2.2x10^12 possible solutions (2^41) so after a certain point the problem becomes computationally very intense. — SomeGuy30145, Jul 03 '23 at 21:36
Not really, that is the solver's job. For this size problem, it should solve almost instantly for 100 factories. Are you seeing something different? OBTW: I'm not sure if I captured the concept of your `limit` in my example, perhaps a secondary issue — AirSquid, Jul 03 '23 at 22:18
As far as I have tested the `limit` works perfectly, thank you. When I run 41 factories: instant results, with 81 it seems to go "forever" I can provide some sample data in a chat if you would like. — SomeGuy30145, Jul 05 '23 at 00:13
OK sure. post the *exact* model you are using and any data...happy to give it another look. I'm not sure how you'd chat that, but if you know, great. If that doesn't work, you can either update your post, or just ask a new question and comment me — AirSquid, Jul 05 '23 at 00:20

score 1 · Answer 2 · answered Jun 18 '23 at 22:51

1

The simplest approach would be to use brute force. If you can't express your objective as a linear function, that might be your best bet.

Example:

dims = 5
ranges = ((slice(0, 2, 1),) * dims)
x = scipy.optimize.brute(objective, ranges, finish=None)

answered Jun 18 '23 at 22:51

Nick ODell

15,465
3
32
66

Scipy Optimize Minimize For Combinatorics - Integer only solutions

2 Answers2

CODE:

OUTPUT (truncated):