1

I need to solve a linear program (LP) with a mix of binary and continuous variables (MILP).
The data I use as constraints and objectives come from pandas data frame manipulations, so they are in matrix (or I should probably say numpy array) format, and the variables to solve for are sometimes 100's / 1000's.
I know how to use matrices and vectors to setup and solve MILP's in R, but as I coded part of the data handling in python, I wanted to include the LP part in the python script.

So I looked for MILP solvers for python, and I found this post:

Python Mixed Integer Linear Programming

and this one:

https://realpython.com/linear-programming-python/#what-is-mixed-integer-linear-programming

Apart from many links no longer working in the first post, my impression, after browsing through the docs of some of the suggested solvers, was that the LP functions in scipy, which take matrices and vectors as input, cannot solve MILP's (see also this post); on the other hand, interfaces like pulp etc., which do solve MILP's, seem to have their own very special input where one has to specify one by one the equations and name each variable. That would be very inconvenient, as I already have a simple numpy array where each column is a variable and each row is a vector of constraint coefficients.

Does anyone know of a python interface to a MILP solver that takes matrices and vectors as input, instead of requiring to write everything as explicit text equations?

Thanks!

PS I should probably mention that I am looking for interfaces to non-commercial solvers.


EDIT adding my current conclusion, after following the advice from users, for future record and in case it helps someone else

In short: as pulp does the job and comes with a (by all accounts) very good LP solver (CBC), I gave in and accepted to rewrite my problem as a set of equations and inequalities, instead of using the matrix format.
It seems to work, it's indeed fast, at least for the problems I tried so far.
My advice to anyone trying to use it on large problems with many variables would be: look up LpAffineExpression, it's very useful to put together complicated constraints. But first you need to define your variables using LpVariable. A variable can also be a list of variables (that's what stumped me initially).

Before trying pulp, I tried cvxopt, which is easy to install, but then it's true, like the users below said, it has its own bizarre format, and GLPK unfortunately is not a very fast solver.

When I still tried to stick to my matrix input format, I tried cvxpy. On paper this is very good, and in fact would allow me to write the problem as a sum of squares min, which is in fact what my original problem was (I turned it into a LP by replacing sum of squares with sum of abs values, and converting that to LP by known techniques).
I got all the way to writing my objective and constraints in matrix format, and then the solver failed... in fact even the example problem published in the cvxpy documentation failed with the same error. Big waste of time.

Anyway, lesson learned: no point fighting: sometimes you just have to do what python wants you to do ;)

user6376297
  • 575
  • 2
  • 15
  • Can you update a small example and the expected value of each variable, please? – Corralien Sep 07 '21 at 09:16
  • Just write yourself a wrapper: matrix-form -> whatever-form-you need. This is trivial and keeps full freedom about picking your library. This is valuable, as all libraries have their downsides. – sascha Sep 07 '21 at 09:16
  • @sascha : thanks; thing is, I am quite new to python, as might have transpired from my post; so manipulating objects in complex ways to 'wrap' a matrix into whichever form the solver likes is not going to be 'trivial' at all for me :( In the meantime I found that lpSolve could be a solution, BUT then it only works with Python 2, apparently, and even there the installation is quite a palaver (and frankly from my previous work I also found lpSolve much slower than other solvers). I think for now I will have to stick to R. http://lpsolve.sourceforge.net/5.5/Python.htm – user6376297 Sep 07 '21 at 09:35
  • In fact I see now that `CVXOPT` might be an option. It does use matrices as input. http://cvxopt.org/userguide/coneprog.html#linear-programming . I cannot find any reference to how to specify that a variable must be binary or integer, but then by further digging: https://stackoverflow.com/questions/60243511/cvxopt-glpk-ilp-documentation . So one has to guess that a specific solver must be called, and then the options of that solver perhaps allow to specify variable types. This is *really much, MUCH harder* than with R :( But OK, made my bed by deciding to use python, now I must lie in it :( – user6376297 Sep 07 '21 at 10:08
  • As someone doing stuff like that (nearly) every day: really... don't pick a library because it looks like an easy route. I mean: cvxopt alone locks you in on GLPK (as solver) and CoinOR Cbc (arguably a much better solver) is out of reach then. And i'm not eving touching cxopt's own matrix data-structures (which are not compatible with numpy/scipy). If you want to switch to python, you must be able to convert a `Ax = b` form based on numpy data-structures to something like `m += xsum(w[i]*x[i] for i in range(n)) = c` (over each row, this is Python-MIP).If you struggle with this, i don't know... – sascha Sep 07 '21 at 10:11
  • Thanks @sascha ; the question is: how long does a person want to (or can) spend reinventing each and every individual wheel, to make something work in python? I agree, I need to learn this, and one day I may review my code, if I need more performant solvers. For now however I need to focus on getting things up and running asap, and test the concept. BTW I know the syntax you used, and I used it already in other scripts. What I do not like is that a package forces me to jump through N hoops and use a very awkward input mode for something that is matrix-like in its very mathematical nature. – user6376297 Sep 07 '21 at 11:33
  • BTW #2 : R has an LP package that so far from my own tests, is the absolute fastest I've ever seen, for a non-commercial solver. Much better than GLPK. – user6376297 Sep 07 '21 at 11:36
  • R is nothing special. All solvers are C / C++ / Fortran based and all libraries in all languages link to the exactly same set of solvers (there are only a very few which are good enough). If your R lib provides a better open-source solver than GLPK, it's Cbc. (with high chances) ;-) [Benchmarks](http://plato.asu.edu/ftp/milp.html) – sascha Sep 07 '21 at 11:46
  • Is it? https://cran.r-project.org/web/packages/Rsymphony/index.html . Anyway, we are not going to settle the R vs python dispute here. I guess both have their merits and issues. I can only say that the interfaces to MILP solvers offered by R, I find much less cumbersome and much easier to use, that's all. And as we speak I am coding my MILP in python using CVXOPT, so it's not like I do not *want* to do this in python. I would like to see a python aficionado grapple with R's syntax, then we'll laugh :) – user6376297 Sep 07 '21 at 12:41
  • 1
    Symphony is (imho) a toy-solver compared to Cbc. (There is a reason it's not part of the benchmarks i linked) But well... if it works it works... You can use the same matrix-based interfaces in Python: e.g. Cylp and cvxpy (`cp.Minimize(cp.sum_squares(A*x - b))`). But you will hit much harder roadblocks there not related to the APIs. The former: good luck building it if you are on Windows. The latter: good luck with open-source solver support (besides GLPK). – sascha Sep 07 '21 at 13:27
  • Exactly my point. In R, I can install and run more more or less whatever I want; here, well, you said it yourself : 'good luck'. I will see what comes out. Thanks for your advice. – user6376297 Sep 07 '21 at 13:30
  • 1
    Not sure if this is useful for you, but CVXR/CVXPY has a matrix interface (more flexible than just a single A,b,c matrix/vector representation). The R and Python versions are close in how you would set up a model. The main disadvantage (for me) is that it cannot handle multi-dimensional variables (like `X(i,j,k)`) but that does not seem a problem for your model. (Sidenote: for many practical models `min cx, Ax=b` is not a very useful representation. Hence most serious modeling tools allow some kind of algebraic input). – Erwin Kalvelagen Sep 08 '21 at 08:16
  • Thanks! I was just about to post about this. CBC is indeed available for R too: https://github.com/dirkschumacher/rcbc/ ; and input is in matrix format :) . and I am installing cvxpy https://www.cvxpy.org/install/index.html# and PySCIPOpt https://github.com/scipopt/PySCIPOpt . We will see. What I like better in R is that when I install 'rcbc' I have a clear function that runs CBC. In python there is a lot more complexity. But OK maybe it's its strength, in a way. BTW from the above benchmarks from sascha it looks like SCIP is better than CBC. – user6376297 Sep 08 '21 at 08:19
  • SCIP is great and open-source, but ! not free !. So for many people, it cannot be used. In academics, you would use Gurobi, Cplex, Mosek and co. anyway (at least if you don't want to hack on the solver-core whre SCIP is probably unbeatable!). – sascha Sep 08 '21 at 10:49
  • Indeed. I read the 'license' section in the SCIP documentation, and I understand that it's complicated. I'll leave it for now. I will see if I can get CBC to work in R first, and if it's faster than Symphony, I will switch to it. Might be enough for my relatively simple needs. Then for python, as you pointed out, it might be much harder, unfortunately. I can see python is a very powerful and versatile language, and some things are much easier than in R. However, installing anything a bit more advanced than the basic stuff... you need a PhD in computer science before you even start... :( – user6376297 Sep 08 '21 at 11:14
  • @ErwinKalvelagen : BTW, as cvxopt worked but was too slow, and my attempts to use cvxpy failed, I made the effort to transition to pulp and write the problem in terms of sets of equations. It was very hard, it took me quite some time to figure it out, and I am still somewhat unsure of the result, although an optimal solution is indeed obtained. Hopefully worth it. I guess getting used to the matrix format would be just as hard for someone who has always used the other method. – user6376297 Sep 09 '21 at 15:15
  • A note for future reference: I am reverting to using R's Rsymphony. Python's pulp using CBC as solver did work after the very complicated translation, but was far slower in the tests I ran, to the point it's not even close to being usable for the size of problem I am dealing with. – user6376297 Oct 04 '21 at 20:08

2 Answers2

1

There are some specifics missing and no real definition for how close to exactly what you have as input must be accepted, but Google's OR tools mention a method to define problems for the various solvers they wrap through arrays. This may be workable for you.

For 100s/1000s of variables, this is probably not an option, but I have added an answer to the scipy.optimize question with a toy implementation of branch-and-bound on top of linprog which takes it's args in arrays. If your problem is highly constrained, it might give some answers in a day or two :).

jspencer
  • 522
  • 4
  • 14
  • Thank you, this is interesting! In the meantime I 'surrendered' to the syntax required by pulp, and I have to say it's indeed easier than a matrix format, especially when sparse. As for the execution time, I found that pulp's CBC accepts a time limit parameter, and from some tests it emerged that the CBC solution gets very close to a very satisfactory objective in much less time than it takes to reach absolute optimality, so we are always using that. I am sure further developments are possible, but for now this fits our needs. – user6376297 Dec 05 '21 at 10:14
0

For future reference: Make sure you have at least pip install scipy>=1.9.3 and use from scipy.optimize import milp as explained here using the HiGHS solvers via method='highs'

ABC
  • 189
  • 9