I want to solve a linear program in Python. The number of variables (I will call it N from now on) is very large (~50000), and in order to formulate the problem in the way scipy.optimize.linprog requires it, I have to construct two N x N matrices (A and B below). The LP can be written as
minimize:    c.x
subject to:  A.x <= a
             B.x = b
             x_i >= 0 for all i in {0, ..., N-1}
where . denotes the dot product and a, b, and c are vectors of length N.
My experience is that constructing such large matrices (A and B each have approx. 50000 x 50000 = 2.5*10^9 entries) comes with some issues: if the hardware is not very strong, NumPy may refuse to construct such big matrices at all (see for example Very large matrices using Python and NumPy), and even if NumPy creates the matrices without problems, there is a huge performance issue. This is natural given the huge amount of data NumPy has to deal with.
However, even though my linear program comes with N variables, the matrices I work with are very sparse. One of them has entries only in the very first row, the other one only in the first M rows, with M < N/2. Of course I would like to exploit this fact.
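To illustrate that sparsity pattern, here is how the two matrices could be built with scipy.sparse instead of dense NumPy arrays (the concrete coefficients, the diagonal pattern in B, and the value of M are made up for illustration):

```python
import numpy as np
from scipy.sparse import csr_matrix

N, M = 50_000, 10_000

# A: nonzeros only in row 0 (coefficients are placeholders)
rows = np.zeros(N, dtype=int)
cols = np.arange(N)
vals = np.ones(N)
A = csr_matrix((vals, (rows, cols)), shape=(N, N))

# B: nonzeros only in the first M rows, here a diagonal placeholder
B = csr_matrix((np.ones(M), (np.arange(M), np.arange(M))), shape=(N, N))

print(A.nnz, B.nnz)  # 50000 and 10000 stored entries instead of 2.5e9 each
```

In this form the matrices take kilobytes to megabytes instead of ~19 GiB each.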
As far as I have read (e.g. Trying to solve a Scipy optimization problem using sparse matrices and failing), scipy.optimize.linprog does not work with sparse matrices. Therefore, I have the following questions:
- Is it actually true that SciPy does not offer any possibility to solve a linear program with sparse matrices? (If not, how can I do it?)
- Do you know of any alternative library that will solve the problem more effectively than SciPy with non-sparse matrices? (The library suggested in the thread above does not seem flexible enough for my purposes, as far as I understand its documentation.)
- Can it be expected that a new implementation of the simplex algorithm (in plain Python, no C) that exploits the sparsity of the matrices would be more efficient than SciPy with non-sparse matrices?
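For reference, here is a minimal toy version of the setup I have in mind, with tiny dense matrices so it actually runs (N and all coefficients are made up; the real problem only differs in size and in the concrete entries):

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance of the LP structure above (N kept tiny so it runs):
# minimize c.x  s.t.  A.x <= a,  B.x = b,  x >= 0
N = 4
c = np.array([1.0, 2.0, 3.0, 4.0])

A = np.zeros((N, N))
A[0, :] = 1.0                         # entries only in the very first row
a = np.array([10.0, 0.0, 0.0, 0.0])

B = np.zeros((N, N))
B[0, 0] = B[0, 1] = 1.0               # entries only in the first M rows (M = 1 here)
b = np.array([2.0, 0.0, 0.0, 0.0])

res = linprog(c, A_ub=A, b_ub=a, A_eq=B, b_eq=b)  # default bounds are x >= 0
print(res.status, res.fun)            # optimum is x = (2, 0, 0, 0), objective 2
```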