Post by Ralf Gommers:
> I'm not a user of these optimization methods, so not sure. What would be
> helpful is if someone can make a proposal of what the scope of linear
> programming methods in SciPy should be, rather than adding them one by one.
> There's an argument to be made for having a consistent set of well-known
> algorithms in SciPy, but knowing the boundary of the scope here is
> helpful, especially if there are more powerful solutions like cvxopt that
> could become easy to install soon.
> https://github.com/conda-forge/cvxopt-feedstock exists, and once the
> OpenBLAS Windows issue is solved (I would bet on that happening within a
> year) there'll be win64 binaries.
I'm not an expert on LP methods; I've only picked up some information while
implementing this interior point method. But since
there haven't been any replies to your question, I'll attempt to answer,
and hopefully others will correct me. I'll take two approaches: first
identify the algorithms most popular in the literature, then look at the
algorithms used in the most popular software.
*In the literature...*
There seem to be two main branches of LP methods: basis-exchange algorithms
(like the simplex method currently used in scipy.optimize.linprog) and
interior point methods (like the primal-dual path following code I want to
contribute).
A moderately deep search into basis-exchange algorithms turns up only two
names: simplex and criss-cross, each with a number of variations. The
simplex method is the easier of the two to visualize: it starts at a vertex
of the polytope defined by the problem constraints, traverses an edge in a
direction that decreases the objective function, and continues until there
are no such edges. Because it has to start at a vertex (and only visits
vertices), the simplex method typically requires the solution of a "phase 1"
problem (often as time-consuming as the LP itself) just to find an initial
vertex. The
criss-cross method, on the other hand, is not restricted to feasible bases,
so it does not need to solve a "phase 1" problem and gets straight to work
on the LP itself. Unfortunately, this does not guarantee better
performance, as most criss-cross algorithms cannot guarantee that the
objective function is non-decreasing. Both have exponential complexity in
the worst case but fare much better in practice. There does not seem to be
consensus in the literature on whether one typically outperforms the other.
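For concreteness, here's what the simplex solver SciPy already ships looks
like in use; the toy problem data is made up for illustration:

    import numpy as np
    from scipy.optimize import linprog

    # Maximize x0 + 2*x1 subject to x0 + x1 <= 4 and x0 - x1 <= 2,
    # with the default bounds x >= 0; posed as a minimization by
    # negating the objective.
    c = np.array([-1.0, -2.0])
    A_ub = np.array([[1.0, 1.0],
                     [1.0, -1.0]])
    b_ub = np.array([4.0, 2.0])

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, method='simplex')
    print(res.x, res.fun)  # optimal vertex [0., 4.], objective -8.0

The solver walks from vertex to vertex of the feasible polygon until no
improving edge remains.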
The poor worst-case complexity of basis-exchange algorithms spurred a lot
of work on (provably polynomial-complexity) interior point methods. The
ellipsoid algorithm, Karmarkar's projective algorithm, and the affine scaling
method are a few names of historical importance, but I've seen several
sources (not just Wikipedia...) state that primal-dual path-following
algorithms are the preferred choice today. In a sense, the approach turns
the linear programming problem with equality and inequality constraints
into a nonlinear programming problem with equality constraints (only). The
objective function is augmented with a "logarithmic barrier" that ensures
plenty of slack in the inequality constraints, and the (mildly nonlinear)
KKT conditions for this problem are iteratively solved as the logarithmic
barrier is gradually removed. I like to visualize it as if there is a force
pushing the current iterate point toward the center of the polytope while
another pulls it in the direction of decreasing objective, but I imagine
this is even more simplified/bastardized than my description of the simplex
method above.
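To make the barrier idea concrete (this is the standard textbook form,
nothing specific to my code): for an LP in standard form, min c^T x subject
to Ax = b, x >= 0, the barrier subproblem is

    \min_x \; c^T x - \mu \sum_j \ln x_j \quad \text{subject to} \quad Ax = b,

with barrier parameter \mu > 0. Writing z for the dual variables of the
bounds x >= 0, the KKT conditions replace the complementarity condition
x_j z_j = 0 with the mildly nonlinear x_j z_j = \mu, and driving \mu toward
zero traces the "central path" to the LP optimum.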
Between the basis-exchange and interior point branches, I haven't seen any
claims that one is generally superior in practice. While the worst-case
polynomial complexity of interior point methods is obviously better than
the worst-case exponential complexity of simplex, both tend to scale
approximately linearly with problem size for practical problems. Which of
the two performs better depends on the particulars of the problem.
*In software...*
Sandia published a "Comparison of Open-Source Linear Programming Solvers"
<http://prod.sandia.gov/techlib/access-control.cgi/2013/138847.pdf>, and
Hans Mittelmann maintains benchmarks of optimization software at
<http://plato.asu.edu/bench.html>.
Here are the names of the algorithms used in each. Basically there are just
three: primal simplex, dual simplex (which can be viewed as the primal
simplex method applied to the dual problem; a primal/dual pair is written
out after the list), and primal-dual path following.
CVXOPT <http://www.seas.ucla.edu/~vandenbe/publications/coneprog.pdf>:
primal-dual path following
QSOPT <http://www.math.uwaterloo.ca/~bico/qsopt/downloads/users.pdf>:
primal simplex, dual simplex
COIN-OR <https://www.coin-or.org/Clp/faq.html>: primal simplex, dual
simplex, primal-dual path following
<https://www.coin-or.org/Doxygen/Clp/classClpInterior.html>
GLPK <https://en.wikipedia.org/wiki/GNU_Linear_Programming_Kit> (see this
<http://www.maximalsoftware.com/solvers/glpk.html> also): primal simplex,
dual simplex, primal-dual path following
lp_solve <http://lpsolve.sourceforge.net/5.5/>: simplex
MINOS <http://www.sbsi-sol-optimize.com/asp/sol_products_minos_desc.htm>:
primal simplex
GUROBI <http://www.gurobi.com/products/features-benefits>: "Primal and dual
simplex algorithms, parallel barrier algorithm with crossover, concurrent
optimization, and sifting algorithm" (Parallel refers to the use of
multiple cores. Barrier probably refers to the logarithmic barrier function
used in the "primal-dual path following" above. Crossover seems to refer to
the generation of the optimal basis, as that is not automatically produced
by interior point methods. Concurrent optimization
<https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.cplex.help/CPLEX/UsrMan/topics/parallel_optim/multi_threaded_parallel/11_concurrent.html>
seems to refer to running different algorithms simultaneously on separate
cores. Sifting is described here
<https://www.ibm.com/support/knowledgecenter/SSSA5P_12.7.0/ilog.odms.cplex.help/CPLEX/UsrMan/topics/cont_optim/simplex/10_sifting.html>.
I don't think there are any fundamentally different algorithms here; just
extensions.)
CPLEX <https://en.wikipedia.org/wiki/CPLEX>: primal simplex, dual simplex,
barrier interior point (which, if they're going to be so vague, can be
considered synonymous with primal-dual path following)
XPRESS <http://www.solver.com/linear-quadratic-technology#Active Set Method>:
primal simplex, dual simplex, and primal-dual path following. (The
active-set method for SQP is also listed.)
MOSEK <https://en.wikipedia.org/wiki/MOSEK>: primal simplex, dual simplex,
and primal-dual path following
Matlab
<https://www.mathworks.com/help/optim/ug/linear-programming-algorithms.html>:
dual simplex (I suspect that it actually chooses between primal and dual
automatically), primal-dual path following (two variants)
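Since dual simplex comes up in nearly every entry above: for reference, a
symmetric primal/dual pair (standard form, nothing solver-specific) is

    \min_x \; c^T x \ \text{subject to} \ Ax \ge b, \ x \ge 0
    \qquad \text{and} \qquad
    \max_y \; b^T y \ \text{subject to} \ A^T y \le c, \ y \ge 0,

and applying primal simplex to the problem on the right is what "dual
simplex" accomplishes on the problem on the left.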
*My conclusion:*
While there are lots of variants, there are only two major approaches:
simplex and primal-dual path following. Though I'm biased (I've already
written the code and want to contribute it), I hope I've shown that,
compared to popular software, SciPy is somewhat incomplete without an
interior point method, and would be relatively complete with one.
I can't predict whether faster, free solvers will become more accessible
from Python. If that does happen, would linprog be removed from SciPy? If
not, I suggest adding the interior point method now.
In my next email I'll compare the performance of my interior point code to
the current simplex implementation. In short, I've tried four types of
randomly generated problems and two non-random but scalable problems at
several scales, and the interior point code is faster for all of them. I'm
actually very surprised, but I have two explanations:
* interior point typically requires fewer iterations than simplex, which
means less overhead in calling compiled code (and more time spent *in*
compiled/optimized code)
* the interior point code is written to leverage scipy.sparse matrices and
routines (see the sketch below)
If you have benchmark problems already written for linprog that you'd like
me to test (beyond those in the unit tests), please let me know.
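In the meantime, here is a rough sketch of the kind of random sparse
problem I have in mind. Caveats: the method name 'interior-point' is just
my working label for the proposed solver (it isn't in SciPy yet), I'm
assuming it will accept scipy.sparse constraint matrices directly, and the
generator is illustrative rather than my actual benchmark harness:

    import numpy as np
    import scipy.sparse as sps
    from scipy.optimize import linprog

    rng = np.random.RandomState(0)
    m, n = 200, 400

    # Random sparse equality-constrained problem with a known strictly
    # positive feasible point, so it is feasible by construction;
    # c >= 0 together with the default bounds x >= 0 keeps it bounded.
    A_eq = sps.random(m, n, density=0.05, format='csc', random_state=rng)
    b_eq = A_eq.dot(rng.rand(n) + 0.5)
    c = rng.rand(n)

    # Hypothetical call: assumes the proposed solver lands as
    # method='interior-point' and accepts sparse A_eq as-is.
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, method='interior-point')
    print(res.status, res.fun)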
Matt
--
Matt Haberland
Assistant Adjunct Professor in the Program in Computing
Department of Mathematics
7620E Math Sciences Building, UCLA