In the previous section we saw how the feasible region of a linear program is constructed as the intersection of a number of halfplanes, one for each constraint. Using this insight we developed a graphical approach for solving 2D LPs. Together with Fourier-Motzkin elimination as a means for reducing the dimensionality of an LP, this gave us a general method for solving linear programs.

Unfortunately Fourier-Motzkin produces a rather large number of constraints, so it is unsuited for solving problems with many variables.

In this section I will talk more about the properties of the feasible region. With more knowledge about the feasible region, we’ll eventually be able to understand a very popular method for solving linear programs, the Simplex Method.

Let us start by introducing some notation and some terms.

Already in the last part we started writing the variables x1,,xnx_1,\ldots, x_n as a column vector x\vec x and the constants aija_{ij} as a row vector ai\vec a_i. We did the same with the cost function, writing the weights for each variable as a row vector c\vec c. This way we could use the dot product instead of writing sums explicitly.

aix=ai1x1++ainxn \vec a_i \cdot \vec x = a_{i1}x_1+\ldots+a_{in}x_n

The constraints can be written using a matrix vector product. We write the ajia_{ji} in a matrix AA like so

A=(a11a12a1na21am1amn) A=\left(\begin{aligned} a_{11} & a_{12} & \ldots & a_{1n}\\ a_{21} & \ldots \\ \vdots & \\ a_{m1} & \ldots & & a_{mn} \end{aligned}\right)

Note that the rows of AA are simply the vectors ai\vec a_i.

Now we can write the constraints simply as AxbAx\le b, where bb is a column vector of the bjb_j. Summarizing we get the general form of a linear program:

minimize:cxsubject to:Axb\begin{aligned} \text{minimize:}& \vec c \cdot \vec x \\ \text{subject to:} & A \cdot \vec x \le \vec b\end{aligned}

Note that it really doesn’t matter whether we use \le or \ge. Just multiplying by 1-1 switches a constraint around without changing it.

Def A polyhedron is a set in Rn\text{R}^n whose members obey a set of linear inequalities

{xRn|Axb}ARm×n,bRm\{x\in \text{R}^n | Ax \geq b\} \qquad A\in \text{R}^{m\times n},\ b\in \text{R}^m

If the region is bounded (i.e. it has a finite volume) it can also be called polytope. We say nn is the dimensionality of the polyhedron.

I already used the word halfspace without giving a formal definition. Let us remedy this.

Def Let a,xRna,x\in \text{R}^n, a0a\neq 0.

  1. {x|ax=b}\{x|ax=b\} is a hyperplane (a line in 2D)
  2. {x|axb}\{x|ax\geq b\} is a halfspace (halfplane in 2D)

With this definition we can say that a polyhedron is an intersection of a bunch of halfspaces.

Corners of Polyhedra

A corner of a nn-dimensional polyhedron is, intuitively, a point where nn edges meet. I will give a bunch of different definitions and them prove them to be equal.

The simplest definition uses a line. A corner of a polyhedron is a point pp in the polyhedron where we can find a line that touches the polyhedron only at pp.

Def Let PP be a polyhedron. A vector xPx\in P is a vertex of P if cRn\exists \vec c\in \text{R}^n s.t. cx<cycx < cy for all yP,yxy\in P, y \neq x; that is, xx is the minimal point for some cost vector cc (the unique optimal solution for some LP with the feasible set P). See figure [Fig:vertex]

[Fig:vertex] A vertex x is the optimal solution for a cost vector c

[Fig:vertex] A vertex xx is the optimal solution for a cost vector cc

Corners are interesting for optimization because the converse is also kind of true, at least for bounded polyhedra. For any cost vector cc, we can find a vertex xx of the (bounded) polyhedron such that cxcycx \le cy for all points yy in the polyhedron. There is no strict inequality here because the line defined by cc might be parallel to one of the edges of the polyhedron. If the polyhedron is not bounded, there are some cc such that for any point yy there is a point yy' such that cy<cycy' < cy, that is, the optimal value for this cost vector is unbounded. These are of course the cost vectors that define a line that doesn’t leave the polyhedron.

However, we need some more definitions and theorems before we can prove the above statement.

Def An extreme point of a polyhedron P is a vector xPx\in P s.t. xx is not a convex combination of any two distinct vectors y,zPy,z\in P different from xx.

Convex combinations of two points x,yx,y are all points zz for which the equation λx+(1λ)y=z\lambda x + (1-\lambda) y = z has a solution for λ\lambda in [0,1][0,1]. Geometrically, the points zz lie on a line between xx and yy. You can also take the convex combinations of more than two points. The principle is the same though, you add all the points, scaling each one with a non-negative scaling factor λi\lambda_i. The combination is convex if the λi\lambda_i sum to 1.

Example: In 2D we can always select the two adjacent corners of a point xx on the edge of PP if and only if xx is not a corner. Then xx will be on the line between the two corners.

Def Let PP be a polyhedron that is defined by some linear inequalities aia_i: P={x|aixbi}P=\{x|a_ix\geq b_i\}. We’ll say that the ii-th constraint is active at a point xx if we have equality: aix=bia_ix = b_i

The constraints define the edges of polyhedron. If a point is on an edge that constraint is active. Intuitively the point must be a corner if it lies on the intersection of nn edges.

Def Let PRnP\in \text{R}^n be a polyhedron in standard form and let x*Rnx^{*}\in \text{R}^n. The vector x*x^{*} is a basic solution if at x*x^* all equality constraints are active and there are nn active constraints that are linearly independent. See solution AA in figure [Fig:bsfVSbs] for an example of a basic solution. If a basic solution is also feasible, we call it a basic feasible solution (b.f.s.). See the point BB in figure [Fig:bsfVSbs] for an example of a basic feasible solution.

[Fig:bsfVSbs] Some LP. Solution A is a basic solution. Solution B is a basic feasible solution.

[Fig:bsfVSbs] Some LP. Solution AA is a basic solution. Solution BB is a basic feasible solution.

A polyhedron with fever than nn constraints never has a basic solution.

Now we want to prove that the three definitions are all equivalent to each other.

Theorem Let PP be a polyhedron and xPx\in P. The following statements are equivalent

  1. xx is a vertex
  2. xx is an extreme point
  3. xx is a basic feasible solution

Proof

We show 1.) \Rightarrow 2.) \Rightarrow 3.) \Rightarrow 1.).

The shape of polyhedra

So far we’ve shown that the extreme points of a polyhedron, i.e. its corners, are also vertices, that is, optimal solutions for some cost vector. However, we want to be able to say that for every cost vector we can find an optimal solution that is also an extreme point.

In this section we’ll learn that polyhedra are so called convex sets and, for bounded polyhedra, every point inside the set can be expressed as a linear combination of the extreme points. Recall that a vector xx is a linear combinations of some vectors y1,,yny_1,\ldots, y_n if you can find constants λ1,,λn\lambda_1,\ldots, \lambda_n such that x=iλiyix=\sum_i \lambda_i y_i, or in vector notation x=λyx=\lambda \cdot y.

Def A subset SRnS\subseteq R^n is called convex if for any to points x,ySx,y\in S the line connecting xx and yy is also in SS. That is

λ[0,1],x,yS:λx+(1λ)yS\forall \lambda \in [0,1], \forall x,y\in S: \lambda x + (1-\lambda)y \in S

Theorem Polyhedra are convex sets.

Proof This follows directly from the definition. Recall that we defined a polyhedron as the set of points

{xRn|Axb}\{x\in \text{R}^n | Ax \geq b\}

for some matrix A. Since matrix multiplication is linear we have for some y1,y2y_1,y_2 from PP that

A(λy1+(1λ)y2)λb+(1λ)b=bA\cdot (\lambda y_1 + (1-\lambda) y_2) \ge \lambda b + (1-\lambda) b = b

Note that the notion of convex combination can be extended to sets of more than two points trivially. It is a simple exercise to show that this doesn’t change what it means to be a convex set. I will just use it.

Def The convex hull of a set of vectors X=x1,,xnX=x_1,\ldots, x_n is the set

CH(X)=λi=1{i=1nλixi}\text{CH}(X)=\bigcup_{\sum \lambda_i=1} \left\{\sum_{i=1}^n \lambda_i x_i\right\}

That is, the convex hull is the union of all convex combinations that can be formed from the vectors in XX.

Intuitively the convex hull is the set you get by spanning a tight rubber band around the vectors of XX.

We come to the central result of this section. The next theorem shows that the extreme points of a polyhedron span the whole polyhedron. This is what allows us to only look at the extreme points when looking for an optimal solution to a LP.

Theorem Let PP be a non-empty bounded polyhedron and let EE be the set of extreme points of PP. Then P=CH(E)P = \text{CH}(E)

Proof We show both directions. First we show CH(E)P\text{CH}(E) \subseteq P. This direction is particularly easy. We already showed that polyhedra are convex sets, hence any convex combination of points from PP still lies in PP.

Now we show the other direction PCH(E)P \subseteq \text{CH}(E). To do so we show that an arbitrary point xPx\in P can be expressed as a convex combination of extreme points. The proof of this is very similar to the proof we gave above that extreme points are basic feasible solutions.

Consider AxbAx \ge b. We rearrange the rows of AA, xx, and bb such that the first kk inequalities are tight. We can decompose AA into two matrices BB and CC such that A=B+CA=B+C by taking BB as the first kk rows of AA (and fill it with 0) and CC as the last nkn-k rows of AA (and fill with 0 from the top).

If the rank of BB is nn, then xx is a basic feasible solution (i.e. an extreme point by the theorem above) and we’re done. Otherwise we can find some vector yy in the kernel of BB, i.e. By=0By=0. Since the equalities defined by CC are not tight, we can move xx by some ϵ1\epsilon_1 in the yy direction without leaving the polyhedron. We chose ϵ1\epsilon_1 such that at least one constraint of CC becomes tight for x=x+ϵ1yx=x+\epsilon_1 y. There also has to be an ϵ2\epsilon_2 that takes us to a boundary in the opposite direction. Let x2=xϵ2yx_2=x-\epsilon_2 y.

As in the previous proof xx can be expressed as a convex combination of x1x_1 and x2x_2. If x1x_1 and x2x_2 are extreme points, we’re done. Otherwise we can repeat the same process for x1x_1 and x2x_2, only this time the rank of BB is one greater than before. Eventually we will reach full rank and hence a set of extreme points that can express xx as a convex combination.

Now that we know that every point in the polyhedron can be expressed as a convex combination of extreme points, we can use this fact to show that for every const vector cc we can find an optimal point that is also an extreme point.

Corollary Let PP be a non-empty bounded polyhedron. Then the LP min cx\text{min } cx such that xPx\in P has an optimal solution that is an extreme point.

Proof Let x*x^* be an optimal solution and let v=cxv=cx be the optimal objective value. By the previous theorem we can write x*x^* as a convex combination of extreme points

x*=λiyi.x^* = \sum \lambda_i y_i.

We want to show that there is a yiy_i such that cyi=cx*c y_i= c x^*. We already know that v=cx*v=cx^* is minimal, so we know that cyivcy_i\geq v for all ii. Since the λi\lambda_i must sum to 1, we can conclude that for at least one yiy_i we actually have cyi=vcy_i=v. Otherwise can can quickly derive the following contradiction:

v=cx*=cλiyi>cλix*=λicx*=vλi=vv=c x^* = c\sum \lambda_i y_i > c\sum \lambda_i x^* = \sum \lambda_i cx^* = v\sum \lambda_i = v

Now we have all ingredients for an algorithm. We have shown that for every cost vector there is an extreme point that is an optimal solution. Therefore we can find the optimal solution to a LP by enumerating extreme points. We also know that extreme points are basic feasible solutions, that is, points where the matrix of active constraints has full rank. That is, once we found one basic feasible solution we can move to another one by manipulating a matrix – throwing out one active constraint and replacing it by a different one. This is the basic strategy of the Simplex method, that we’ll discuss in the next section.

(The next section is coming when I get around to writing it.)

Click here to go back to the index


CC-BY-SA Adrian Neumann (PGP Key A0A8BC98)

adriann.github.io

RSS