Recall from the introduction that a linear program is defined by a linear objective function f(x)=c1x1+c2x2++cnxnf(x) = c_1x_1+c_2x_2+\ldots+c_nx_n and a set of constraints ai1x1+ai2x2+ainxnbia_{i1}x_1+a_{i2}x_2+a_{in}x_n \le b_i for 1im1\le i\le m. For reasons that will become clear in a few paragraphs (and even clearer in the next part of this series), I will write the variables as a column vector x=(x1,,xn)T\vec x=(x_1,\ldots, x_n)^\text{T}, where I use the superscript TT to indicate transposition so that I can write column vectors in the rows of this text.

When we treat the variables as a vector like this, it makes sense to call the number of variables the dimension of the LP. I will call vectors that satisfy all constraints feasible. If there is no such vector, the problem is infeasible.

Solving 1D Linear Programs by combining constraints

Solving 1-dimensional LPs is trivial. We have just one variable xx. The cost function either becomes bigger as xx increases or smaller, so we immediately know whether we’re looking for the biggest xx that satisfies the constraints or the smallest. The set of feasible xx is then the intersection of the rays defined be the constraints. For example

min3xs.t.x3x5x7\begin{aligned} \min & 3x \\ \text{s.t.} & x\ge 3 \\ & x\ge 5 \\ & x\le 7 \end{aligned}

The constant before the xx in the cost function is positive, so in order to minimize the objective value we need to minimize xx. The constraints can be simplified to 7x53x[5,7]7 \ge x \ge 5 \ge 3 \rightarrow x\in [5,7] and thus the optimal value is x=5x=5.

Solving 2D Linear Programs by Eye-Balling

In two dimensions we can use a very similar strategy. The cost function still tells us in which direction to move x\vec x and the intersection of the constraints gives us a feasible region. We use a graphical approach to see the set of feasible values for xx.

If you write an equals sign like this ai1x1+ai2x2+ainxn=bia_{i1}x_1+a_{i2}x_2+a_{in}x_n = b_i, you get an equation for a line. If you write the constants as a vector a=(ai1,,ain)\vec a=(a_{i1}, \ldots, a_{in}), you can write the line equation using the dot product: ax=bia\cdot x=b_i. Note that aa is orthogonal to the line. Why is that so? Take some xx that lies on the line, i.e. it solves the above equation. Moving xx by some amount zz along the line results in a vector that lies on the line and hence still satisfies the equation. The dot product is distributive so we can write a(x+z)=ax+aza \cdot (x+z)= a\cdot x + a\cdot z. We know ax=bia\cdot x=b_i (since this is how we chose xx), so the second summand must be zero. When the dot product is zero, the two vectors are orthogonal.

Now consider what happens to the equation when we take an xx from the line and move it by some zz that is not parallel to the line. It depends on aza\cdot z. If the dot product is positive, the left hand side becomes too big for the equality, otherwise it becomes too small. The dot product is positive if zz points somewhat in the same direction as aa (more formally, zz can be decomposed in two vectors, zz' and zz'', such that zz' is orthogonal to aa and zz'' is λa\lambda a for some positive real λ\lambda).

So if we take the original inequality, it cuts the 2D plane in two parts, left and right from the line. Which part satisfies the inequality depends on the direction of aa and whether we have a \ge or a \le. The set of points that satisfy the inequality is called a halfplane.

The feasible region of the problem, i.e. all the points that satisfy all constraints, is then the intersection of all the halfplanes.

The cost function also defines a line, or rather, a family of lines, one for each objective value, namely the solutions for f(x)=zf(x)=z for each objective value zz. If the intersection of this line with the feasible region is not empty, there is a solution (any point in the intersection) with cost zz. The cost vector cc is orthogonal to the line defined by ff. The goal is to shift the line as far in negative cc direction as possible without leaving the feasible region. Take for example the following LP

Fig:graphSolutionEx An example for a graphical solution of an LP. The optimal solution is (3,2)

Fig:graphSolutionEx An example for a graphical solution of an LP. The optimal solution is (3,2)

minx+ys.t.2x+y82x+3y12\begin{aligned} \min & x+y\\ s.t. &2x+y\geq 8\\ &2x+3y\geq 12\end{aligned}

The cost vector is c=(1,1)c=(1,1) and the two constraints define two half-planes. For every value zz of the cost function, we have a line x+y=zx+y=z. See figure [Fig:graphSolutionEx]. We want to find the smallest value of zz such that (x,y)(x,y) is a feasible solution. So we start with some arbitrary (x,y)(x,y) in the feasible region (picking it is the eye-balling step) and then gradually move it in c-c direction (i.e. we subtract small multiples of cc). Once we find that we can’t move our solution any further without leaving the feasible region, we have reached the optimum.

A somewhat weaker operation than picking a feasible solution to LP is picking an objective value zz and finding out whether this objective value can be achieved.

To do so, we can add a new constraint, namely f(x)zf(x) \le z. This constraint adds another half-plane that constrains the feasible region. If this makes the feasible region empty, there is no solution with an objective value at most zz. If the feasible region still contains more than one point, then we can decrease zz a little more. We have reached the optimal zz if the feasible region contains only one point. Now, if we had a method to check whether a feasible region is empty or not, this would give us an algorithm to solve linear programs with O(log(zopt))O(\log (z_{\text{opt}})) many evaluations of the feasibility algorithm.

Fourier-Motzkin Elimination

Now that we know how to solve 1D LPs rigorously and 2D LPs with a graphical method, let us try to find a method that works for arbitrary dimensions. As we already can solve low-dimensional LPs, it makes sense to try and get there via dimensionality reduction.

Fourier-Motzkin elimination is a method to reduce the dimension of an LP by one without changing feasibility. If we keep reducing the dimension one by one, we eventually reach the 1D case, where we can test feasibility easily. The objective value is disregarded by this method, we only care about feasibility. By adding an additional constraint as described above we can transform the optimization problem to a feasibility problem. In an exercise you will show how to reconstruct an optimal solution.

The method works by rearranging the constraints to solve for one variable and then introducing enough new constraints to make the variable unnecessary. As an example we will use this 2D LP.

maxys.t.2x+7y284x2y20x+y6y0 \begin{aligned} \text{max} & y \\ \text{s.t.} & 2x + 7y &\le 28 \\ & 4x - 2y &\ge 20 \\ & x + y & \ge 6\\ & y &\ge 0 \end{aligned}

Exercise: Draw the feasible region for this LP.

We want to eliminate yy. So first we rearrange all constraints to have just yy on the left side. We get

y42x/7y10+2xy6xy0\begin{aligned} y & \le 4 - 2x/7\\ y &\le -10 + 2x\\ y &\ge 6 - x\\ y &\ge 0 \end{aligned}

Now we can combine the constraints where yy\le \ldots with the constraints where yy\ge \ldots. As we have two of each, we get four constraints.

42x/7y6x42x/7y010+2xy6x10+2xy0 \begin{aligned} 4 - 2x/7 &\ge y \ge 6 - x \\ 4 - 2x/7 &\ge y \ge 0\\ -10 + 2x &\ge y \ge 6 - x \\ -10 + 2x &\ge y \ge 0 \end{aligned}

Exercise Add these constraints to your picture.

We can simplify to

5x/722x/743x162x10 5x/7 \ge 2 \quad 2x/7 \le 4 \quad 3x \ge 16 \quad 2x \ge 10

which further simplifies to

x14/5x14x16/3x5 x \ge 14/5 \quad x \le 14 \quad x \ge 16/3 \quad x \ge 5

and lastly

max(14/5,16/3,5)xmin(14). \max(14/5,16/3,5) \le x \le \min(14).

If you did the exercise above correctly, you will see that projecting the feasible region down to the xx axis leaves you with the interval 5x145\le x \le 14. Note that the simplification we did to the constraints after eliminating yy was in fact the procedure to eliminate xx. A 0-dimensional LP has no variables, just a bunch of inequalities involving numbers and the feasibility check is just some calculation.

Exercise: Instead of yy, eliminate xx. Does your result match your picture?

In general Fourier-Motzkin elimination works as follows:

  1. Pick a variable yy to eliminate.
  2. Rearrange all constraints to have only yy on the left side. You get three kinds on inequalities:
    1. Inequalities that don’t contain yy
    2. Inequalities that bound yy from above.
    3. Inequalities that bound yy from below.
  3. Keep type 1 inequalities.
  4. For each type 2 inequality, take all type 3 inequalities and write down the combined inequality.

Exercise: Use Fourier-Motzkin Elimination to find not just the optimal objective value, but also the optimal solution.

Let’s try to prove that Fourier-Motzkin Elimination is correct, that is, it reduces the dimension by one, without changing satisfiability. Let us first define the projection of a set of vectors. Since we can choose the order of our variables without changing the problem, it suffices to deal with the case where we drop the last coordinate of each vector.

Definition Let x=(x1,,xn)\vec x = (x_1, \ldots, x_n) be a vector. Then π(x)=(x1,,xn1)\pi(\vec x) = (x_1,\ldots,x_{n-1}) is the projection of xx on the first n1n-1 coordinates. For a set of vectors SS, let π(S)={π(x)|xS}\pi(S) = \{\pi(x) | x \in S \} be the set where we apply π\pi on every element.

It is easy to see that for any non-empty set SS, π(S)\pi(S) is also non-empty. So if SS is the feasible set of our linear program, projecting it down doesn’t change feasibility. So far so good. Unfortunately we don’t have the feasible region given as a point set, it’s given by the inequalities. Hence we need to use Fourier Motzkin elimination instead of just dropping a coordinate. Next we will show that Fourier Motzkin elimination indeed computes the projection.

Proof Let SS be the feasible region of our LP, and let QQ be the feasible region of the LP after we remove the last variable xnx_n via Fourier Motzkin Elimination. We show Q=π(S)Q=\pi(S). We do so by proving mutual inclusion, that is π(x)Qπ(x)π(S)\pi(x)\in Q \Rightarrow \pi(x) \in \pi(S) and π(x)π(S)π(x)π(Q)\pi(x) \in \pi(S) \Rightarrow \pi(x) \in \pi(Q)

  1. π(x)π(S)π(x)Q\pi(x) \in \pi(S) \Rightarrow \pi(x) \in Q. That means we have a point π(x)\pi(x) in π(S)\pi(S) and we want to find a π(x)\pi(x) in QQ that corresponds to it. In SS the vector xx has another component, so let us write in a slight abuse of notation x=(π(x),xn)x = (\pi(x),x_n). We don’t know xnx_n (and there are in general many xSx\in S that get mapped to π(x)\pi(x)) but we know that it had to satisfy the constraints.

    There are three kinds of constraints in our LP after we rearrange the inequalities to have xnx_n only on the left hand side, as explained above. Type 1 constraints don’t contain xnx_n and hence have no influence on xnx_n, so if xx satisfies them π(x)\pi(x) also satisfies them. For type 2 and type 3 inequalities, we know that there is an xnx_n that satisfies all of them.

    We constructed QQ by combining type 2 and type 3 constraints. Let T2T_2 be the r.h.s. of the tightest type 2 constraint, that is, the constraint where the right hand side evaluates to the smallest number when we plug in π(x)=x1,,xn1\pi(x) = x_1,\ldots,x_{n-1}, and similarly let T3T_3 be the r.h.s. of the tightest type 3 constraint. Note that then T3T2T_3 \le T_2 is the tightest constraint for π(x)\pi(x) that we have in QQ. Since xnx_n satisfies all constraints, we know T3xnT2T_3 \le x_n \le T_2 and hence T3T2T_3 \le T_2. Thus π(x)\pi(x) satisfies the constraints for QQ and hence π(x)Q\pi(x) \in Q.

  2. π(x)Qπ(x)π(S)\pi(x)\in Q \Rightarrow \pi(x) \in \pi(S). For this direction also we have to choose xnx_n so that x=(π(x),xn)x=(\pi(x),x_n) satisfies the constraints for SS. As in the other direction let T2T_2 be the r.h.s. of the tightest type 2 constraint, and let T3T_3 be the r.h.s. of the tightest type 3 constraint. Then we can simply choose any xn[T3,T2]x_n \in [T_3, T_2]. This interval can’t be empty, because we have the constraint T3T2T_3 \le T_2 in our constraint set for QQ and π(x)\pi(x) satisfies all of those constraints.

And this concludes our proof.

So with Fourier Motzkin elimination we have a method of checking feasibility for Linear Programs: Eliminate dimensions until no variables remain and do the math to see whether each inequality is satisfied. Together with your method for transforming optimization to repeated feasibility, we can now solve Linear Programming. Unfortunately Fourier Motzkin Elimination gets pretty expensive for large dimensions as you’ll see in the next exercise.

Exercise What’s the runtime of an nn-step Fourier Motzkin Elimination? Hint: How many constraints do you introduce in each step?

Next part: Polyhedra
Click here to go back to the index


CC-BY-SA Adrian Neumann (PGP Key A0A8BC98)

adriann.github.io

RSS