How to Start Modeling with Integer Programming

11 min readMar 31, 2016

In this post, we will introduce, from a very qualitative perspective, the integer programming. We will try to avoid all the unnecessary technical details required to understand the potential behind this technique.

What is an Integer Programming Model?

Roughly speaking, we can define three main pillars that hold an integer programming model: (i) a set of variables, (ii) some constraints and (iii) an objective function. Usually, these variables are linearly related in the constraints and in the objective function. That is the reason why these models are referred to as integer linear programming. Further details as to these pillars are included below:

It should be highlighted here that a predefined subset of variables can only take integer values. In addition, we can delimit the values of these variables between an upper and lower bound. If there are variables that are not explicitly imposed to be integer, we add another term to the name, namely mixed-integer linear programming.
The constraints are usually linear inequalities or equalities
Finally, the objective function is nothing but a linear combination of the variables. This function can be maximized or minimized depending on the user’s demands.

After solving the equations, we obtain the value of the variables leading to the optimum of the objective functions. In essence, these models are solved using algebraic techniques (e.g. simplex algorithm[1]) coupled with algorithms that ensures the integrality of the corresponding variables (for example branch-and-bound[2] or heuristic algorithms[3] among others)

We can find very sophisticated solvers able to deal with models accounting for millions of variables and constraints (http://www-03.ibm.com/software/products/en/ibmilogcpleoptistud/ http://www.gurobi.com/). In addition, we can also resort to open source solvers that can also account for large-scale problems (www.coin-or.org/).

In the vast majority of the scenarios, solving the problem is not an issue. Nevertheless, if solving the model exceeds the available computational resources, most of the commercial solvers can be tuned so as to adapt the mathematical solving procedures to the requirements of the model being considered. The scientific community is undertaking a large effort to develop more efficient mathematical approaches to solve these models, but this post is not focused in this branch of the field. On the contrary, this post aims at showing how a given logic can be included in a model, namely being rewritten in terms of the aforementioned variables, constraints, and the objective function.

An Illustrative Example: Shall I bake Cookies or Loafs of Bread?

In the following lines, you can find a very illustrative example presenting the role played by each element described above; the variables, constraints, and the objective function.

Let us assume that we have a bakery with an oven. In that oven we can cook both, bread or cookies. Unfortunately, we have a limited space, we cannot use more than 8 square feet. Each of the recipes requires a certain amount of resources. In particular:

Cooking a loaf of bread requires $1 of resources (energy and ingredients) as well as 0.6 square feet in the oven. We can sell each loaf of bread generating a profit of $0.4.
Cooking a cookie requires $2 of resources (energy and ingredients) as well as 0.8 square feet in the oven. We can sell each cookie generating a profit of $0.7.

Finally, we cannot excess a daily budget of $15 for all the expenses.

How many of which recipes should you cook, in order to maximize the profit?

Let’s get started here defining the variables. We will define only two variables in this illustrative example: b is the number of loafs of bread cooked whilst c is the number of cookies being cooked. We will impose that both variables can only take integer values.

Then, we will start posing all the constraints representing the logic posed in the wording.

We will start with the limited size of the oven. Note that the occupied area is precisely the number of loafs of bread times the size of each loaf of bread plus the number of cookies times the size of each cookie

Similarly, we also impose a limit in the daily budget:

We only have to define the objective function. In this case, we want to maximize the global profit. It is straightforward to deduce that this leads to the following function to be maximized:

We get that the optimal solution is:

Total number of loafs of bread, b = 9.
Total number of cookies, c = 3.

After presenting this toy example, we are at the disposal of addressing a more intricated model. Let me highlight here that the following model was genuinely developed for the sake of this post. Therefore, I obviously noticed that it can be substantially improved but I understand that it properly illustrates the “art” of formulating models in terms of variables, constraints and the objective function within the realm of integer programming, which is, at the end of the day, the aim of this post.

Solving the cryptographic twist proposed by the GCHQ

This year, the Government Communications Headquarters (GCHQ), one of the three intelligence agencies of the United Kingdom, sent a funny Christmas card with a cryptographic twist. The figure attached below was included:

From the official page of the GCHQ, it could be read:

In this type of grid-shading puzzle, each square is either black or white. Some of the black squares have already been filled in for you.
Each row or column is labeled with a string of numbers. The numbers indicate the length of all consecutive runs of black squares and are displayed in the order that the runs appear in that line. For example, a label “2 1 6” indicates sets of two, one and six black squares, each of which will have at least one white square separating them.

After understanding the logic behind the problem, the first step to be given is to define the variables. Be patient here, it may look like the definition of these variables is arbitrary but soon you will see how everything comes together after understanding the constraints.

The first variable is zij. This variable will represent the final color of a square. In particular, if zij=0 the square in the i-th row and j-th color will be white. Analogously, if zij=1 the square in the i-th row and j-th color will be black. Note how this variable is of utmost importance to properly understand the output of the model. Since it only can take two values, namely 0 or 1, we will consider zij as a binary variable.
The second variable will indicate the position in which a certain cluster of black colored squares starts. This variable will support the formulation rather than being a wanted output. In particular, we will define two set of variables wcjmi and wfimj for the clusters in columns and rows respectively. Given the similarity y of both variables, it will be only detailed for the first set of variables, namely wcjmi. wcjmi = 1 if the m-th cluster of squares at column j starts in the i-th row. For example, considering the eighth column (1–1–3) (j=8), the third cluster of black squares (m=3) will start in the i-th A priori we do not know this position but this is precisely what the model will return after being solved. As happened with the previous variables, both wcjmi and wfimj only take binary values.
The last variable is xcjm. This variable will be equal to the position in which the m-th group of black squares starts for column j. In this case, as we are expecting a position, the variable will not be binary but integer (the number of the row). Analogously, we define the same variable but for the groups in rows (xfjm).

Let me point that when we define a variable we are essentially defining a family of them. For example, we will have one zij for each square. This is extensible to all the variables.

Then, we will define all the constraints in the model. Each constraint will be tested to proof how they meet the criteria imposed in the problem to be addressed.

The first constraint will connect xcjm and wcjmi as well as xfjm and wfjmi. In particular, this first constraint is the combination of the following two equations:

We need another example here as soon as possible…

Let us consider that the m-th group of black squares at the j-th column starts in the k-row. M represents a large integer, for this particular case, we will consider M being equal to 50. In that case, wfimj = 1 and, consequently we want to impose that xcjm = k. Therefore, we would analyze the behavior of the system of equations above:

On the one hand, if wfimj = 1, the first equation is rewritten as follows:

Therefore, xcjm can take any value below or equal to k. In addition, the second equation is rewritten as follows:

Note how this equation is imposing that xcjm can take any value above or equal to k. So, putting everything together, we are imposing that xcjm = k if wfimj = 1.

On the other hand, if wfimj = 0, the first equation is rewritten as follows:

Whilst the second equation leads to:

In this case, if wfimj = 0, xcjm can take any value below or equal to M. Nevertheless, as will be discussed later on, our objective function will try to minimize the xcjm. Therefore, if wfimj = 0, then xcjm = 0.

In the same way, we also included a very similar couple of equations but accounting for the blocks in rows:

The third pair of equations is responsible for coupling the zij and the wcjmi and wfimj. Note that if wcjmi or wfimj are equal to 1, means that, starting from that row or column respectively, a given number of squares should be black colored. We will include this logic in the model starting with the couple wcjmi and zij.

Here, yci,m is nothing but the size of the m-th cluster of black squares in column i-th. As done with previous equations, we will analyze the behavior of the equations in different scenarios:

On the one hand, if wfimj = 1, the first equation is rewritten as follows:

Note that zij can only take binary values. Therefore, if we are imposing that the sum of yci,m-th contiguous elements has to be equal to yci,m means that all the accounted zij should be equal to 1. In other words, if wfimj = 1, the yci,m-th contiguous zij, starting from i,j, will be equal to 1, namely black colored.

The second equation leads to:

Which is a redundancy of the previous equation given the binary nature of zij.

On the other hand, if wfimj = 0, the first equation is rewritten as follows:

Given that M is a very big scalar this equation will be satisfied always. Note how the same conclusion is reached for the second equation:

Overall, if wfimj = 0, zij is not being affected at all.

In the same way, we also included a very similar couple of equations but accounting for the blocks in rows:

Here, yfi,m is the number of contiguous squares in the m-th cluster of row i.

Finally, we will include in the model the fact that each cluster of black squares will have, at least, one white square between them. For that, we will make use of xcjm and xfim.

These two equations are applied for each cluster length except the last one. This is because the last cluster of squares does not need to have any white square after it. In essence, this two equations impose that if a certain cluster starts at a given position, it is forbidden to the next cluster to start, at least, squares away. This logically means that, at least, it will be a white square between them.

Finally, we only have to define the objective function to be optimized. The reader should note here that, so as to properly relate wfimj and xcjm it was required to explicitly minimize the values of xcjm. Therefore, we will consider the minimization of the sum of these variables as the linear function to be minimized:

So, at this point, we have all the variables, constraints, and the objective function properly defined. As mentioned early on, we can now call the solver with our model and get the optimal solution. Remember here that zij will allow us to draw the final distribution of white-black squares. In particular, after only a few seconds of computation time, we obtained the following landscape:

And this is how the puzzle finally looks like…

The image hidden was a QR code that once scanned sent you to the second part of the Christmas Puzzle.

British sense of humor 😉

The goal of this post was to present you a very versatile mathematical technique that has been applied in different fields, ranging from operational research[4] to bio-mathematical applications[5], among others. Nevertheless, we know that the addressed problem entails a significant difficulty. Therefore, if there is any issue preventing you from following the reasoning posed above, I encourage you to pose any question, comment the post or generate a debate here. But if you were able to follow all the steps given, congratulations!, you can consider yourself fully prepared to take part in the art of formulating models into the integer programming language.

[1] Dantzig, G. B., Orden, A., & Wolfe, P. (1955). The generalized simplex method for minimizing a linear form under linear inequality restraints. Pacific Journal of Mathematics, 5(2), 183–195.

[2] Lawler, E. L., & Wood, D. E. (1966). Branch-and-bound methods: A survey.Operations research, 14(4), 699–719.

[3] Glover, F. (1977). Heuristics for integer programming using surrogate constraints. Decision Sciences, 8(1), 156–166.

[4] Winston, W. L., & Goldberg, J. B. (2004). Operations research: applications and algorithms (Vol. 3). Boston: Duxbury press.

[5] Pey, J., Prada, J., Beasley, J. E., & Planes, F. J. (2011). Path finding methods accounting for stoichiometry in metabolic networks. Genome Biol, 12(5), R49.

Originally published at blog.quarizmi.com on March 31, 2016.