22. Sequencing problems: the Traveling Salesperson Problem¶
A sequencing problem arises
when the decision maker needs to choose
the “best” ordering for a set of objects or in which sequence
to perform a set of operations. The problem consists in determining a
permutation among all feasible ones; to each permutation
a cost is associated, and the problem, in its basic formulation,
consists in the search for the least cost feasible permutation.
There exist various formulations for sequencing problems,
the most common of which corresponds to the case where the
cost of a permutations is given by the sum of
costs associated to contiguous pairs in the sequence.
To be more concrete, a typical sequencing problem is often encountered in
transport logistics from where the model got the name of traveling
salesperson, or TSP: a driver needs to visit a set of
destinations. This situation arises, e.g., in parcel delivery or
collection, or when some service tasks, like maintenance or gas meter
reading, are to be performed in various locations.
There are in general no
capacity constraints; the only requirement is that every destination is visited
once and, at the end of the tour, the driver returns to the starting
base station. Thus the problem can be seen as that of sequencing the
visits, in order to spend as little as possible in the overall travel.
The total cost in this case is the sum of the cost
incurred in moving from one destination node (or from the base
station) to the following one in the sequence, or back to the base station
when the tour is terminated.
This is perhaps one of the best known and most studied research
problem in Operations Research.
The interest in this problem arises both from the fact that it
is a problem which is very simple to describe, but
considerably complex to solve, and from the fact that the applications of the
traveling salesperson problem are many and not limited to delivery and / or
collection of goods.
A possible formalization of the problem can be obtained in
following way. Consider a directed graph whose nodes are
cities (or customers) to be visited. We assume that the graph is
complete, which means that there exists an arc connecting each pair
of nodes.
Sometimes this is not the case: in a road network, as an example, only
some pairs of cities are connected directly by a road. In graphs which
are not complete, a path which goes through each node once and only
once might not exist:
It is evident that, in this figure, no path exists which, starting
from node 0, goes through nodes 1-5 and back to 0 without passing more
than once over a node. A path which satisfies this
requirement would be called an Hamiltonian path.
In some applications the constraint of not
passing more than once through a node might be relaxed. In the
traveling example, it is important to visit each location to deliver
or collect parcels; however, it not forbidden to pass again through an
already visited node, on the way to a different destination. We may
thus create a complete graph in which the arc between two nodes
represents a “shortest path” between the nodes itself. In other
words, a minimum cost path problem (see
chapter Shortest (Minimum Cost) Path Problems) is repeatedly solved to find the
least cost route from each node to each different node in the
graph. Then, if the graph is connected, a complete graph can be
defined in which arcs, representing optimal paths, connect each
pair of nodes. Thus it is always possible to find hamiltonian paths.
From now on, let us assume that the graph is complete, and that costs
associated to each arc are non negative. In order to formulate the optimization problem
we associate an indicator variable to each arc:
\(\mathvar{\delta}_{ij}\). Constraints need to be defined in
such a way that this variable is equal to 1 if and
only if the route prescribes to visit node \(j\)immediately after
having visited node \(i\). Assuming the constraints have been
formulated, the objective function is simply written, exactly in the
same way as in the minimum cost path problem:
In the above objective, \(E\) is the set of arcs in the
graph. If nodes are denoted by \(V = \{0,1, \ldots, n\}\), then
\(E = \{(i,j) \in V \times V : i \ne j\}\).
For what concerns the
constraints, first observe that the tour must touch each node once;
thus, referring to a network flow problem, the total flow in and out
of each node should be equal to 1. That is, one and only one arc of
a feasible tour
will enter each node and one and only one arc will exit from it. In
formulae:
It is easily recognized that the structure of these two groups of
constraints is that of matching constraints (on a graph which,
however, is not
bipartite like in chapter Assignment or bi-partite matching).
Generic (non bipartite) weighted matching problems are easily solvable, and
efficient algorithms do exist (see, e.g., [17]).
Unfortunately these constraints, although necessary for
a feasible solution, are not sufficient
to force the indicator variables to identify a single
closed tour. The following example shows the situation:
foreach source / dest in {3/4,4/5,5/3}
path[->,ultra thick,draw,red] (source) – (dest);
foreach source / dest in {0/2,0/3,0/4,0/5,1/3,1/4,1/5,2/3,2/4,2/5,3/5}
draw[dotted,thin,draw] (source) – (dest);
end{tikzpicture}
In the figure, red arcs corresponds to arcs for which the indicator
variable is 1.
All matching constraints are
satisfied by this solution, but it does not represent a feasible
solution, as the proposed path is not a unique cycle. We thus need to
introduce special purpose constraints to
eliminate sub-tours from feasible solutions. A sub-tour is
a closed circuit which does not contain the
origin node.
The first technique consists of labeling the nodes, i.e.,
assigning a new variable to each node. The idea is to
assigns a variable to each node and, by means of suitable logical constraints,
force the label assigned to a node to be strictly greater than that
assigned to its direct predecessor along the path. Let us denote by \(\var{y}_i\)
the variable associated to node \(i\). Then we would like to
impose logical constraints:
which is impossible. Thus it has been proven that these logical
constraints make any tour infeasible. However it should be recalled
that a tour should be allowed, namely one which returns to the base
station. So it is necessary to impose the logical constraints for
each pair of nodes, provided that the second one is not the origin:
To correctly formulate this constraint, we can proceed in the
following way:
if \(\mathvar{\delta}_{ij} = 1\) then the difference
\(\var{y}_{j} - \var{y}_{i}\) must be greater than or equal to
one (or to any other positive quantity). Otherwise,
it is necessary to find a lower bound on the same difference in such a way that the
generated constraint is redundant. If we fix, arbitrarily,
\(\var{y}_0=0\), it can be observed that
The addition of these constraints excludes the possibility of
cycles not including node 0. We may also notice that it is not
required to impose integrality constraints on the variables
\(\var{y}\); in fact they might also be non integer, if
necessary. In the above model, as the source is assigned label 0,
the highest possible label is \(n\) and the increment is one,
it turns out that in any feasible solution the values of variables
\(\var{y}\) will be consecutive integers, i.e., a sequential
numbering of the visited nodes.
In the following we give a formulation of the model, using the
formulation just introduced:
The following figure reports the optimal tour (based on estimated
travel times):
Some comments are in order. First we can notice from the graph
above that the path “crosses itself”, which is something which
might be considered as negative at first sight. But it should be
reminded that the arrows represented in the figure are not
representative of the actual path followed by the driver, but just
are meant to represent the correct visit sequence. As an example,
the apparent crossing close to Cutigliano is not real, as the best
path San Marcello Pistoiese to Abetone and then Piteglio, goes
through Cutigliano, and similarly for other apparent crossings.
Another observation is the speed by which this problem is
solved. Indeed this is a small instance. However, the fact that,
with this formulation, it was solved in a very short CPU time, is
not due to the formulation, but to the advanced optimization
software employed. The solver used in the example automatically
adds a large number of cuts (valid inequalities) to improve the
formulation, which, in itself, is very weak. If we used, e.g., the
CBC solver, which is already a very advanced implementation, the
same instance would be solved in 17 seconds, with over 150000
simplex iterations. In fact, the formulation just presented does
not allow to solve any problem of size, say, larger than 50.
Different formulations will be introduced later in this chapter.
Although “weak” from the polyhedral point of view, the constraints just presented are
important as they allow for the formulation of additional
constraints, useful in many applications. In fact, in place of the unit
increase of variable \(\var{y}_j\), an increase equal to, say,
the travel time along arc \((i, j)\), or the sum of the time
spent in node \(i\) plus the travel time on the arc. In this
case, variable \(\var{y}_j\) would have the meaning of
the arrival time at node \(j\). Having this variable in the model
easily allows to insert temporal constraints like, e.g. requiring
that a node must be visited in a specific time window, or requiring
that a node is not visited before another one. An application of
models of this kind can be considered, as an example, for the
scheduling of home care visits to patients by a
medical or nursing staff. In order to implement a model of
this kind, the logical constraints need to be slightly modified.
The lower bound on the difference of
visit times of two consecutive nodes must be computed as a function
of the travel and visit times. For example, we might
assume that there are an earliest start and return times, denoted
by \(\param{T_{\min}}, \param{T_{\max}}\).
Assume furthermore that visiting a node requires
\(\param{T}_i\) time units
and that the travel time between two nodes is
\(\param{T}_{ij}\). Then
the logical constraint becomes
The TSP problem has been the subject of very intense research in
all recent years; there exist many different formulations of the
subtour elimination constraints (see, e.g., [26] for a
list of 24 different models). One of the best known formulations
directly proceeds
without the need for any additional labeling.
Let \(\set{S}\) be any not empty subset of the set of nodes
which does not coincide with the set of nodes: \(\set{S} \subset V,
S \ne \emptyset\). In order to prevent any tour within the set
\(\set{S}\) it is sufficient to require that at least one arc in
the solution exits (or enters) this set. This can be done through
the constraint:
Such a constraint should theoretically be imposed for
each set of non empty subset of the set of nodes,
with the exception of the whole set. However, it is easily seen
that, for symmetry, imposing the constraint for a set is
equivalent to imposing the same constraint to the complementary
set, so that we can limit the set \(\set{S}\) to have a
cardinality which is no more than one half the number of
nodes. Moreover, when the set is a singleton, the constraint is
already included in the model as an assignment constraint. Thus,
the required sub-tour elimination constraints are:
Despite this reduction, the number of constraints to be added
to the formulation is astronomical and grows exponentially fast as
the size of the problem grows. The total number of constraints
remains \(O(2^{|V|})\). It is therefore impossible
to add all of these constraints in a formulation of the
problem. On the other hand it can be proved that these constraints are
rather strong and their addition to the original formulation
would lead to a model whose linear relaxation is quite good.
A possibility, exploited in most recent implementations,
is to dynamically insert only those inequalities which are violated
in the solution of current relaxation.
This is done in an automatic way within so-called
Branch and Cut implementations. The general idea is to
solve an initial relaxation
of the problem, and then calling a procedure which determines if there exists
any of the sub-tour elimination constraints which is violated.
If the current solution is binary, this task is quite easy, as it
just requires to identify cycles in the current solutions: the
nodes of any cycle which does not include the source node form a
set \(\set{S}\) associated to a violated inequality which can
be added to the formulation. More complex is the case in which the
current solution is not binary: a separation procedure needs to be
implemented which is capable of finding a violated inequality which
cuts off the current solution. This topic is out of the scope of
this volume. Just as an illustrative example, if in the example
used before we drop the labeling constraints and just solve the
relaxed problem with only the constraints requiring one unit flow
in and out of each node, the following solution is obtained:
This solution, as it can be immediately seen, contains various
sub-tours. Its total cost is 192.23, which is thus a lower bound on
the optimal solution value. Adding to the formulation
only the sub-tour elimination constraints
associated to the 10 sub-tours in this solution, and solving the
problem again, the following solution is found, which is again
infeasible, but starts to get close to a single tour:
This process can be continued and eventually the optimal solution
is found; this happens well before having added all of the possible
sub-tour elimination constraints which, in this example, would be
slightly more than \(2\,000\).
The applications of the traveling salesman model are
many. Obviously, the main application field is in transportation
logistics, where the model can be applied to sequence deliveries,
garbage collection, home care visits, …
Problems connected with street cleaning or snow removal are only apparently similar:
in this case it is required that each arc is visited at least once,
while in the traveling salesperson each node is visited at least
once. This problem will be shortly dealt with later on.
Among the many applications, it is important to recall in particular those related to
industrial production.
It is quite common, in production, that some jobs can be worked on a
machine, or a plant, in any arbitrary ordering, within, say, a single
production period (a day, as an example). However, it is also
frequently the case that after a job has been completed, before
starting the next one some setup is needed, and production needs to be
stopped. Frequently, this setup time depends both on the job just
concluded and on the next one. Consider, as an example, a drilling
machine: after drilling a hole in a plate, it is necessary to change
the tool in order to drill holes of different sizes: this retooling
requires a time which depends on the relative sizes of the two
different tools to be mounted and dismounted from the machine.
In a different environment, consider the
problem of establishing the optimal sequence of surgical operations
in an hospital: after an operation ends, and before the following one
can start, some time is needed for sterilization and for changing the
equipment in the operating room. These setup actions have a duration
which might depend on the two surgeries involved. In these examples,
the nodes to be “visited” are the different industrial or surgical
operations; the arcs connect different operations which might be
scheduled in sequence. The cost of each arc is the amount of time
wasted in setup. The traveling salesperson solution allows us to find
how to sequence all of the operations so that the total time spent in
setup is minimized. This has great benefits in terms of productivity,
in the first case, and, in the hospital case, allows the planner to
save time on setup and have more time allowed for surgeries.
application:
Sequencing in industrial production
Consider the problem of sequencing operations in a
machine used to color a material or in a textile loom where
different colors are needed for each product. Assume a list
of jobs ready to be processed is available, each characterized by a
set of colors. The plant is organized in such a way that, when a
job is finished and the next one has to be prepared to start, some
colors need possibly to be changed. This change requires a setup
time. If colors are liquid, it might be necessary to clean the
containers and to fill them with a new color, or to extract the
color container and insert a new one. In textile industries, color
changes require dismounting a set of threads and substituting them
with another set. In general, the time wasted in this operation
depends on the number of colors changed. In painting, it might also
depend on the colors themselves, as cleaning some colors from a
container might require different times depending on the color
type. In this latter case, the graph of the TSP is asymmetric.
Consider as an example the case of coloring tissues in 4 colors,
and assume that the jobs to be worked on a day are represented by
the following figure:
In this figure the standard is to process jobs in the order given
by reading the table from left to right, top down. At the left of
each job a number denotes the number of colors to be changed before
that job begins. For the first job this number is 4, as,
considering the last job, all of the colors need to be changed. The
second job in this sequence has a setup cost 2, as it can be seen
that from job 1 to job 2 only two colors have to be changed, while
blue and brown can be left unchanged. It should be clear how to
compute setup times for all pairs of jobs. Using this sequence,
which was built using a lexicographic ordering of job colors, a
total of 35 color changes is needed.
If we run the TSP model with the data corresponding to this job
set, the following sequence of jobs is obtained:
This sequence requires only 24 color changes, with a saving of more
than 30% of the setups.
In the examples we always assumed an initial state is known and that
at the end of the tour, the same state needs to be restored. In the
geographical TSP, the start city, or base station, is known and the
tour should terminate in that node; in production or in surgery, we
assume the initial setup is given and that at the end of the day the
same setup needs to be restored. If this assumption were false,
we could easily extend the model to include any
start and end node, possibly different one from the other.
If the initial node coincides with the final one, then clearly
choosing any different starting node, the same tour will remain
optimal.
If instead we do not know the initial and final nodes, and they are
different, we might
insert a dummy node (a repository) connected to every other node in
the graph (in both directions) with zero cost arcs.
This part on the TSP problem is necessarily too short, as this is one
of the most deeply explored problems in Optimization and Operations
Research. We hope we have given at least the feeling on how relevant
this problem is and of how many practical problems can be profitably
solved through a model like this one. We did not mention many other
possibilities, like, e.g., organizing the tour of an automatic
machine for the production of VLSI (Very Large Scale Integrated)
circuits: here the machine needs to place thousands of components on a
chip, and the TSP tour could save a significant amount of time in
moving the tool from one location to another one. A very large set of
examples, software, historical notes on the TSP problem can be found
in the Concorde web site [12] which is devoted to
the TSP problem and offers among the best software tools for the
solution of large instances.