The computational complexity class NP-hard is at the core of a number of problems we encounter on a daily basis, from loading the dishwasher (how do I get all these pots to fit?), to packing a car for a vacation, to putting together a child’s train tracks.

These problems have several things in common. First, they each involve a potentially large number of parts (pots, luggage, pieces of track) that need to be put together in some way. Second, we want to meet some objective, such as fitting all the dishes in the dishwasher. Third, there are a large number of constraints that must be met. In the case of loading the dishwasher, no two dishes can be put in the same place. For N dishes, it takes on the order of N^2 pairwise constraints just to specify this, and there are many other constraints besides. A fourth characteristic is that we may get close to an optimal solution, but find it difficult and not obvious how to get to a better one (just how are we going to fit that last pot in the dishwasher?). Furthermore, getting from a near-optimal solution to an optimal one may involve a complete rearrangement of all the pieces.

One way to solve problems like packing a dishwasher is to view them as a truth table. Each dish can be put in one of, say, 100 slots, in one of, say, ten different orientations. This results in 1000 combinations per dish, which takes 10 bits to encode. If there are 40 dishes, 400 bits are required to describe one complete configuration of dishes in the dishwasher, so the truth table has 2^400 entries. The resulting truth table is vast. Each entry in the table indicates how much space is left in the dishwasher if dishes are put in according to the configuration of that entry. A negative number indicates an infeasible solution. There will be many invalid configurations in which two or more dishes occupy the same location. We give all of these entries a large negative number.
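
To make the sizing concrete, here is a minimal sketch in Python. The first part reproduces the arithmetic for the 40-dish example. The second part builds the full truth table for a toy dishwasher that is small enough to enumerate. The toy instance (`DISH_SIZES`, `SLOT_SIZES`), the `INFEASIBLE` value, and the choice to measure "space left" as the capacity of the unused slots are all assumptions made for illustration.

```python
from itertools import product

# Sizing from the text: 100 slots x 10 orientations per dish, 40 dishes.
choices_per_dish = 100 * 10
bits_per_dish = (choices_per_dish - 1).bit_length()  # bits needed to encode 0..999
bits_per_config = 40 * bits_per_dish                 # bits to describe one configuration

# Toy instance (hypothetical): 2 dishes, 3 slots, no orientations.
DISH_SIZES = [3, 2]      # size of each dish
SLOT_SIZES = [5, 3, 2]   # capacity of each slot
INFEASIBLE = -1000       # large negative value for invalid entries


def space_left(config):
    """Score one table entry; config[i] is the slot chosen for dish i."""
    if len(set(config)) < len(config):
        return INFEASIBLE  # two dishes share a slot
    for dish, slot in enumerate(config):
        if DISH_SIZES[dish] > SLOT_SIZES[slot]:
            return INFEASIBLE  # dish does not fit in its slot
    # Assumed measure of remaining space: capacity of the slots left empty.
    return sum(size for slot, size in enumerate(SLOT_SIZES) if slot not in config)


# One entry per possible configuration: 3 slots ** 2 dishes = 9 entries.
table = {
    config: space_left(config)
    for config in product(range(len(SLOT_SIZES)), repeat=len(DISH_SIZES))
}
best = max(table, key=table.get)
```

Even in this tiny table, more than half of the nine entries are infeasible. The full-size table cannot be enumerated this way, which is the point of the discussion that follows.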

The resulting table describes a landscape that is mostly flat with hills sparsely scattered throughout. We can also imagine that this landscape is an ocean in which negative values are under water and positive values represent islands in the ocean. The goal is to find the highest island in the ocean. We start in some random location in the ocean and start searching. We may find an island quickly, but it may not be the highest one. Given the vastness of the ocean, it is understandable why it can take a very long time to find a solution.
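
The search described above can be sketched as a simple hill climb. This is only an illustration under assumptions: the one-dimensional `LANDSCAPE` list and the `hill_climb` function are invented here, and a real truth table has far more dimensions. The searcher looks only at its immediate neighbors and moves while things improve.

```python
# Hypothetical 1-D ocean: negative values are water, positive values are islands.
LANDSCAPE = [-5, -5, 1, 3, 2, -5, -5, -5, 4, 9, 6, -5]


def hill_climb(landscape, start):
    """Move to the better adjacent position until no neighbor is higher."""
    pos = start
    while True:
        neighbors = [p for p in (pos - 1, pos + 1) if 0 <= p < len(landscape)]
        best = max(neighbors, key=lambda p: landscape[p])
        if landscape[best] <= landscape[pos]:
            return pos  # no neighbor improves on the current position
        pos = best
```

Starting at index 2, the search climbs the small island and stops at index 3 (height 3), never learning that a height of 9 exists at index 9. Starting at index 0, in flat open water, it has no direction to move in at all.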

But wait a minute. What about polynomial-time algorithms like sorting? A truth table can be constructed for these also. For example, to sort 256 elements, we can create an 8-bit variable for each element to describe the position of that element in the sorted list. The value of each entry would indicate the number of sorted elements for that configuration. Each configuration takes 256 × 8 = 2048 bits to describe, so the table is again vast, with huge numbers of infeasible entries in which two or more elements occupy the same slot in the list, and only one satisfying solution. Yet we know finding a solution is easy. Why is this?

The ocean corresponding to the sorting problem is highly regular. If we are put down at an arbitrary point in the ocean, we can immediately determine where to go just by examining the current truth table entry (point in the ocean). Knowing the structure, we may be able to determine from this that we need to go, say, northeast for 1000 miles. We may have to do this a polynomial number of times before getting to the solution, but we are guaranteed to get there. Structure in a problem allows us to eliminate large parts of the search space efficiently.
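
One way to see this kind of structure is the sketch below. It is an illustration, not the only way to exploit the regularity of sorting: the function names are invented here, and swapping adjacent elements is an assumed, deliberately simple move. The current arrangement always tells us a move that makes progress: swapping any adjacent out-of-order pair removes exactly one inversion, so the number of moves is bounded by n(n-1)/2.

```python
def count_inversions(items):
    """Number of pairs (i, j) with i < j and items[i] > items[j]."""
    return sum(
        1
        for i in range(len(items))
        for j in range(i + 1, len(items))
        if items[i] > items[j]
    )


def guided_sort(items):
    """Repeatedly swap the first adjacent out-of-order pair; count the swaps."""
    a = list(items)
    steps = 0
    while True:
        for i in range(len(a) - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]  # removes exactly one inversion
                steps += 1
                break
        else:
            return a, steps  # no out-of-order pair left: the list is sorted


result, steps = guided_sort([4, 1, 3, 2])
```

For `[4, 1, 3, 2]` there are 4 inversions, so exactly 4 swaps are needed, out of 24 possible orderings. Every step is guaranteed to move closer to the solution, which is what the fog-bound search for an NP-hard problem lacks.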

In contrast, for an NP-hard problem, there is no guarantee of structure. Furthermore, as we are sailing around this ocean, we are doing so in a thick fog such that we can only see what is immediately around us. We could sail right by an island and not even know it. Given this, it is easy to see that it could take an exponential amount of time to find a solution.

But then, how do we account for the fact that NP-hard problems are often tractable in practice? The answer is that there usually is some amount of structure in most problems. We can use heuristics to look for certain patterns. If we find these patterns, they give guidance similar to that in the sorting example above. The problem is that different designs have different patterns and there is no one heuristic that works in all cases. Tools that deal with NP-hard problems usually use many heuristics. The trouble is that the more heuristics there are, the slower the search. At each step, each of the heuristics needs to be invoked until a pattern match is found. In the worst case, no pattern match will be found, meaning the search will still take exponential time, and it will be even slower due to the overhead of invoking the heuristics at each step.
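
The trade-off can be sketched with a small example. Everything here is an assumption chosen for illustration: the problem (subset sum over positive integers), the two heuristics, and the function names. It is also simplified in that the heuristics are tried once per problem rather than at every step of the search. Each heuristic either recognizes a pattern and returns an answer, or returns `None` and the next one is tried. If none matches, we fall back to exhaustive search, having paid for the heuristics anyway.

```python
from itertools import combinations


def single_element(numbers, target):
    """Pattern: the target itself appears in the list."""
    return [target] if target in numbers else None


def greedy_largest_first(numbers, target):
    """Pattern: taking the largest numbers that still fit happens to hit the target."""
    chosen, total = [], 0
    for n in sorted(numbers, reverse=True):
        if total + n <= target:
            chosen.append(n)
            total += n
    return chosen if total == target else None


def exhaustive(numbers, target):
    """Fallback: try every subset, which is exponential in len(numbers)."""
    for size in range(len(numbers) + 1):
        for combo in combinations(numbers, size):
            if sum(combo) == target:
                return list(combo)
    return None


def solve(numbers, target, heuristics=(single_element, greedy_largest_first)):
    """Try each heuristic in turn; report which method produced the answer."""
    for heuristic in heuristics:
        result = heuristic(numbers, target)
        if result is not None:
            return result, heuristic.__name__
    return exhaustive(numbers, target), "exhaustive"
```

With `solve([8, 6, 5, 3], 11)` the greedy heuristic finds `[8, 3]` immediately. With `solve([8, 6, 5], 11)` both heuristics fail, and only the exhaustive search finds `[6, 5]`.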

I hope this gives some intuition into NP-hard problems. In future posts I will talk about even harder classes of problem.

## One Comment

Thanks for making it absolutely easy to understand. The ocean analogy is a very good analogy. I liked it.

Keep Posting 😉
