Skip navigation

Category Archives: complexity

On  August 6, Vinay Deolalikar, a researcher at HP Labs in Palo Alto, released a paper with a proposed proof for the open problem P != NP. Somehow, news of this proof made it to the mainstream press with headlines proclaiming that the problem was solved. Problem was, the computational complexity community not only had not had to time to review the proof, but viewed it with great skepticism.

The significance of this is that proving P != NP is a grand challenge problem. You will hear that if it turned out the other way, that is, P = NP, then all known cryptographic systems would become easily breakable. On the other hand, it would mean that many verification and optimization problems would suddenly become tractable. More troubling, P=NP would also would mean that Skynet is just around the corner. Most researchers already believe that P != NP, so there is really no immediate effect from producing a valid proof of P!+NP.

The P!+NP problem has been deemed so important to solve that whoever solves it either way will receive a $1M prize from the Clay Mathematics Institute. Before this happens, though, the proof would need to be generally accepted by the research community as being correct, a process that will probably take months, if not years.

An extraordinary thing happened in response to the release of the proof. A number of experts in the field dropped what they were doing and started poring over the proof in earnest. This is no small feat given that the proof is over 100 pages long. Pretty quickly, a consensus was reached that there were a number of serious flaws, although, as far as I can tell, no smoking gun yet. For his part, Deolalikar has not conceded defeat yet and is working on addressing all issues that were found.

The proof and mathematics involved are  beyond my pay grade. But, the most interesting part of this episode for me is the sociological issues brought up. A group of the top experts in the world dropping everything to review a paper in a matter of days is highly unusual.  Unprecedented is that fact that the discussion happened entirely out in the open in online forums. This is in contrast to the normal peer review process, which, for journals, can stretch to years, and is often done by only a few anonymous reviewers who generally don’t share their discussion outside the editorial board (or conference program committee as the case may be).

The discussion could be followed by anyone interested. Now, it is not clear that this is a scalable model as is: it is not possible to drop everything at a moments notice for every paper that comes along. This could have the negative effect of, the next time this happens due to the crying wolf effect. On the positive side, the openness of the process allowed the computational complexity to come together and get in sync on this problem. Many of the objections raised were “standard” objections. For example, the proof may prove other things that are known to be false. This is something that researchers routinely check for when reviewing a paper.

Researchers have an, essentially, agreed upon set of “barriers” to a proof of P!=NP. They also point to a number of results (theorems) that any valid proof would need to address. Any proof in the future (assuming Deolalikar’s doesn’t work out) would need to address these barriers. Putting all this discussion in the open, hopefully, means that proposed proofs in the future would be much higher quality ( P!=NP (and some P+NP) “proofs” are produced  on a regular basis).

The other thing that is clear from this episode is that anyone who is going to receive the $1M prize needs the blessing of the computation complexity community. The most interesting barrier that I learned about is a theorem that essentially says “proofs of P!=NP are hard because P!+NP”. All of the other barriers also imply that any proof must be hard. No one is going to come up with a short, elegant, easy-to-understand proof for this. This means that a proof is highly unlikely to come from an outsider. Deolalikar is an outsider, but managed to meet enough of the community requirements to be taken seriously. While researchers acknowledge the contributions of outsiders, the nature of this problem makes it highly unlikely to come from an outsider.

In the (likely) event that Deolalikar’s proof ends up being incorrect, years from now, when a proof is produced, I think we will look back at this event and see that it was a step forward because of the openness of the discussion. Here’s hoping that this leads to positive improvements in how scientific review is done. I think we should be grateful to Deolalikar even if his proof doesn’t work out and even if it represents no new insight into the P!=NP problem.

For further reading:

bbc news (an easy to understand article blessed by an expert in the field) (where most of the discussion took place)

wiki (gory details. warning: heavy math)


I was listening to a talk by Daniel Kroening, a software verification researcher at Oxford University , who was explaining the certification process for safety-critical software. He mentioned that one of the requirements is that all test cases must be verified on the actual hardware. Now, in the case of avionics software, that means one test flight for every test case. Since test flights are expensive, optimizing the number of test cases required to cover everything  is extremely important.

This is one of these cases where you get a dot and you connect them with other seemingly random dots and see that you have a line. In this case, what I realized is that the difficulty in verification is not really finding bugs (bugs are easy, right?), but in how efficiently we find these bugs. Recently I posted on how constrained random testing is essentially a (hard) optimization problem. I also posted on the best verification methodology being to combine orthogonal methodologies in order to optimize bug finding productivity. The criticality of optimizing safety-critical test cases was another data point  that led me to this  realization.

This is reflected in the fact that many of the most successful verification tools introduced over the last twenty years have succeeded by optimizing verification productivity. As we all know, faster simulation really does very little to improve the quality of a design, but it helps enormously in improving verification productivity. Hardware verification languages are probably the second most important development in verification in the last twenty years. But again, they don’t improve quality, they simply improve productivity in developing verification environments.

This is not to say that there have not been tools that improve quality. Formal proof clearly improves quality when it can be applied, although semi-formal verification, which focuses on bug hunting is more of a productivity increase. In-circuit emulation allows you to find bugs that could not be found in simulation due to being able to run on the real hardware. However, emulation used simply as faster simulation is really just a productivity increase.

Is verification optimization related to the well discussed verification bottleneck (you know, the old saw about verification consuming 70% of the effort)?  Verification became a bottleneck when the  methodology changed from being done predominantly post-silicon to being done predominately pre-silicon. Many people saw the resulting dramatic increase in verification effort as being correlated to increased design size and complexity. If this were true, then verification would consume 98% of the effort today since this switch occurred twenty years ago and there have been many generations of products since then. Since relative verification effort has not changed significantly over the last twenty years, I think it is safe to say that verification effort is constant with increased design size and that differences in relative effort reflect differences in methodology more than anything.

The real question is: will verification optimization become more important in the future as designs become larger and will that result in relative verification effort rising? If there is no change to design methodology, we would expect verification optimization effort to remain constant.  If high-level synthesis allows us to move up the abstraction ladder, this should improve the ability to optimize verification. In short, there does not seem to be a looming crisis in overall verification optimization.

However, if we look at the software side we see that software content on hardware platforms is growing rapidly which is putting enormous pressure on the ability to verify this software. Effectively, we have managed to forestall the hardware verification optimization crisis by moving it to software

Constrained random testing is the predominant method used to verify chips today. It has two advantages over directed testing. First, randomness allows testing things not explicitly thought of by the verification engineer or designer. Second, automated random testing is more productive than manual directed testing in terms of number of test vectors produced as a function of human effort put in.

There are several variants on the basic concept of random testing. Unconstrained random testing generates random values for all input variables without any consideration of legality. Constrained random testing is used when the legal set of input values is a subset of the entire input space. In the last dozen years, hardware verification languages (HVLs) have enabled the use of constraint-based random testing. Constraint-based random testing uses static constraints and constraint solvers to do constrained random testing.

Constraint-based random testing improves productivity in writing constrained random testbenches because it allows writing tests at a higher level of abstraction. Static constraints specify the legal input space, but not how to randomize values; this is left up to the solver. As designs/protocols have become more complex, constraint-based random testing’s productivity gain has increased. As a result, there is a proliferation of large, complex constraint sets in today’s testbenches.

Increasingly, however, solving these large, complex constraint sets has become a bottleneck in simulation due to the large amount of time spent solving these constraint sets. Consequently, it is common that a lot of time is spent debugging and trying to workaround these solving bottlenecks. In the worst case, it is necessary to discard the static constraints and rewrite the randomizer using imperative code, negating the productivity gains of using static constraints.

The fundamental cause of this problem is the fact that uniformly randomizing across a constrained input space is a NP-hard problem. As we have seen before, NP-hard problems are optimization problems. Uniformity of randomization is the optimization goal. For example, suppose we have a simple constraint: 0 <= X <= 10. If we randomize X a number of times and each time it returns the value 0, this meets the constraint, but is not very uniform. Randomization that produces the value 0 to 9 with equal probability is optimal. Because NP-hard problems have the additional burden of requiring an optimal value, not just a satisfying value, they are generally harder to solve than NP-complete problems.

What are solutions to this problem?

The obvious solution is to build better solver/randomizers. Companies that produce HVLs already spend a large amount of effort trying to improve their solvers. If it were that easy, it would have been done already.

Barring better solvers, the only other choice is to change methodologies. One change is to ditch static constraints and go back to writing imperative code with the ensuing reduction in productivity. Not a pleasant thought to contemplate.

An alternative to this is keep the static constraints, but reduce the complexity from NP-hard to NP-complete. This means dumping uniform distribution for, potentially, very poor, but still legal, randomization. This is feasible considering that the real goal of random testing is to find bugs and improve coverage. We believe the uniform distribution is required to do this. However, if the randomizer knew precisely what values to generate in order to find a bug or hit a coverage point, it could just generate these values, which may or may not be uniform. If it were possible to do this, randomization would go from being an NP-hard problem to being a NP-complete problem. Technologies such as Nusym’s path tracing technology allow this.

One thing I think would help to implement this strategy would be to provide an open solver API in today’s HVLs. This would allow plugging in different solvers for different circumstances and developing custom solvers on a case-by-case basis rather than being forced to have a one-size-fits-all solver.

In an article in a recent issue of Computer entitled “Really Rethinking Formal Methods”, David Parnas questions the current direction of formal methods research. His basic claim is that (stop me if this sounds familiar) formal methods have too low ROI and researchers, rather than proclaiming the successes, need to recognize this and adjust their direction. As he so eloquently puts it:

if [formal methods] were ready, their use would be widespread

I haven’t spent a lot of time trying to figure out if his proscriptions make sense or not, but one thing stood out to me. He talks about a gap between software development and older engineering disciplines. This is not a new insight. As far back as the 60’s, the “software crisis” was a concern as the first large complex software systems being built started experiencing acute schedule and quality problems. This was attributed to the fact that programming was a new profession and did not have the rigor or level of professionalism of engineering disciplines that had been around for much longer. Some of the criticisms heard were:

  • programmers are not required to have any degree, far less an engineering degree.
  • programmers are not required to be certified.
  • traditional engineering emphasizes using tried and true techniques, while programmers often invent new solutions for every problem.
  • traditional engineering often follows a rigorous design process, programming allows hacking.

These explanations are often used as the excuse when software (usually Microsoft software) is found to have obvious and annoying bugs. But is this really the truth? Let’s look at an example of traditional engineering to see if if this holds up.

Bridge building is technology that is thousands of years old. There are still roman bridges built two thousand years ago that are in use today. Bridges are designed by civil engineers who are required to be degreed, certified engineers. Bridge design follows a very rigorous process and is done very conservatively using tried and true principles. Given that humanity has been designing bridges for thousands of years, you would think that we would have gotten it right by now.

You would be wrong.

Even today, bridges are built with design flaws that result in accidents and loss of life. One could argue that, even so, the incidence of design flaws is far less in bridges than in software. But this is not really an apples to apples comparison. The consequences of a bug in, say, a web browser are far less than a design flaw in a bridge. In non-safety critical software, economics is a more important factor in determining the level of quality of software. The fact is, most of the time, getting a product out before the competition does is economically more important than producing a quality product.

However, there are safety critical software systems, such as airplanes, medical therapy machines, spacecraft, etc. It is fair to compare these systems to bridges in terms of catastrophic defect rates. Let’s look at one area in particular, commercial aircraft. All commercial aircraft designed in the last 20 years rely heavily on software and, in fact, would be impossible to fly if massive software failures were to occur. Over the past 20 years, there have been roughly 50 incidents of computer-related malfunctions, but the number of fatal accidents directly attributed to software design faults is maybe two or three. This is about the same rate of fatal bridge accidents attributable to design faults. This seems to indicate that this gap between software design and traditional engineering is not so real.

The basic question seems to boil down to: are bridges complex systems?  I define a complex system as one that has bugs in it when shipped. It is clear that bridges still have that characteristic and, therefore, must be considered as complex systems from a design standpoint. The intriguing question is, given that they are complex systems, do they obey the laws of designing complex systems? I believe they do and will illustrate this by comparing two bugs, one a bridge design fault and another a well known software bug.

The London Millennium Footbridge was completed in 2000 as part of the millennium celebration. It was closed two days after it opened due to excessive sway when large numbers of people crossed the bridge. It took two year and millions of pounds to fix. The bridge design used the latest design techniques, including software simulation to verify the design. Sway is a normal characteristic of bridges. However, the designers failed to anticipate how people walking on the bridge would interact with the sway in a way to magnify it. The root cause of this problem is that, while the simulation model was probably sufficiently accurate, the environment, in this case, people walking on the bridge, was not accurate.

This is a very common syndrome in designing complex hardware systems. You simulate the chip thoroughly and then when you power it up in the lab, it doesn’t work in the real environment. I describe an example of this exact scenario in this post.

In conclusion, it does seem that bridges obey the laws of designing complex systems. The bad news is that the catastrophic failure rate of safety-critical software is of roughly the same magnitude as that of bridges. This means that we cannot expect significant improvements in the quality of software over the next thousand years or so. On the plus side, we no longer need to buy the excuse that software development is not as rigorous as “traditional” disciplines such as building bridges.

We have seen in previous posts that design and verification problems fall in the class of computationally complex hard problems. That is, in the worst case, it takes O(2^N) time to optimize/verify a design of size N, where N could represent the number of binary inputs. If we run into this limit, our ability to design large systems is severely limited.

Given these limits, then, how it is possible that we can routinely create designs consisting of millions of lines of code or gates? More importantly, how we can continue to design increasingly larger, more complex designs?

One answer is that we can’t. We can’t find all the bugs in complex designs, but we can hope that there are no critical bugs. We can’t perfectly optimize a design, but we can do a good enough job. Even then, exponential complexity should be a brick wall to increasing complexity, but is not. Somehow, we are able to avoid these limits.

One way to look at this is to consider the way human brains are built and work in creating complex designs. The jumping off point for understanding this is Shannon’s counting argument.

Claude Shannon was one of the most important computer scientists of the twentieth century. He is most famous for his work on information theory. Shannon’s counting argument is one of his less well known results. Shannon was trying to answer the question, what is the minimum number of gates (say two-input NAND gates) needed to be able to implement any function of N binary inputs? The basic answer is that there must exist functions that require an exponential number of gates to implement. The proof goes by counting the number of possible functions of N inputs and counting the number of functions implementable with T gates and showing that T is o(2^N). The argument is as follows:

  • the number of functions of N inputs is 2^(2^N). For example, a truth table for N=4 would have 16 entries. The number of functions, therefore is 2^16=16,384.
  • The number of functions of N inputs that can be implemented with T 2-input gates is (N+T)^(2T). Each gate input can connect to the output of any other gate or an input for a total of (N+T) possible connections and there are 2T gate inputs.
  • To implement any possible function, then, we need (N+T)^2T >= 2^(2^N).
  • Solving for T shows that T >= 2^(N-d) for some small delta d, which implies that T = o(2^N).

What is the significance of this? Today’s chips have as many as 1000 pins. What is the minimum number of gates required on the chip that would allow us to implement any function of a thousand inputs? Is, say, one million gates sufficient? Off the top of your head, you probably would say no. What about a billion? Our preconceived notion is that this should be sufficient. After all, that is right at the limit of today’s most advanced chips and it is inconceivable that we couldn’t implement what we want with that many gates.

Now lets look at what the counting argument has to say about this. The total number of possible functions is 2^(2^1000), which is a vast number, roughly 10^(10^300). A billion 2-input NAND gates would allow us to implement roughly 10^(10^10) different functions with a thousand inputs. This is also a vast number, but is infinitesimal compared to the number of possible functions. The fact is, we are limited to a very small number of functions that can be implemented!

And this picture doesn’t fundamentally change even if we assume we have a trillion gates at our disposal. So, how is it we believe that we can implement whatever functionality we need? To answer this, let’s work backward from the number of gates available and determine the maximum number of inputs for which we can implement any arbitrary function. Let’s choose the most complex system we have at our disposal, the human brain.

The human brain contains on the order of a trillion neurons, each with roughly a thousand connections. Assuming all neurons can be connected to any other neuron (which is not the case), we come up with an upper bound on the number of functions that can be implemented: (10^10)^(1000 * 10^10) = 10^(10^14) = 2^(2^40). In the best case, therefore, the human brain could implement any function of just 40 binary inputs.

Now we can do a thought experiment. What is the largest size truth table that you can mentally make sense of at one time? Two bits is easy. Three bits not so hard. Four bits is difficult to visualize as a truth table, but using visual aids such as Karnaugh maps makes it possible.

Another test would be to look at the placement of arbitrarily placed objects on a plane. Given a glance, how accurately could you replicate the placement of all objects. If there were only one object, I could envision determining placement with an accuracy of 5-6 bits in both X and Y dimensions. More objects with less accuracy. Beyond about seven objects , it is is difficult to even remember exactly how many objects there are. Maybe 10-12 bits accuracy overall is possible.

The bottom line is the human brain has a very low capacity for consciously grasping an arbitrary function. Let’s say its no more that 10 bits in the best case. From the counting argument, we can determine that it requires slightly more than 500 gates to implement any arbitrary function of 10 inputs. With a million gates on a chip, we could put 20000 such functions on the chip.

Now we treat each of these 500 gate blocks as a unit and then consider how to hook these 20000 blocks together to perform some function. This is certainly possible using abstraction. In this case it would probably require two or more levels of abstraction to be able to deal with 20000 blocks.

Abstraction is the essence by which we are able to design very large systems.  The need for abstraction is driven by the limitations on our brain in dealing with large functions imposed by Shannon’s counting argument. And it is the structure imposed by multiple layers of abstraction that makes intractable design and verification problems become tractable.

[Note: this is another in my series on computational complexity for dummies.]

The field of Artificial Intelligence (AI) got its start in 1950 with the famous Turing test. It was thought at that time that, by the year 2000, computers would possess  human-level intelligence. Well, this didn’t come to pass and human-level intelligence still seems as far off now as it did then.

One metric back then was the game of chess. It was thought that for a machine to beat the best human at chess would require human-type thinking. This metric was met in 1997 when Deep Blue beat Gary Kasparov. However, as the designers of Deep Blue freely admit, Deep Blue won mostly by brute force enumerating all the solutions. it was clear, and had been for a while, that the chess problem was going to be solved by exploiting computers vastly faster speed compared to humans rather than by any form of human-type thinking.

Chess is a particular type of game in which there are two players, with one player moving first, the second making a counter-move, and the players alternating until one player wins or a draw is achieved. The goal is to find a winning strategy. We can formulate this as follows (for player 2 in this case):

for all first moves player 1 makes,
there exists a first move that player 2 makes such that,
for all second moves player 1 makes,
there exists a second move that player 2 makes such that,
player 2 wins.

The computationally complexity of problems with this structure (alternating “for all” and “there exists” quantifiers) is in the class PSPACE-complete. Recall that the NP-hard class is the set of problems that tend to be hard. The NP-complete class is a subset of the NP-hard class and these tend to be the easiest problems to solve in the NP-hard class. PSPACE-complete problems are the hardest subset of the NP-hard class. In general, there is no easier way to solve them than by enumerating all the solutions until a satisfying one is found. This is why research into chess programs focusses on making search speed as fast as possible and why it takes specialized hardware to beat a human.

But, what has this got to do with verification? At the most abstract level, a verification system consists of two parts: the device-under-test (DUT) and the environment it sits in. The DUT doesn’t do anything unless the environment initiates something by injecting a request. The DUT responds to this in some way. This is analogous to a two player game in which the DUT is one player and the environment is the other. Requests and responses correspond to moves and counter-moves. The question then is: what is analogous to a game in verification?

There are two fundamental categories of properties that all verfication checks fall into: safety properties and liveness properties. A safety property is one that says “nothing bad will happen” and a liveness property says “something good will always happen.” An example of a safety property is that a traffic light should never be green in both directions simultaneously. An example of a liveness property is that, for a given direction, the light should always eventually turn green.

The vast majority of properties that we verify on a day-to-day basis are safety properties. For example, the property, 2+2=4, in an adder is a safety property. It is not possible to prove any type of property using simulation alone. It is possible to demonstrate the existence of safety property violations, but not liveness property violations using simulation alone. For example, suppose we simulated the traffic light controller for a million time steps and the light was always red in one direction. Does this constitute a violation of the liveness property? It is not conclusively a violation because it is possible the light could turn green if we simulated one more time step. At the same time, even if we saw the light turning green, this does not mean that it won’t get stuck red at some point.

It is possible to prove liveness properties using formal verification. How would we formulate the liveness property of the traffic light? One way is to consider it as a game between the cars arriving at the light (the environment) and the traffic light state. Then we could formulate the problem:

for all possible combinations of cars arriving at step 1,
there exists a state of the traffic light at step 1 such that
for all possible combinations of cars arriving at step 2,
there exists a state of the traffic light at step 2 such that
if the traffic light state is red, 
                     then in the next state it will be green.

This has the same structure as a game and so is PSPACE-complete. Since PSPACE-complete problems are solveable to a certain degree, we can prove liveness properties, but generally only on very small designs. The most effective use of this is to verify highly abstracted designs, typically when doing protocol verification.

This deep connection between the formal verification field and AI field goes further than having similar complexity. The core tool used in formal verification today, SAT solving, emerged from research into AI problems such as planning. The cross fertilization between these two fields has yielded dramatic improvements in efficiency in SAT solving over the last 15 years.

The computational complexity class, NP-hard, is at the core of a number of problems we encounter on a daily basis, from loading the dishwasher (how do I get all these pots to fit?) to packing a car for a vacation, to putting together a child’s train tracks.

If we look at these things, they have several things in common. First, they each involve a potentially large number of parts (pots, luggage, pieces of track) that need to be put together in some way. Second, we want to meet some objective, such as fitting all the dishes in the dishwasher. Third, there are a large number of constraints that must be met. In the case of loading the dishwasher, no two dishes can be put in the same place. There are N^2 constraints just to specify this, among many others. A fourth characteristic is that we may get close to an optimal solution, but find it difficult and not obvious how to get to a more optimal one (just how are we going to fit that last pot in the dishwasher). Furthermore, getting from a near optimal solution to an optimal one may involve a complete rearrangement of all the pieces.

One way to solve problems like packing a dishwasher is to view it as a truth table. Each dish can be put in one of, say, 100 slots, in, say, one of ten different orientations. This results in 1000 combinations, requiring 10 bits. If there are 40 dishes, 4000 bits are required to represent all possible configurations of dishes in the dishwasher. The resulting truth table is vast. Each entry in the table indicates how much space is left in the dishwasher if dishes are put in according to the configuration of that entry. A negative number indicates an infeasible solution. There will be many invalid configurations which have two or more dishes occupying the same location. We give all of these entries a large negative number.

The resulting table describes a landscape that is mostly flat with hills sparsely scattered throughout. We can also imagine that this landscape is an ocean in which negative values are under water and positive values represent islands in the ocean. The goal is to find the highest island in the ocean. We start in some random location in the ocean and start searching. We may find an island quickly, but it may not be the highest one. Given the vastness of the ocean, it is understandable why it can take a very long time to find a solution.

But, wait a minute. What about polynomial algorithms like sorting? A truth table can be constructed for these also. For example, to sort 256 elements, we can create 8 bit variables for each element to describe the position of that element in the sorted list. The value of each entry would indicate the number of sorted elements for that configuration. The complete table would again be around 4000 bits and have vast numbers of infeasible solutions in which two or more elements occupy the same slot in the list and only one satisfying solution. Yet, we know finding a solution is easy. Why is this?

The ocean corresponding to the sorting problem is highly regular. If we are put down in an arbitrary point in the ocean, we can immediately determine where to go just be examining the current truth table entry (point in the ocean). Knowing the structure, we may be able to determine from this that we need to go, say, northeast for 1000 miles. We may have to do this some number (but polynomial) times before getting to the solution, but is guaranteed to get to the solution. Structure in a problem allows us to eliminate large parts of the search space efficiently.

In contrast, for an NP-hard problem, there is no guarantee of structure. Furthermore, as we are sailing around this ocean, we are doing so in a thick fog such that we can only see what is immediately around us. We could sail right by an island and not even know it. Given this, it is easy to see that it could take an exponential amount of time to find a solution.

But then, how do we account for the fact that, often, NP-hard problems are tractable? The answer to this question is that there usually is some amount of structure in most problems. We can use heuristics to look for certain patterns. If we find these patterns, then this gives guidance similar to the sorting example above. The problem is that different designs have different patterns and there is no one heuristic that works in all cases. Tools that deal with NP-hard problems usually use many heuristics. The trouble is that, the more heuristics there are, the slower the search. At each step, each of the heuristics needs to be invoked until a pattern match is found. In the worst case, no pattern match will be found meaning it will take an exponential time to do the search, but the search will be much slower due to the overhead of invoking the heuristics at each step.

I hope this gives some intuition into NP-hard problems. In future posts I will talk about even harder classes of problem.

Problems that we use computers to solve can be divided into three basic classes:

  • easy to solve
  • hard to solve
  • impossible to solve

Pretty much all the algorithms we use day-to-day fall into the easy class. Text editing, web browsing, searching, even problems such as weather prediction, which run on supercomputers, are examples of easy to solve problems. These problems are solvable with algorithms that are called polynomial-time (P). This means that, given a problem size of N, an algorithm exists to solve the problem in time that is proportional to a fixed power of N, written as O(N^k). For example, sorting a list of N words is solvable in O(N^2) time.

The class of hard-to-solve problems pretty much includes all problems involved in the creation and verification of a design. Synthesis, place-and-route, formal verification, all fall in this class. You  probably have heard the term NP-complete. NP-complete is just one class of many in the hard-to-solve category. The class of NP-complete problems is the easiest of the hard classes, which means that, often, these types of problems can be solved reasonably well. The hardest of the hard-to-solve classes that you are likely to encounter  is the PSPACE-complete class. Problems in this class are, for all intents of purposes, intractable. We will look at these two classes, plus another one that occurs frequently in design problems, the NP-hard class.

Both the easy and hard-to-solve classes are at least theoretically possible to solve. The last group, the impossible-to-solve group, consist of the set of uncomputable problems. These problems mostly are of theoretical interest, which makes sense, since nobody is going to make any money trying to solve problems that cannot be solved.

One note about complexity classes: I refer to the complexity of problem classes, not algorithms. For example, sorting is a problem, shell sort is an algorithm that solves the sorting problem.  So, saying sorting is an easy problem means that there exists some algorithm that solves it easily. Saying a problem is hard means there is no algorithm that solves it easily.

Since all design and verification related problems fall into the hard-to-solve category, this is what I will talk about most. However, the boundary between easy-to-solve and hard-to-solve problems gets very blurry when we start to look at problems in the real world.

But, before talking about this, let’s first look at a basic property of these different classes – scaling. Suppose we had an algorithm that computed a result for a problem instance of size N (say, sorting a list of size N) in time T. Now suppose we had a processor that ran ten times faster. How much larger problem size could our algorithm handle in the same amount of time T? The following table shows how N grows for a CPU performance increase of 10X for different complexities:

  • linear O(N) N -> 10N
  • O(N lgN) N ->  5.7N
  • O(N^2) N -> 3.2N
  • O(2^N) N -> N+3.4

What this shows is that capacity gains from increased processor performance  diminish with increased problem complexity. For algorithms that have exponential complexity, there is basically no significant gain in capacity from increased processor speed. For example, suppose I had an exponential verification algorithm that could handle 100 variables. Having a 10X faster processor would allow me to process 103 variables in the same amount of time. Considering that Moore’s law implies that the number of variables doubles every year or so, this does not bode well for our ability to keep up.

But, the fact is that verification does  manage to keep up with increased design sizes, albeit with difficulty. So, verification is not always an exponential problem. We will explore the reasons for this in the next post.

I had big intentions to explain computational complexity in easy to understand terms to make dealing with this issue in design and verification easier. However, there is a lot of material and making it as intuitive as possible turned out to be a lot more work than I thought. Thus, no posts for a while.

In the meantime, it occurred to me that computational complexity affects our every day lives in ways that we are not aware of.  I think having some awareness these situations helps in understanding them when they are encountered it in designing complex systems.

When my son, Rory, was 1 1/2 years old, his Aunt Claire gave him a train set. If you have a boy, you probably have a train set also. Rory loved playing with this set. However, as a two-year old is apt to do, when he got frustrated, his way of taking it out was to tear up the train tracks, throwing them all over the place. Over time, we bought more and more track, which meant it took longer to do the rebuilding after each of Rory’s, um, deconstructions. After doing this numerous times, sometimes taking hours to get it just right, it dawned on me that I was solving an NP-hard problem.

NP-hard is one of those terms that you have probably heard, don’t completely understand, but believe means the problem is intractable. I will try to define it more precisely in a future post, but what I want to convey here is the intuition behind my belief that creating a track layout is NP-hard.

The goal of laying out the track is to make a loop (actually multiple interconnected loops) while using all the available track and, at the same time, minimizing the area (our house isn’t that big!). This is made more complex by the fact that there are a fixed number of different pieces, Y junctions, and bridges.

The first thing you notice when doing this is that it is easy to create a simple loop – four turns and a few straight pieces suffice. The problem, of course, is that this is not very interesting and you still have 90% of the pieces left over. It is also easy to use all the pieces if you don’t mind that the loop doesn’t close. You can just make one long chain of track. When Rory was old enough to know how to connect the track together, this is exactly what he did. It looked like we had a giant insect in the middle of our room. This illustrates one of the characteristics of NP-hard problems. Even though the problem (creating a track layout) is hard in general, there can be instances of the problem that are easy to solve.

The next you notice is that, you get an idea for a layout, start to build it and then, at the end, you find you are one piece short. You try to find  a piece by shortening the circuit somewhere else. This may entail making changes in other places also, which requires changing yet another place, and so on. More than a few levels of this and you quickly get lost and forget what the original problem was. At this point, you give up and start making wholesale changes, often running into the same problem.

This illustrates two issues that are at the crux of matter. First, you generally don’t figure out that a layout is going to work or not until it is completely built, or nearly so. Second, while it is sometimes possible to fix a layout by making incremental changes, it is often the case you cannot, which means starting from scratch again. The key here is that when we try a given configuration and it doesn’t work and we can’t fix it by making some incremental changes, we haven’t learned what a good configuration is, only that the current one is a bad one. In the worst case, then,  we would have to try every possible track configuration to find one that met our requirement of using all pieces, in a loop, and with minimum area. This is pretty much the definition of intractable.

Contrast this with a problem that is tractable, such as sorting. Suppose we have an almost sorted list. It’s never the case that we would have to start all over. The closer the list is to being sorted, the less time it takes to get it completely sorted.

The last observation that indicates that this problem is NP-hard is due to our desire to minimize the area of the resulting track layout. It is actually computationally tractable to meet the requirement of producing a loop using all available track. We can just make one giant loop. Even this is tedious, but the problem is  it would be much too large to fit in the room. Our goal is to minimize the area so that we can actually, you know, live in our living room. It turns out that the less area we want to use, the harder it is to figure out a layout. Also, it is all but impossible to determine what the minimum sized layout that meets our requires would be. These are both hallmarks of NP-hard problems. In practice, because it is impossible to determine what the minimum sized solution is, usually a limit is specified. As long as the solution produces a result less than (or greater, depending on the problem being solved) the limit, the solution is deemed a satisfying solution. When an NP-hard problem has a limit specified like this, it is called an NP-complete problem.

I hope this helps in understanding computational complexity. I will write more later. I doubt this helps in understanding how to raise kids.

It looks like this was a timely post. Just a few days after my last post, Harry  (theAsicGuy) Gries posted an entry speculating that having thousands of computers available would revolutionize EDA algorithms. I think with an understanding of how computational complexity affects EDA algorithms, he might not be making these claims.

Let’s start by looking at what he said:

One of the key benefits of the cloud is that anyone can have affordable access to 1000 CPUs if they want it. If that is the case, what sorts of new approaches could be implemented by the EDA tools in addressing design challenges? Could we implement place and route on 1000 CPUs and have it finish in an hour on a 100M gate design? Could we partition formal verification problems into smaller problems and solve what was formerly the unsolvable? Could we run lots more simulations to find the one key bug that will kill our chip? The cloud opens up a whole new set of possibilities.

Harry is wondering how parallelism can help EDA algorithms. The goal of parallelism is mainly to allow completing a given task faster. Problems can be broken into two classes: those that are embarrassingly parallel and all others.

Embarrassingly parallel problems are easy to parallelize because problem instances are completely independent. In EDA, there are a number of embarrassingly parallel problems:

  • regression testing – you have thousands of tests. They are all independent. You can run them on as many machines as are available to complete the regression as quickly as possible.
  • synthesis – different blocks can be synthesized in parallel.
  • spice – run different corners in parallel.
  • placement – place blocks in parallel.

This is done routinely today. Even small companies have dozens of servers, if not hundreds to do these, and other,  embarrassingly parallel tasks.

But the real question Harry is asking is: can having thousands of CPUs solve the other kind of parallelism. Can a parallel simulator run a test a thousand times faster? Can we increase the capacity of formal verification by a factor of a thousand? Can we reduce place-and-route time down to minutes?

Sorry, Harry, but the answer to these questions is: no. Suffice it to say that, if it could have been done, it would have been done already. It is not from lack of trying that EDA algorithms are not parallelized today. It is not from lack of CPUs either.There generally are no gains running parallelized algorithms on even 10 CPUS, far less 100 or 1000. If any EDA company could design a parallel algorithm that gave 10X speedup running on, say 20 CPUs, they would have a huge advantage over the competition.

In short, if it was easy to get large speedups just by throwing more CPUs at the problem, it would be embarrassing that nobody has done that yet.