Problems that we use computers to solve can be divided into three basic classes:

  • easy to solve
  • hard to solve
  • impossible to solve

Pretty much all the problems we solve with computers day-to-day fall into the easy class. Text editing, web browsing, searching, even problems such as weather prediction, which run on supercomputers, are examples of easy-to-solve problems. These problems are solvable with algorithms that run in polynomial time (P). This means that, given a problem size of N, an algorithm exists to solve the problem in time proportional to a fixed power of N, written as O(N^k). For example, sorting a list of N words is solvable in O(N^2) time.

The class of hard-to-solve problems includes pretty much all the problems involved in the creation and verification of a design. Synthesis, place-and-route, and formal verification all fall into this class. You have probably heard the term NP-complete. NP-complete is just one of many classes in the hard-to-solve category. The class of NP-complete problems is the easiest of the hard classes, which means that these types of problems can often be solved reasonably well. The hardest of the hard-to-solve classes that you are likely to encounter is the PSPACE-complete class. Problems in this class are, for all intents and purposes, intractable. We will look at these two classes, plus another one that occurs frequently in design problems, the NP-hard class.

Both the easy and the hard-to-solve classes are at least theoretically possible to solve. The last group, the impossible-to-solve group, consists of the set of uncomputable problems. These problems are mostly of theoretical interest, which makes sense, since nobody is going to make any money trying to solve problems that cannot be solved.

One note about complexity classes: I refer to the complexity of problems, not algorithms. For example, sorting is a problem; shell sort is an algorithm that solves the sorting problem. So, saying sorting is an easy problem means that there exists some algorithm that solves it easily. Saying a problem is hard means there is no algorithm that solves it easily.

Since all design and verification related problems fall into the hard-to-solve category, this is what I will talk about most. However, the boundary between easy-to-solve and hard-to-solve problems gets very blurry when we start to look at problems in the real world.

But, before talking about this, let’s first look at a basic property of these different classes – scaling. Suppose we had an algorithm that computed a result for a problem instance of size N (say, sorting a list of size N) in time T. Now suppose we had a processor that ran ten times faster. How much larger a problem could our algorithm handle in the same amount of time T? The following table shows how N grows for a CPU performance increase of 10X at different complexities (a short sketch of the arithmetic follows the table):

  • linear O(N): N -> 10N
  • O(N lg N): N -> ~5.7N
  • O(N^2): N -> ~3.2N
  • O(2^N): N -> N + 3.3
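
Here is a rough sketch, in Python, of where these numbers come from. Note that the N lg N ratio depends on the baseline problem size; I assume N = 10 here, which reproduces the 5.7N figure.

```python
import math

speedup = 10.0
N = 10  # baseline problem size; the N log N ratio depends on this choice

# O(N): work is proportional to N, so 10x the speed handles 10x the size.
linear = speedup * N

# O(N log N): solve N' * log2(N') = speedup * N * log2(N) numerically.
target = speedup * N * math.log2(N)
n_log_n = float(N)
while n_log_n * math.log2(n_log_n) < target:
    n_log_n += 0.01

# O(N^2): (N')^2 = speedup * N^2, so N' = sqrt(speedup) * N.
quadratic = math.sqrt(speedup) * N

# O(2^N): 2^N' = speedup * 2^N, so N' = N + log2(speedup).
exponential = N + math.log2(speedup)

print(linear)            # 100.0        -> 10N
print(n_log_n / N)       # about 5.7    -> ~5.7N
print(quadratic / N)     # about 3.16   -> ~3.2N
print(exponential - N)   # about 3.32   -> N + 3.3
```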

What this shows is that capacity gains from increased processor performance diminish as problem complexity increases. For algorithms that have exponential complexity, there is essentially no gain in capacity from increased processor speed. For example, suppose I had an exponential verification algorithm that could handle 100 variables. Having a 10X faster processor would allow me to process 103 variables in the same amount of time. Considering that Moore’s law implies that the number of variables doubles every year or so, this does not bode well for our ability to keep up.

But, the fact is that verification does manage to keep up with increased design sizes, albeit with difficulty. So, verification is not always an exponential problem. We will explore the reasons for this in the next post.

I had big intentions of explaining computational complexity in easy-to-understand terms to make dealing with this issue in design and verification easier. However, there is a lot of material, and making it as intuitive as possible turned out to be a lot more work than I thought. Thus, no posts for a while.

In the meantime, it occurred to me that computational complexity affects our everyday lives in ways that we are not aware of. I think having some awareness of these situations helps in understanding them when they are encountered in designing complex systems.

When my son, Rory, was 1 1/2 years old, his Aunt Claire gave him a train set. If you have a boy, you probably have a train set also. Rory loved playing with this set. However, as a two-year-old is apt to do, when he got frustrated, his way of taking it out was to tear up the train tracks, throwing them all over the place. Over time, we bought more and more track, which meant it took longer to do the rebuilding after each of Rory’s, um, deconstructions. After doing this numerous times, sometimes taking hours to get it just right, it dawned on me that I was solving an NP-hard problem.

NP-hard is one of those terms that you have probably heard, don’t completely understand, but believe means the problem is intractable. I will try to define it more precisely in a future post, but what I want to convey here is the intuition behind my belief that creating a track layout is NP-hard.

The goal of laying out the track is to make a loop (actually multiple interconnected loops) while using all the available track and, at the same time, minimizing the area (our house isn’t that big!). This is made more complex by the fact that there is a fixed inventory of the different pieces, including Y junctions and bridges.

The first thing you notice when doing this is that it is easy to create a simple loop – four turns and a few straight pieces suffice. The problem, of course, is that this is not very interesting and you still have 90% of the pieces left over. It is also easy to use all the pieces if you don’t mind that the loop doesn’t close. You can just make one long chain of track. When Rory was old enough to know how to connect the track together, this is exactly what he did. It looked like we had a giant insect in the middle of our room. This illustrates one of the characteristics of NP-hard problems. Even though the problem (creating a track layout) is hard in general, there can be instances of the problem that are easy to solve.

The next thing you notice is that you get an idea for a layout, start to build it, and then, at the end, find you are one piece short. You try to free up a piece by shortening the circuit somewhere else. This may entail making changes in other places as well, which requires changing yet another place, and so on. More than a few levels of this and you quickly get lost and forget what the original problem was. At this point, you give up and start making wholesale changes, often running into the same problem.

This illustrates two issues that are at the crux of the matter. First, you generally don’t figure out whether a layout is going to work until it is completely built, or nearly so. Second, while it is sometimes possible to fix a layout by making incremental changes, it is often the case that you cannot, which means starting from scratch again. The key here is that when we try a given configuration and it doesn’t work and we can’t fix it by making some incremental changes, we haven’t learned what a good configuration is, only that the current one is a bad one. In the worst case, then, we would have to try every possible track configuration to find one that met our requirement of using all the pieces, in a loop, and with minimum area. This is pretty much the definition of intractable.

Contrast this with a problem that is tractable, such as sorting. Suppose we have an almost sorted list. It’s never the case that we would have to start all over. The closer the list is to being sorted, the less time it takes to get it completely sorted.
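
To make this concrete, here is a small sketch (my own illustration, not from the post) that counts how much work a simple insertion sort does on a nearly sorted list versus a badly scrambled one. The nearly sorted case needs almost no work; nothing forces us to start over.

```python
def insertion_sort_shifts(items):
    """Insertion sort; returns the number of element shifts (a proxy for work)."""
    items = list(items)
    shifts = 0
    for i in range(1, len(items)):
        value = items[i]
        j = i - 1
        while j >= 0 and items[j] > value:
            items[j + 1] = items[j]  # shift the larger element right
            shifts += 1
            j -= 1
        items[j + 1] = value
    return shifts

nearly_sorted = list(range(1000))
nearly_sorted[10], nearly_sorted[11] = nearly_sorted[11], nearly_sorted[10]
reversed_list = list(range(1000, 0, -1))

print(insertion_sort_shifts(nearly_sorted))  # 1 shift: almost nothing to do
print(insertion_sort_shifts(reversed_list))  # 499500 shifts: the full O(N^2) cost
```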

The last observation that indicates that this problem is NP-hard is due to our desire to minimize the area of the resulting track layout. It is actually computationally tractable to meet the requirement of producing a loop using all available track. We can just make one giant loop. Even this is tedious, but the problem is it would be much too large to fit in the room. Our goal is to minimize the area so that we can actually, you know, live in our living room. It turns out that the less area we want to use, the harder it is to figure out a layout. Also, it is all but impossible to determine what the minimum-sized layout that meets our requirements would be. These are both hallmarks of NP-hard problems. In practice, because it is impossible to determine what the minimum-sized solution is, usually a limit is specified. As long as the solution produces a result less than (or greater, depending on the problem being solved) the limit, it is deemed a satisfying solution. When an NP-hard problem has a limit specified like this, it is called an NP-complete problem.

I hope this helps in understanding computational complexity. I will write more later. I doubt this helps in understanding how to raise kids.

At this year’s DAC, Bill Dally, Stanford professor and Chief Scientist at Nvidia, gave a keynote speech in which he claimed that Amdahl’s law is not a law, but an observation. This uses the popular interpretation of law to mean “proven correct”. However, in scientific terms, stating something is a law has nothing to do with whether it is correct or not.

In scientific terms, a law is a relationship between two or more entities. Some well-known laws are:

  • Newton’s second law of motion
    • F = mA
  • Ohm’s law
    • E = IR
  • Newton’s law of universal gravitation
    • F = G*m1*m2/r^2

Newton’s second law relates force, mass, and acceleration. Ohm’s law relates voltage, current, and resistance. Newton’s law of universal gravitation relates the gravitational force between two bodies to their masses and the distance between them.

A law allows us to derive the value of an unknown quantity from other known quantities. Knowing voltage and resistance, Ohm’s law gives us current.

What quantities does Amdahl’s law relate? Amdahl’s law is a relationship between parallelism, number of processors, and speedup. It is usually used to compute the maximum speedup that can be obtained for an application given an infinite number of processors. Parallel speedup is limited by the portion of the application that is sequential (cannot be parallelized). For example, if 10% of the execution time cannot be parallelized, the maximum speedup obtainable through parallelism is 10X.
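
As a sketch of the relationship (the function name is mine, not a standard API):

```python
def amdahl_speedup(parallel_fraction, processors):
    """Amdahl's law: overall speedup when only parallel_fraction of the
    execution time can be spread across the given number of processors."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)

# 10% of the execution time is sequential (parallel_fraction = 0.9):
for cpus in (10, 100, 1000, 1_000_000):
    print(cpus, round(amdahl_speedup(0.9, cpus), 2))
# 10 -> 5.26, 100 -> 9.17, 1000 -> 9.91, 1000000 -> 10.0
# The speedup approaches, but never exceeds, 1 / 0.1 = 10X.
```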

So, Amdahl’s law does meet the requirements for being a scientific law. Then, what do we make of Dally’s statement? Other than trying to be provocative, I think the point he was trying to make is that, in the applications he is addressing, the sequential portion is essentially zero. In this case speedup is equal to the number of processors. But, this still follows Amdahl’s law.

It looks like this was a timely post. Just a few days after my last post, Harry (theAsicGuy) Gries posted an entry speculating that having thousands of computers available would revolutionize EDA algorithms. I think that, with an understanding of how computational complexity affects EDA algorithms, he might not have made these claims.

Let’s start by looking at what he said:

One of the key benefits of the cloud is that anyone can have affordable access to 1000 CPUs if they want it. If that is the case, what sorts of new approaches could be implemented by the EDA tools in addressing design challenges? Could we implement place and route on 1000 CPUs and have it finish in an hour on a 100M gate design? Could we partition formal verification problems into smaller problems and solve what was formerly the unsolvable? Could we run lots more simulations to find the one key bug that will kill our chip? The cloud opens up a whole new set of possibilities.

Harry is wondering how parallelism can help EDA algorithms. The goal of parallelism is mainly to allow completing a given task faster. Problems can be broken into two classes: those that are embarrassingly parallel and all others.

Embarrassingly parallel problems are easy to parallelize because problem instances are completely independent. In EDA, there are a number of embarrassingly parallel problems:

  • regression testing – you have thousands of tests. They are all independent. You can run them on as many machines as are available to complete the regression as quickly as possible.
  • synthesis – different blocks can be synthesized in parallel.
  • SPICE – run different corners in parallel.
  • placement – place blocks in parallel.

This is done routinely today. Even small companies have dozens, if not hundreds, of servers to do these and other embarrassingly parallel tasks.

But the real question Harry is asking is: can having thousands of CPUs help with the problems that are not embarrassingly parallel? Can a parallel simulator run a test a thousand times faster? Can we increase the capacity of formal verification by a factor of a thousand? Can we reduce place-and-route time to minutes?

Sorry, Harry, but the answer to these questions is: no. Suffice it to say that, if it could have been done, it would have been done already. It is not from lack of trying that EDA algorithms are not parallelized today. It is not from lack of CPUs either. There generally are no gains from running parallelized algorithms on even 10 CPUs, much less 100 or 1000. If any EDA company could design a parallel algorithm that gave a 10X speedup running on, say, 20 CPUs, it would have a huge advantage over the competition.

In short, if it was easy to get large speedups just by throwing more CPUs at the problem, it would be embarrassing that nobody has done that yet.

One of the projects I worked on, some time ago, was to improve the efficiency of the physical design process. The goal was to reduce the time required to get from RTL ready to tapeout from nine months to six months. We looked at all aspects of the flow to find bottlenecks and streamline the flow. One of the major problems that emerged from this was how little hardware designers understand computational complexity and its impact on the design process.

As an example, one of the issues that the logic designers wanted addressed in the physical design flow was the fact that scan nets were given the same priority as functional nets during place and route. Scan is used for testing only and is not performance critical, while the functional nets are very performance critical. Place-and-route tools generally don’t have the knowledge to know which nets are critical and which are not. The result is that the critical nets are routed less optimally than they could be.

The proposal to fix this was to do the place and route in two steps. First, a netlist would be created with no scan nets. This netlist would be placed. Presumably, the placer would place all registers optimally with respect to the critical functional nets only. This would guarantee that, when the critical nets were routed, they would be optimal. Second, a full netlist including all scan nets would be created, which would be placed and then routed. However, all registers would be “locked” into place as determined by the first placement run. Since these placements are supposedly optimal, the critical nets should end up being routed more optimally.

I told them that locking the registers in place for the second run was going to over-constrain the placer and would result in far less than optimal results, just based on understanding the computational complexity of place-and-route. The lead logic designer, who had proposed this scheme, argued that the placement was proven to be optimal, so it shouldn’t matter that the register placements were being locked down. There was a lot of discussion on this, but no amount of argument was going to convince this designer that his scheme wouldn’t work.

As an experiment, we decided to do a trial place-and-route on two different netlists with only minor differences between them. Neither had scan hooked up (comparing a netlist with scan to one without wouldn’t prove anything). The result was that, in one case, all the registers in one floorplan room ended up in one corner of the room and, in the other, they all ended up in the diagonally opposite corner. When the designer who proposed this scheme saw that, he said, “I didn’t expect that.” I said, “I did.” Needless to say, we abandoned this scheme shortly thereafter.

The fundamental property that this exhibits is non-linearity of response – a small change in the input causes a large change in the output. This property is inherent when using complex algorithms to create and verify designs. I don’t know if today’s place-and-route tools handle the scan net problem, but even if they do, I still encounter many situations in which there is an expectation of proportional response to a design change when reality is that the response is non-linear.

Formal verification, for example, is plagued by this problem. But, it can even occur in simulation. You are probably familiar with the experience of spending a lot of time tuning a constrained random environment to get a certain level of coverage. But, then the designer makes a minor change, and suddenly you get much lower coverage and then have to spend a lot of effort to get back to where you were. This particular problem is one that we are trying to address at Nusym with our intelligent test generation.

The bottom line for this unpredictable behavior is that it is unavoidable. But, I still find the expectation that design can be made predictable to be common (especially among managers). I plan to publish some posts trying to demystify computational complexity (complexity for dummies 😉). I will try to frame this in a way that even managers can understand, which, hopefully, should lead to better decision making with respect to planning the amount of effort that needs to go into using complex algorithms in design.

There was an interesting article on verification methodology in Chip Design Magazine recently. The author, Carl Ruggiero, works at an IP supplier and so doesn’t have any particular agenda to push with respect to verification methodology (unlike most of the articles in this magazine).

He makes the following points:

  • Quality of verification is not correlated to quantity of verification.
  • Directed testing and constraint-based random testing can both be equally successful or unsuccessful.
  • Quality of verification is not correlated to the language chosen.
  • Good verification planning, execution, and tracking are the keys to producing a high-quality (low-bug) design.

The interesting thing about this is that Ruggiero states that these things were surprising to him. If you understand the concepts of Bugs are Easy, none of these things should be surprising. Let’s try to put these statements in the context of the three laws of verification and the orthogonality concept.

We know that putting effort into multiple, orthogonal methods is better than putting all the effort into a single method. This alone can explain why a large verification effort can fail to produce a higher-quality design than a small one.

We know that directed and random testing are orthogonal methods and both are capable of finding the majority of bugs. We also know that either can appear to be efficient or inefficient depending on how it is deployed. Thus, it is not surprising that Ruggiero sees different groups having different levels of success with random and directed testing.

I’ll come back to languages in a minute.

His conclusion that good planning, execution, and tracking are the key to producing a high-quality design, however, runs counter to the principles of Bugs are Easy because it is essentially a statement that there is an absolute best methodology. First, I think Ruggiero would take issue with calling good planning a methodology. After all, isn’t good planning essential to any successful endeavor? How could this be a methodology if all methodologies require good planning? Well, it turns out that back in 1996, two researchers from DEC proposed a verification methodology in which the central tenet was to not do any planning (Noack and Kantrowitz, DAC, 1996) (sorry, I can’t find an online version to link to). Their reasoning, which should sound familiar, was that no matter what you did at the beginning of verification, you would find bugs, so why bother spending a lot of time planning up front? We used this methodology successfully on the MCU chip that I worked on at HAL.

So, planning is a methodology, and not planning is also a methodology; both are therefore subject to scrutiny using the laws of verification. The fact that planning was successful does not mean it is the best methodology. Not planning has also proven successful.

Now let’s return to the issue of languages. Ruggiero states that he has seen simple Verilog-based environments produce high-quality designs and complex HVL-based environments produce low-quality designs. He goes on to say:

…a commitment to execute it turned out to be far more important than the tools chosen to implement it…

where, in this context, “tools” refers to languages. This conflation of tools with languages is made more explicit in his concluding paragraph:

…methodology matters far more than tools in delivering working hardware designs. While certain EDA languages can make engineers more productive…

There is a clear assumption in his mind as he equates tools and languages: languages are technology. That is, there is something about advanced languages that makes testbenches written in these languages more likely to find bugs or get higher coverage or whatever. While advanced languages certainly are useful and enhance productivity by providing features that you would otherwise have to create manually, nothing about them is inherently smarter or better with respect to finding bugs. It’s like saying it’s better to design in English than in Chinese. Or that if you have power steering in your car, you are less likely to get lost than if you have manual steering. The “languages are technology” argument makes no sense, but as Ruggiero’s article shows, this mindset is pervasive.

Ruggiero correctly concludes that languages are not important to final quality, but he misses the more fundamental conclusion: languages are not technology.

I talked previously about comparing methodologies and how the key to choosing among methodologies is to choose those that are orthogonal. I also talked about how to decide whether a new method is worth doing or not. In this post, I will compare two methodologies, simulation and formal, in more detail using the concepts developed in those previous posts.

When starting out with simulation, you may write some simple directed tests or the simplest random environment that exercises only the most basic cases. This represents a very over-constrained environment; that is, it tests a very small subset of the legal input space. This is insufficient to find all bugs, so generally you will modify the environment to relax the constraints so that it can test more of the input space. New bugs are found as new parts of the input space are explored.

Generally, the hard part of increasing the input space is refining the checker. For example, it is easy to inject errors on the input side, but extremely difficult to check the results – should the packet be discarded? Which packet? How many packets? Another example is configuration. The initial environment usually exercises only a single configuration. Each configuration bit adds some amount of work to verify correctness and, in a complex system, there are usually many configuration bits and combinations of configuration bits to be tested.

As a consequence of this, it becomes more time-consuming to increase the input space that is being exercised. And the closer it gets to the exact legal input space, the harder it becomes to get there. This is one of the main causes of the verification bottleneck. At the same time, there is no way to short-circuit this, because not getting to the point of exercising the complete input space means potentially missing easy bugs that could be showstoppers.

This is where assertion-based verification (ABV) comes in. Rather than starting with a very over-constrained environment and then refining it by gradually loosening the constraints, ABV starts with a highly under-constrained environment and, as constraints are added, gradually restricts it toward the exact legal input space.

The problem in reducing the over-approximation inherent in assertions is analogous to expanding the under-approximation in simulation. The input space is restricted by adding more constraints. The closer the input space gets to the legal input space, the harder this becomes. The reasons are similar to simulation, but there is an additional factor with assertions, namely, that it is basically impossible to specify high-level behavior using assertions, which means it is not possible to completely verify a design using assertions only. This means that you will never get to the point of the constraints specifying the exact input space; it will always be under-constrained to at least some extent.

[Figure: Effort required to constrain simulation and formal to the exact legal input space]

So, why use assertions at all? The fact that assertions approach the problem from the opposite end of the input space spectrum means that assertions are an orthogonal method of verification. They force the user to view the design in a very different way. This has two benefits. First, bugs that are hard to find using simulation may be easy to find using assertions and/or formal. Second, assertions make it easier to debug because the error is generally caught closer to the source of the bug (in fact, I believe this is the prime advantage of using assertions).

What is the downside of using assertions and formal? First, since it is not possible to completely verify using assertions alone, nobody is going to abandon doing random and directed testing and, generally, this will be done first. As we have seen before, whatever is done first will find the most bugs. This means that assertions and formal generally are relegated to finding a small number of bugs. This is OK, since these methods are orthogonal to random and directed testing, as long as it is not too much effort.

Unfortunately, writing assertions is very time-consuming, error-prone, and, because the environment is under-constrained, prone to false failures (reported “bugs” that are not real) that cause a lot of frustration and wasted effort. Using formal is another large effort on top of writing the assertions in the first place.

Bottom line, assertions and formal are orthogonal methods to simulation and, therefore, useful. But their return on investment is low compared to simulation. If you have the budget, time, and manpower, or verification is a critical problem for you, they are probably worth doing; otherwise, they are probably not.

What is formal verification? Today, even if you ask experts, you will not get a clear answer.

Here are a few definitions you will find:

  • “formal verification is the act of proving or disproving the correctness … using formal methods of mathematics.” <wikipedia>
    • “formal methods are a particular kind of mathematically based techniques for verification …” <wikipedia>
  • “Use of various types of logic and mathematical methods to verify the correctness of IC logic…” <edacafe>
  • “Establishing properties of hardware or software designs using logic, rather than (just) testing…” <nist>

None of these definitions is correct, or even meaningful, for that matter. What they are trying to get at is that formal verification is not simulation. There is somehow mathematics involved, but it is not clear how or why this is significant.

The sad part about this is that these definitions largely come from experts in the formal verification world. The fact is researchers don’t really know how to define formal verification in a way that clearly differentiates it from simulation. Considering that most researchers in the area have a theoretical background and tend to look down their noses at simulation as a verification method, this identity crisis is a real problem for them.

The real goal of formal verification is to prove things correct rather than merely to test them. To understand the origin of the crisis, we first have to understand what it means to prove something. There are two fundamental properties of any proof method: soundness and completeness. A proof system is sound if what it proves is also true. For example, we would not want a proof system that proves 2 + 2 = 5. A proof system is complete if everything that is true is proven. Suppose we have a correctly working adder. To prove it correct, a complete proof system would have to establish correctness for all possible input values, since the claim of correctness covers every one of them.

Soundness underlies any verification method. Simulation and formal verification are both sound. When university research labs produced the first practical formal verification method, BDD-based model checking, in the early 1990s, it was easy to differentiate formal verification from simulation. Formal verification was complete and simulation was incomplete. But, very quickly it became apparent that these first industry-ready formal verification tools were still far too capacity-limited to be useful on real designs.

Semi-formal verification was proposed as the way forward. What is semi-formal verification? It’s formal verification, except it’s not complete. The idea was to give up on proving things and just focus on finding more bugs. But, simulation is also incomplete. So, what’s the difference between the two? And, can we come up with a better definition of formal verification (including semi-formal verification) that clearly distinguishes it from simulation in a useful way?

The answer comes, somewhat obviously, from the definition of the word formal. Surprisingly, though, the word formal does not refer to the use of mathematics. Here is a definition from Webster’s:

formal: … b) relating to or involving the outward form, structure, relationships, or arrangement of elements rather than content…

In the context of verification, the word formal refers to looking at the structure of the design rather than its behavior. For example, suppose our adder is composed of gates. A formal-verification-based proof system would represent the structure of these gates (types and connections) as symbols and create equations describing the relationships between them. The adder would then be represented by one (very large) equation. Then, to prove the adder correct, you could create the equation:

enormous_equation_representing_structure_of_adder = A+B

Then it is a “simple” matter of doing algebraic manipulation to show that both sides of the equation are the same, similar to how one would prove the algebraic identity, “2A – A = A”. Once you were done, you would have a complete proof because the symbols A and B represent all possible values of inputs.

OK, that is more an explanation of how formal verification works than its essence and it doesn’t explain what incomplete formal verification is. The key point here is the use of symbols to represent structure and symbolic manipulation to prove results. If a proof system is using symbols and manipulating them directly, it is formal in some sense. Incomplete formal verification just means that we did not prove it for all possible values of the symbols. For example, suppose for the equation “2A – A = A”, we could only prove it for A>0. This is incomplete because we did not verify the equation for all values. This is still better than simulation because we did prove it for many more values than if we simply tested a large number of values. This is the rationale for semi-formal verification; that it can cover more cases than simulation, but can be incomplete.
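
As a toy illustration of the symbolic idea (using Python’s sympy library, which is my choice here, not something from the post): the symbolic check establishes the identity for every value of A at once, while the “simulation” check covers only the handful of values it happens to try.

```python
import sympy as sp

A = sp.Symbol('A')

# "Formal": manipulate the symbol itself. If the difference simplifies to zero,
# the identity 2A - A = A holds for every possible value of A.
print(sp.simplify(2*A - A - A) == 0)   # True: proven for all A

# "Simulation": plug in concrete values. Each check covers only that one value,
# and nothing tells us which values we have not tried.
for value in (0, 1, -7, 3.5):
    print(2*value - value == value)    # True, but only for these four values
```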

But, just saying semi-formal can potentially cover more cases than simulation is still not a clear differentiation from simulation. There has to be something more. The answer lies in the use of symbols and symbolic manipulation. Even if incomplete, a system that uses symbols has an idea of what has not been proven in addition to what has been proven. This means that an incomplete proof system can give the user information about what was not proven. Simulation-based proof systems only give information on what was proven.

Therefore, a useful definition of these different proof methods is:

  • simulation: sound, incomplete, not aware of what is not proven.
  • semi-formal verification: sound, incomplete, aware of what is not proven.
  • formal verification: sound, complete.

Over the years, there has been much discussion and argument over which of several competing methodologies is the best one for verification. Competing papers have been written arguing opposite sides of this question, even to the point of each including data “proving” their side.

As a case study, let’s take the example of random vs. directed testing. A lot has been published over the last twenty years comparing these two methods. Some papers state that random testing finds bugs more efficiently than directed testing. Others state the opposite. No conclusion has emerged from all the conflicting data. Why is this? Is there any way to resolve it?

Using the framework of the three laws of verification, it is possible to synthesize a consistent explanation for all the conflicting data. First, to recapitulate:

1st law: the bug rate is high at the beginning of the verification effort, low at the end.

2nd law: bug hardness is subjective.

3rd law: bug hardness is independent of when bugs are found during the verification process.

Let’s assume that random and directed testing will each take six months of effort and each will find 90% of the bugs.

Suppose we do random testing first. We spend six months developing a random testbench, running it, and finding and fixing 90% of the bugs. If we now spend six months developing directed tests, the best we can accomplish is to find 90% of the remaining 10%, or 9% of the bugs. But the amount of effort was the same. Conclusion: random is much more efficient than directed testing.

Now suppose we did directed testing first, followed by random. Now, directed testing would find 90% of the bugs with six months of effort and random would find only 9% with the same effort. Conclusion: directed testing is more efficient.

Thus, measured efficiency is a function of the order that verification methodologies are applied. Whichever is done first will look the most efficient. This is a direct result of the first law of verification. Without context on how these methods are applied, there is no way to judge the conclusions presented.
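
A toy model of this order effect (the 90% figure is the assumption from the example above, not a measurement):

```python
def bugs_found_in_order(total_bugs, effectiveness_in_order):
    """Each method, run in order, finds the given fraction of the bugs
    that are still remaining when it starts."""
    remaining = total_bugs
    found = []
    for fraction in effectiveness_in_order:
        caught = fraction * remaining
        found.append(caught)
        remaining -= caught
    return found

print(bugs_found_in_order(1000, [0.9, 0.9]))  # [900.0, 90.0]
# Whichever method runs first finds 90% of all bugs; the second, equally
# capable method is left with only 9% to find.
```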

Another issue is that papers often conclude with a statement like “…and we found a bug that wasn’t found before”, with the implication that this method is better than the previous methods. We can also examine this in the context of the three laws of verification.

The best verification methodology is one that combines as many orthogonal methods as possible. Methods are orthogonal if bugs that are hard to find in one method are easy in the other (second law), or bugs that would be found later in one method are found earlier in the other (third law).

Consequently, orthogonality is the best metric to use when comparing methods. The fact that one method finds a bug that was missed by another method is an indication that the two methods are orthogonal. But, that is not the whole story. We need to take into account the scope of the different methods to determine which is actually better.

People often infer from statements such as, “we found a bug that was missed”, that the method would have found all the bugs that were found previously as well and, therefore, that this method can replace the old method. This is rarely the case. In the worst case, it could be that the new method can find only this one bug.

To judge two methods fairly, it is necessary to do two experiments. Let’s say we want to evaluate whether to replace method A with method B. First, we do method A followed by method B. Let’s say method A finds 100 bugs and method B finds 10 bugs. Method B looks good because it found bugs that were missed by method A, which, because it’s our current method, we believe to be thorough.

Second, we try method B first followed by method A. If method B finds 20 bugs and method A finds 90 bugs which method B missed, we would say that method A is vastly superior even though the first experiment showed that method B had promise. If instead, method B found 90 bugs and method A found ten bugs, then you could claim they are equal (assuming equal effort). Only if method B found more than 90 bugs when done first could any claim of it being better be made. This is a very tall order, which is why most of the verification papers written with results claiming to have found bugs that were missed generally don’t have much impact.

So, what have we learned? First, lacking context, claims of one method being more effective, more efficient, or simply better are meaningless. Second, rather than focusing on which methods are best, focus on which methods are most orthogonal. Methodologies often reflect mindset. Orthogonal methodologies force different mindsets so that simple bugs don’t slip through, which is the key to ensuring the highest probability of success.

Somehow, the powers that be decided that my post on languages was worthy of a wider audience. We made it to Deep Chip! Deep Chip is John Cooley’s site that is read by every hardware designer on the planet.

Here is the link:

http://deepchip.com/wiretap/090219.html

If anyone wants to comment on this and somehow finds this post (unfortunately, John doesn’t allow “self promotion”), please feel free to post comments.