Skip navigation

Category Archives: design

In an article in a recent issue of Computer entitled “Really Rethinking Formal Methods”, David Parnas questions the current direction of formal methods research. His basic claim is that (stop me if this sounds familiar) formal methods have too low ROI and researchers, rather than proclaiming the successes, need to recognize this and adjust their direction. As he so eloquently puts it:

if [formal methods] were ready, their use would be widespread

I haven’t spent a lot of time trying to figure out if his proscriptions make sense or not, but one thing stood out to me. He talks about a gap between software development and older engineering disciplines. This is not a new insight. As far back as the 60’s, the “software crisis” was a concern as the first large complex software systems being built started experiencing acute schedule and quality problems. This was attributed to the fact that programming was a new profession and did not have the rigor or level of professionalism of engineering disciplines that had been around for much longer. Some of the criticisms heard were:

  • programmers are not required to have any degree, far less an engineering degree.
  • programmers are not required to be certified.
  • traditional engineering emphasizes using tried and true techniques, while programmers often invent new solutions for every problem.
  • traditional engineering often follows a rigorous design process, programming allows hacking.

These explanations are often used as the excuse when software (usually Microsoft software) is found to have obvious and annoying bugs. But is this really the truth? Let’s look at an example of traditional engineering to see if if this holds up.

Bridge building is technology that is thousands of years old. There are still roman bridges built two thousand years ago that are in use today. Bridges are designed by civil engineers who are required to be degreed, certified engineers. Bridge design follows a very rigorous process and is done very conservatively using tried and true principles. Given that humanity has been designing bridges for thousands of years, you would think that we would have gotten it right by now.

You would be wrong.

Even today, bridges are built with design flaws that result in accidents and loss of life. One could argue that, even so, the incidence of design flaws is far less in bridges than in software. But this is not really an apples to apples comparison. The consequences of a bug in, say, a web browser are far less than a design flaw in a bridge. In non-safety critical software, economics is a more important factor in determining the level of quality of software. The fact is, most of the time, getting a product out before the competition does is economically more important than producing a quality product.

However, there are safety critical software systems, such as airplanes, medical therapy machines, spacecraft, etc. It is fair to compare these systems to bridges in terms of catastrophic defect rates. Let’s look at one area in particular, commercial aircraft. All commercial aircraft designed in the last 20 years rely heavily on software and, in fact, would be impossible to fly if massive software failures were to occur. Over the past 20 years, there have been roughly 50 incidents of computer-related malfunctions, but the number of fatal accidents directly attributed to software design faults is maybe two or three. This is about the same rate of fatal bridge accidents attributable to design faults. This seems to indicate that this gap between software design and traditional engineering is not so real.

The basic question seems to boil down to: are bridges complex systems?  I define a complex system as one that has bugs in it when shipped. It is clear that bridges still have that characteristic and, therefore, must be considered as complex systems from a design standpoint. The intriguing question is, given that they are complex systems, do they obey the laws of designing complex systems? I believe they do and will illustrate this by comparing two bugs, one a bridge design fault and another a well known software bug.

The London Millennium Footbridge was completed in 2000 as part of the millennium celebration. It was closed two days after it opened due to excessive sway when large numbers of people crossed the bridge. It took two year and millions of pounds to fix. The bridge design used the latest design techniques, including software simulation to verify the design. Sway is a normal characteristic of bridges. However, the designers failed to anticipate how people walking on the bridge would interact with the sway in a way to magnify it. The root cause of this problem is that, while the simulation model was probably sufficiently accurate, the environment, in this case, people walking on the bridge, was not accurate.

This is a very common syndrome in designing complex hardware systems. You simulate the chip thoroughly and then when you power it up in the lab, it doesn’t work in the real environment. I describe an example of this exact scenario in this post.

In conclusion, it does seem that bridges obey the laws of designing complex systems. The bad news is that the catastrophic failure rate of safety-critical software is of roughly the same magnitude as that of bridges. This means that we cannot expect significant improvements in the quality of software over the next thousand years or so. On the plus side, we no longer need to buy the excuse that software development is not as rigorous as “traditional” disciplines such as building bridges.

The computational complexity class, NP-hard, is at the core of a number of problems we encounter on a daily basis, from loading the dishwasher (how do I get all these pots to fit?) to packing a car for a vacation, to putting together a child’s train tracks.

If we look at these things, they have several things in common. First, they each involve a potentially large number of parts (pots, luggage, pieces of track) that need to be put together in some way. Second, we want to meet some objective, such as fitting all the dishes in the dishwasher. Third, there are a large number of constraints that must be met. In the case of loading the dishwasher, no two dishes can be put in the same place. There are N^2 constraints just to specify this, among many others. A fourth characteristic is that we may get close to an optimal solution, but find it difficult and not obvious how to get to a more optimal one (just how are we going to fit that last pot in the dishwasher). Furthermore, getting from a near optimal solution to an optimal one may involve a complete rearrangement of all the pieces.

One way to solve problems like packing a dishwasher is to view it as a truth table. Each dish can be put in one of, say, 100 slots, in, say, one of ten different orientations. This results in 1000 combinations, requiring 10 bits. If there are 40 dishes, 4000 bits are required to represent all possible configurations of dishes in the dishwasher. The resulting truth table is vast. Each entry in the table indicates how much space is left in the dishwasher if dishes are put in according to the configuration of that entry. A negative number indicates an infeasible solution. There will be many invalid configurations which have two or more dishes occupying the same location. We give all of these entries a large negative number.

The resulting table describes a landscape that is mostly flat with hills sparsely scattered throughout. We can also imagine that this landscape is an ocean in which negative values are under water and positive values represent islands in the ocean. The goal is to find the highest island in the ocean. We start in some random location in the ocean and start searching. We may find an island quickly, but it may not be the highest one. Given the vastness of the ocean, it is understandable why it can take a very long time to find a solution.

But, wait a minute. What about polynomial algorithms like sorting? A truth table can be constructed for these also. For example, to sort 256 elements, we can create 8 bit variables for each element to describe the position of that element in the sorted list. The value of each entry would indicate the number of sorted elements for that configuration. The complete table would again be around 4000 bits and have vast numbers of infeasible solutions in which two or more elements occupy the same slot in the list and only one satisfying solution. Yet, we know finding a solution is easy. Why is this?

The ocean corresponding to the sorting problem is highly regular. If we are put down in an arbitrary point in the ocean, we can immediately determine where to go just be examining the current truth table entry (point in the ocean). Knowing the structure, we may be able to determine from this that we need to go, say, northeast for 1000 miles. We may have to do this some number (but polynomial) times before getting to the solution, but is guaranteed to get to the solution. Structure in a problem allows us to eliminate large parts of the search space efficiently.

In contrast, for an NP-hard problem, there is no guarantee of structure. Furthermore, as we are sailing around this ocean, we are doing so in a thick fog such that we can only see what is immediately around us. We could sail right by an island and not even know it. Given this, it is easy to see that it could take an exponential amount of time to find a solution.

But then, how do we account for the fact that, often, NP-hard problems are tractable? The answer to this question is that there usually is some amount of structure in most problems. We can use heuristics to look for certain patterns. If we find these patterns, then this gives guidance similar to the sorting example above. The problem is that different designs have different patterns and there is no one heuristic that works in all cases. Tools that deal with NP-hard problems usually use many heuristics. The trouble is that, the more heuristics there are, the slower the search. At each step, each of the heuristics needs to be invoked until a pattern match is found. In the worst case, no pattern match will be found meaning it will take an exponential time to do the search, but the search will be much slower due to the overhead of invoking the heuristics at each step.

I hope this gives some intuition into NP-hard problems. In future posts I will talk about even harder classes of problem.

Problems that we use computers to solve can be divided into three basic classes:

  • easy to solve
  • hard to solve
  • impossible to solve

Pretty much all the algorithms we use day-to-day fall into the easy class. Text editing, web browsing, searching, even problems such as weather prediction, which run on supercomputers, are examples of easy to solve problems. These problems are solvable with algorithms that are called polynomial-time (P). This means that, given a problem size of N, an algorithm exists to solve the problem in time that is proportional to a fixed power of N, written as O(N^k). For example, sorting a list of N words is solvable in O(N^2) time.

The class of hard-to-solve problems pretty much includes all problems involved in the creation and verification of a design. Synthesis, place-and-route, formal verification, all fall in this class. You  probably have heard the term NP-complete. NP-complete is just one class of many in the hard-to-solve category. The class of NP-complete problems is the easiest of the hard classes, which means that, often, these types of problems can be solved reasonably well. The hardest of the hard-to-solve classes that you are likely to encounter  is the PSPACE-complete class. Problems in this class are, for all intents of purposes, intractable. We will look at these two classes, plus another one that occurs frequently in design problems, the NP-hard class.

Both the easy and hard-to-solve classes are at least theoretically possible to solve. The last group, the impossible-to-solve group, consist of the set of uncomputable problems. These problems mostly are of theoretical interest, which makes sense, since nobody is going to make any money trying to solve problems that cannot be solved.

One note about complexity classes: I refer to the complexity of problem classes, not algorithms. For example, sorting is a problem, shell sort is an algorithm that solves the sorting problem.  So, saying sorting is an easy problem means that there exists some algorithm that solves it easily. Saying a problem is hard means there is no algorithm that solves it easily.

Since all design and verification related problems fall into the hard-to-solve category, this is what I will talk about most. However, the boundary between easy-to-solve and hard-to-solve problems gets very blurry when we start to look at problems in the real world.

But, before talking about this, let’s first look at a basic property of these different classes – scaling. Suppose we had an algorithm that computed a result for a problem instance of size N (say, sorting a list of size N) in time T. Now suppose we had a processor that ran ten times faster. How much larger problem size could our algorithm handle in the same amount of time T? The following table shows how N grows for a CPU performance increase of 10X for different complexities:

  • linear O(N) N -> 10N
  • O(N lgN) N ->  5.7N
  • O(N^2) N -> 3.2N
  • O(2^N) N -> N+3.4

What this shows is that capacity gains from increased processor performance  diminish with increased problem complexity. For algorithms that have exponential complexity, there is basically no significant gain in capacity from increased processor speed. For example, suppose I had an exponential verification algorithm that could handle 100 variables. Having a 10X faster processor would allow me to process 103 variables in the same amount of time. Considering that Moore’s law implies that the number of variables doubles every year or so, this does not bode well for our ability to keep up.

But, the fact is that verification does  manage to keep up with increased design sizes, albeit with difficulty. So, verification is not always an exponential problem. We will explore the reasons for this in the next post.

Verification is often considered the wicked step-child of design. Most designers want to design because to design is to create. There is not an ounce of creative work that goes into verification. Designers tell verification engineers what tests to write and the verification engineers job is simply to code these tests with little or no input on their part on the design of the tests. It’s not surprising, then, that most engineers want to be designers.

However, this view of verification engineering misses the point. In fact, looked at from a non-engineering point of view, verification can actually be viewed as the more critical task and, more importantly to the verification engineer, the more prestigious task. It all comes down to how one defines the job ‘engineering’ as opposed to, say, ‘scientist’, or ‘artist’. From a societal point-of-view, each one of these jobs is equally important. In fact, it is often the case that scientists look down their noses at engineers because engineers simply use knowledge that scientists discover. Engineers point out that most of today’s technological marvels are created by engineers, not scientists. Artists look down on both since neither of them creates anything beautiful.

The relationship between design and verification is analogous to the relationship between engineers and scientists. Thus, it is understandable that designers can look down on verification and, at the same time, verification can rightly hold a position of respect equal or higher than design.

The task of engineering can be viewed as a combination of art and science. Science is defined as the discovery of knowledge. This is usually taken to mean some fundamental knowledge about the universe, but can, in a broader sense, apply to any knowledge. For example, suppose no one has ever designed an 10G Ethernet controller in 65nm technology and you were tasked with doing exactly that. Clearly, it would be necessary to first gain some knowledge of how to do this. Since no one has done this before, you essentially would be performing the role of a scientist while gaining this knowledge. In fact, all through the design process, most of the work will be taken discovering knowledge about how to design such a product. Generalizing broadly, we can say that 99% of engineering is science.

The primary tool at the scientist’s disposal is the scientific method: hypothesize, perform experiments, conclude results. Usually, hypothesizing and concluding require virtually no time, making experimentation the most time consuming part. The most time consuming part of this effort is designing the experiment. Designing is engineering. Consider the Apollo moon program. The actual science conducted in this project was a few hours gathering moon rocks. Engineering the experimental apparatus, which is to say, the moon rocket required thousands of man-years of effort. Thus, again generalizing broadly, we can say that 99% of science is engineering.

Art is the creation of beautiful objects, engineering is the creation of useful objects. Although the output of these acts is very different, the actual process has many similarities. Most people understand that engineering useful objects is a very rule-bound process. The creation of art is also a very rule-bound process. In painting, beauty comes about because of a detailed understanding of color, composition, materials, and a number of other factors. It is possible for you or I to create art without understanding these rules, but the probability of creating something beautiful is as remote as if one randomly soldered together some resistors, capacitors, and transistors and it ended up being something useful. Music, in particular, has intricate rules that often leave little choice in what notes to choose during composition. It is often thought that genius is the ability to break rules and find new and better rules. In fact, the opposite is true. Whether it be Mozart or Einstein, geniuses do sublime work by strictly following the rules. In fact, it is their deeper understanding of the rules that allows them to see better what can be done.

We can define engineering as combining the creativity of art and the discovery of knowledge of science. Design is the artistic side, but which still requires strict discipline in following rules, which are given by specifications and other design rules. At the same time, design requires figuring out how to design something which has never been designed before. There clearly is some up front scientific discovery needed.

Verification is like science. A verification engineer is given a design, about which very little is known. In particular, it is not known whether there are any bugs in the design or what they might be. The goal of the verification process is to discover any bugs. This, more than anything, is science. Thus, if design brings out the artist in an engineer, verification brings out the scientist. Both are noble.

The training of an engineer is highly disciplined and is done with a heavy dose of the scientific method. Why, then, do engineers disdain verification so much? Much of the blame can be laid on the need for instant gratification. At the very beginning, both verification and design require some amount of discovery of knowledge in the form of studying the specification. After this, the first step of design is purely creative, the fun part of design. The first step in verification is pure drudgery, setting up test benches, scripts, and writing endless tests. After the creative part of the design is done, the rest is pure drudgery. The fun part of verification occurs after all the drudgery is complete, and that is when the verification engineer starts discovering bugs in the design.

There is no greater joy than discovering a bug that reveals some fundamental mis-understanding about the design. This new knowledge can have a dramatic effect on the design teams perception of the design. This effect is not unlike that of a scientist discovering something that contradicts some well-established principle of the natural world. The most famous example of this is probably Einstein’s Theory of Relativity which changed how we view fundamental laws of physics. This analogy is more instructive than appears. Newtonian mechanics work most of the time. Relativity corrects those cases in which it fails. But, Relativity does not simply tack additional refinements on to Newtonian mechanics. It provides a different point of view and shows how Newtonian mechanics are a special case of this new point of view. Although many bugs found during verification are simple refinements of something that is correct, bugs are found which show the true principles of how a design works and that the initial design was simply an instance of these principles that happened to work most of the time. It is these bugs that provide the real satisfaction to verification engineers.

To find a Buddha, you have to find your nature.
– Bloodstream Sermon

We perceive the world as an abstraction of reality. When we look at a tree, we see a tree, we don’t see all the individal branches and leaves that make it up. Even if we viewed a tree that way, branches and leaves are still abstractions. Even if it were somehow possible to view a tree as all the individual molecules that make it up, that is still an abstraction. There is no escaping abstraction in how we relate to the world.

A monk asked Dongshan Shouchu, “What is Buddha?” Dongshan said, “Three pounds of flax.”
– Zen Koan

The core of Zen Buddhism is trying to see past the abstractions that our minds crave so desperately in order to make sense of reality. There are very few people who find enlightenment because of the power that abstraction holds over our minds. To the enlightened, those who see reality as it is, questions framed in terms of orthodox abstractions make no sense.

This is the fundamental issue that makes creating anything hard. What we create is real, but how we view it is through the lens of abstraction, and there is no escaping this. We generally believe that we conceive at a high level of abstraction and implement at a low(er) level of abstraction. We strive to increase the level of abstraction because we believe this is the way to improve productivity.

Conventional Wisdom

Productivity vs. Abstraction Level: Conventional Wisdom

This turns out not to be entirely true. The closest concrete representation of how we conceive of a design is the written specification. As reviled as it is, this document best represents our intentions at the level of abstraction that we conceive of the design. But, if we look at an average written spec., we find that it encompasses many levels of abstraction. We find:

  • textual descriptions that are written at a very high level of abstraction, “it’s a bus that has a processor, memory controller, and I/O controller sitting on it.”
    • structural, behavioral, data, and temporal abstraction.
  • Block diagrams of major subunits showing how they are connected.
    • unabstracted structure at a high level, structural abstraction of blocks, behavioral, data and temporal abstraction.
  • Equations, code snippets, gate-level diagrams.
    • basically no abstraction.
  • truth tables
    • structural abstraction only
  • waveforms
    • structural abstraction, behavioral abstraction, unabstracted data, time.

and many other levels of abstraction in between. What this indicates is that as we conceive of a design, we jump around to different levels of abstraction as we consider different aspects of the design. And this holds even if we try to move the level of abstraction up. A new abstraction level just adds to the set of abstraction levels that we can use. What this means is that there is a law of diminishing returns as we move up in abstraction.

Law of Diminishing Returns

Productivity vs. Abstraction: Law of Diminishing Returns

Even more importantly, productivity doesn’t increase monotonically as we raise the level of abstraction. Design is the process of translating from the abstractions that we conceive to those required by the implementation. The greater the gap between these, the lower our productivity in implementing the design.

The conventional wisdom that productivity increases as abstraction increases is based on analyzing just two points on the abstaction continuum. If we plot productivity vs. abstraction level, the only points that we really can count on to improve productivity are those that are close to the natural abstractions at which we conceive.

Accounting for Translation from Natural to Implementation Abstraction Level

Productivity vs. Abstraction: Accounting for Translation from Natural to Implementation Abstraction Level

Productivity falls off rapidly as the gap between implementation and conception abstraction levels increases. This is one of the reasons that it has been such a struggle to raise the level of abstraction in design. Finding the right fit that matches how we think at the highest levels of abstraction is exceedingly difficult, but is key if we want to continue to improve productivity by raising the level of abstraction at which we design.

The obvious solution to the problem of increasing complexity is to raise the level of abstraction at which we design. If we look at the languages used to design software and hardware, the level of abstraction has not increased significantly in over 30 years in software and 20 years in hardware. Why is this if it is such an obvious and compelling solution to the complexity problem?

To understand this, we first must understand why raising the abstraction level increases productivity. In the “Mythical Man-Month”, Brooks concludes that the number of bugs per line of code is constant regardless of what level of abstraction you are working at. Designing at higher levels of abstraction requires less code for the same functionality. Therefore there are fewer bugs per function when coding at a higher level of abstraction, resulting in higher productivity.

So, the goal of abstraction is to reduce the amount of code written to achieve the same level of functionality. The improvement in productivity was dramatic when the move was made from assembly language to the first high-level languages such as FORTRAN and Algol. The procedural programming language, C, became the dominant language in the seventies. C maintained its dominance despite the introduction of languages such as APL and LISP, which did significantly increase the level of abstraction.

Since then, object-oriented programming (OOP) has become the standard programming methodology. But, does OOP represent an increase in the level of abstraction? We can answer this by looking at the four types of abstraction. OOP languages have the same level of structural abstraction as non-OOP procedural languages. The fact that OOP allows member functions and private data doesn’t change the level of structural abstraction, it just allows abstraction errors to be detected at compile time. There is no difference in behavioral abstraction. Both methodologies use the same data abstractions so there is no difference there and temporal abstraction does not really apply to programming languages. Thus, they are at the same level of abstraction.

Have there been any increases in the level of abstraction in programming that have had a significant impact? Scripting languages such as Perl, in which variables are declared implictly, are a form of data abstraction since the implicit declaration hides these details. Garbage collected languages such as Java hide the details of construction/destruction. Templates, such as is found in C++, are another form of data abstraction. All of these innovations have had some effect on improving productivity, but none has had the dramatic impact that the move from assembly to high-level languages had. In summary, there has been no sigificant increase in the level of abstraction of programming languages in over 30 years of software design.

In hardware design, the story is similar. Initially, IC design entry was done by drawing the masks directly. A significant jump in abstraction level occurred when most designs were entered at the gate-level with tools automating the creation of masks. RTL synthesis was the next major jump in abstraction level. Since then, there have been attempts to increase the abstraction level, primarily by trying to do temporal abstraction, but these have not caught on.

We are left with the question of: how have we been able to design much more complex systems with languages that have not increased in their level of abstraction in decades?

The answer is reuse. Today, no complex software or hardware design is built from scratch. We use either standard building blocks or IP blocks, either externally or internally developed. Today, SOCs are designed which consist of a small amount of custom-designed logic which glues together a number of existing IP blocks. The software that runs on these SOCS consists of off-the-shelf operating systems, drivers, and libraries with a few simple applications. Using this methodology, very complex devices can be developed very quickly.

Does this mean that abstraction is not the answer or is not even relevant? No. When we analyze this, we find that reuse is just another way of raising the abstraction level. An IP block or library is an implementation of some behavior. The block itself is a black box; that is, we don’t care how it is implemented. In abstraction terms, when using this block, we have effectively abstracted away all its structure and only care about its behavior. Thus, reuse is a way of dramatically increasing the level of structural abstraction. This is why the abstraction level has not raised in languages in many years. There has been no need because reuse has achieved the necessary results.

Reuse is likely to continue to be the dominant method of raising the abstraction level for the foreseeable future. It is not without its problems, however. First, IP blocks and libraries are generally very inflexible. IP reuse works best when fitting a round peg into a round hole and the benefit/cost ratio drops off rapidly the worse the fit. If the IP block that is available doesn’t quite fit what you want, the temptation is to use it anyway. More often than not, it requires less effort to design a round peg from scratch than try to make a square peg fit, but designers are forced to reuse an ill fitting block anyway.

The other problem with reuse is with verification. From the designer’s viewpoint an IP/library block is a black box. But from a verification standpoint, it is not. First, the IP block is implemented at a low level of abstraction. Therefore, when executing the code for verification purposes, execution occurs at the lowest level of abstraction, making run time a bottleneck. Second, it is usually necessary to exercise all of the code in the IP/library block to ensure correctness. But, this is a very time consuming task because the low level of abstraction means that there are many cases that need to be tested.

In conclusion, from a design standpoint, reuse is the best way today of achieving increasingly higher levels of abstraction. From a verification standpoint, the issues with reuse fuel a desire to find languages that raise the level of abstraction and this is true even at the block level. Because language decisions are primarily driven by design considerations rather than verification, there has been no incentive to increase the level of abstraction in languages, which is why methodologies such as behavioral synthesis and ESL have not caught on.

In my previus post on abstraction, we were left with the question of what makes a valid abstraction. For example, suppose we have a gate-level design of an adder. If we write:

    a = b * c;

we can see that this is a higher level of abstraction, but we would not generally consider it a valid abstraction of an adder. A valid abstraction would be:

    a = b + c;

In general, functions are not as simple as adders and multipliers. We need to define some way of determining whether a design is a valid abstraction of another design.

A valid abstraction is one that preserves some property of the original design.

In RTL, the property of interest is that the abstract function should produce the same output for all inputs, i.e. functionality is preserved. What abstractions are being used in RTL? Let’s look at our adder example. There is structural abstraction because we have eliminated all gates. There is data abstraction from bits to bit vectors. There is temporal abstraction if you consider that the gates have non-zero delays. There is no behavioral abstraction because values are specified for all possible input values. This is a valid abstraction if all output values of the abstraction are the same as the non-abstracted version.

Abstraction works like a less-than-or-equal (<=) operator. Suppose you have designs A and B. If A is an abstraction of B, we can write A <= B. Suppose also that B is an abstraction of C, B <= C. We know that if A <= B and B <= C, then A <= C. This also holds for abstractions. Since abstraction is the hiding of irrelevant detail, you can think of the less than relation as meaning “less detailed than”.

Suppose you have designs D, E, and F, with D<=E and D<=F. We know D is an abstraction of E and F, but what do we know about the relative abstraction levels between E and F? We cannot consider one as being an abstraction of the other even though they are equivalent in some sense. This type of relationship is important in design where D is a specification and E and F are different implementations of the same specification.

The flip side of this is: suppose we have designs P, Q, and R, with P<=R and Q<=R. In other words, P and Q are different abstractions of R. Again there is nothing we can say about the relative abstraction levels between P and Q. This relationship is important in verification where P and Q are different abstract models of the design R.

One last note: it is generally an intractable problem to prove that an abstraction is valid. If you are familiar with equivalence checking, this is basically a method to prove that an abstraction is valid.

Abstraction: the suppression of irrelevant detail

Abstraction is the single most important tool in designing complex systems. There is simply no way to design a million lines of code, whether it be hardware or software, without using multiple levels of abstraction. But, what exactly is abstraction? Most designers know intuitively that, for example, a high-level programming language ,such as C, is a higher level of abstraction than assembly language. Equivalently, in hardware, RTL is a higher level abstraction than gate-level. However, few designers understand the theoretical basis for abstraction. If we believe that the solution to designing ever more complex systems is higher levels of abstraction, then it is important to understand the basic theory of what makes one description of a design more or less abstract than another.

There are four types of abstraction that are used in building hardware/software systems:

  • structural
  • behavioral
  • data
  • temporal

Structural Abstraction

Structure refers to the concrete objects that make up a system and their composition For example, the concrete objects that make up a chip are gates. If we write at the RTL level of abstraction:

    a = b + c;

this is describing an adder, but the details of all the gates and their connections is suppressed because they are not relevant at this level of description. In software, the concrete objects being hidden are the CPU registers, program counter, stack pointer, etc. For example, in a high-level language, a function call looks like:

    foo(a,b,c);

The equivalent machine-level code will have instructions to push and pop operands and jump to the specified subroutine. The high-level language hides these irrelevant details.

In general, structural abstraction means specifying functions in terms of inputs and outputs only. Structural abstraction is the most fundamental type of abstraction used in design. It is what enables a designer to enter large designs.

Behavioral Abstraction

Abstracting behavior means not specifying what should happen for certain inputs and/or states. Behavioral abstraction can really only be applied to functions that have been structurally abstracted. Structural abstraction means that a function is specified by a table mapping inputs to outputs. Behavioral abstraction means that the table is not completely filled in.

Behavioral abstraction is not used in design, but is extremely useful, in fact, necessary, in verification. Verification engineers instinctively use behavioral abstraction without even realizing it. A verification environment consists of two parts: a generator that generates input stimulus, and a checker, which checks that the output is correct. It is very common for checkers not to be able to check the output for all possible input values. For example, it is common to find code such as:

    if (response_received)
        if (response_data != expected_data)
           print("ERROR");

The checker only specifies the correct behavior if a response is received. It says nothing about the correct behavior if no response is received.

A directed test is an extreme example of behavioral abstraction. Suppose I write the following directed test for an adder:

    a = 2;
    b = 2;
    dut_adder(out,a,b);
    if (out != 4)
       print("ERROR");

The checker is the last two lines, but it only specifies the output for inputs, a=2, b=2, and says nothing about any other input values.

Data Abstraction

Data abstraction is a mapping from a lower-level type to a higher-level type. The most obvious data abstraction, which is common to both hardware and software, is the mapping of an N-bit vector onto the set of integers. Other data abstractions exist. In hardware, a binary digit is an abstraction of the analog values that exist on a signal. In software, a struct is an abstraction of its individual members.

An interesting fact about data abstraction is that the single most important abstraction, from bit vector to integer, is not actually a valid abstraction. When we treat values as integers, we expect that they obey the rules or arithmetic, however, fixed with bit vectors do not, specifically when operations overflow. To avoid this, a bit width is chosen such that no overflow is possible, or special overflow handling is done.

Temporal Abstraction

This last abstraction type really only applies to hardware. Temporal abstraction means ignoring how long it takes to perform a function. A simple example of this is the zero-delay gate model often used in gate-level simulations. RTL also assumes all combinational operations take zero time.

It is also possible to abstract cycles. For example, a pipelined processor requires several cycles to complete an operation. In verification, it is common to create an unpipelined model of the processor that completes all operations in one cycle. At the end of a sequence of operations, the architecturally visible state of the two models should be the same. This is useful because an unpipelined model is usually much simpler to write than a pipelined one.

The four abstractions described above comprise a basis for the majority of abstractions used in design and verification. That is, any abstraction we are likely to encounter is some combination of the above abstractions. However, we are still left with the question of what is a valid abstraction. I will defer answering this until the next post.

(note: this discussion is based on the paper, “Abstraction Mechanisms for Hardware Verification” by Tom Melham. There is slightly more high-level description of these abstractions in this paper along with a lot of boring technical detail that you probably want to skip.)

I am a big fan of Fred Brooks’ “The Mythical Man-Month: Essays on Software Engineering”. Brooks was leader of one of the first large software projects. Along the way, he found that a lot of the conventional wisdom about software engineering was wrong, most famously coming up with the idea that adding manpower to a late project makes it later.

I have also found that a lot of the conventional wisdom about the causes of verification difficulties is wrong. So, in designing this blog, I decided to model it after the Mythical Man-Month. The essence of this style is:

  • short, easy-to-read essays on a single topic.
  • timelessness – focus on overarching issues, not on how to solve specific problems with specific programming languages.
  • back it up with real data whenever possible.

In some senses, this blog will attempt to fill in some obvious holes in the Mythical Man-Month. Brooks states that project effort can be roughly apportioned as:

  • 1/3 planning (specification)
  • 1/6 coding
  • 1/2 verification

but then proceeds to talk mostly about planning and coding and very little about verification. I think this blog will cover each of these areas in rough proportion to these numbers, so most of my posts will be on verification, but a fair number will cover specification and some on particular design issues.

Brooks’ work is more than 30 years old, so it is worth re-examining some of his conclusions to see if they still hold up as design complexity has increased with time. One of the areas of contention is the percentage of time spent in verification. Brooks’ claim of verification taking 50% of the effort applied to large software projects. Today, there are claims that hardware verification is taking 70% of the effort. EDA vendors often point to this “growth” as proof that verification is becoming the bottleneck.

But, is this the real story? Software verification is mostly done by the designers. In this kind of environment, verification consumes roughly 50% of the total effort. 20 years ago, Hardware verification was also roughly 50% of the effort because it was mostly the designers doing the verification. The shift to pre-silicon verifcation that came about due to the advent of HDLs and synthesis enabled the separation of verification and design. But, separate verification is not as efficient as having the designer do the verification. So, now verification is 70% of the effort instead of 50%. So, rather than growing from 50% to 70% of more, it was more of a one time jump due to the shift in methodology. But, whether it is 50% or 70%, verification is the largest single piece of the overall design effort.

I wrote an article addressing this subject in more detail titled, “Leveraging Design Insight for Intelligent Verification Methodologies”. You can download it from the Nusym website