An asynchronous circuit is a circuit in which the parts are largely autonomous. They are not governed by a clock circuit or global clock signal, but instead need only wait for the signals that indicate completion of instructions and operations. These signals are specified by simple data transfer protocols. This digital logic design is contrasted with a synchronous circuit which operates according to clock timing signals.
Petri Nets are an attractive and powerful model for reasoning about asynchronous circuits. However Petri nets have been criticized by Carl Hewitt for their lack of physical realism (see Petri net#Subsequent models of concurrency). Subsequent to Petri nets other models of concurrency have been developed that can model asynchronous circuits including the Actor model and process calculi.
The term asynchronous logic is used to describe a variety of design styles, which use different assumptions about circuit properties. These vary from the bundled delay model - which uses 'conventional' data processing elements with completion indicated by a locally generated delay model - to delay-insensitive design - where arbitrary delays through circuit elements can be accommodated. The latter style tends to yield circuits which are larger and slower than synchronous (or bundled data) implementations, but which are insensitive to layout and parametric variations and are thus "correct by design."
Different classes of asynchronous circuitry offer different advantages. Below is a list of the advantages offered by Quasi Delay Insensitive Circuits, generally agreed to be the most "pure" form of asynchronous logic that retains computational universality. Less pure forms of asynchronous circuitry offer better performance at the cost of compromising one or more of these advantages:
* Robust handling of metastability of arbiters.
* Early Completion of a circuit when it is known that the inputs which have not yet arrived are irrelevant.
* Possibly lower power consumption because no transistor ever transitions unless it is performing useful computation (clock gating in synchronous designs is an imperfect approximation of this ideal). Also, clock drivers can be removed which can significantly reduce power consumption. However, when using certain encodings, asynchronous circuits may require more area, which can result in increased power consumption if the underlying process has poor leakage properties (for example, deep submicrometer processes used prior to the introduction of high-K dielectrics).
* Freedom from the ever-worsening difficulties of distributing a high-fanout, timing-sensitive clock signal.
* Better modularity and composability.
* Far fewer assumptions about the manufacturing process are required (most assumptions are timing assumptions).
* Circuit speed is adapted on the fly to changing temperature and voltage conditions rather than being locked at the speed mandated by worst-case assumptions.
* Immunity to transistor-to-transistor variability in the manufacturing process, which is one of the most serious problems facing the semiconductor industry as dies shrink.
* Less severe electromagnetic interference. Synchronous circuits create a great deal of EMI in the frequency band at (or very near) their clock frequency and its harmonics; asynchronous circuits generate EMI patterns which are much more evenly spread across the spectrum.
* In asynchronous circuits, local signaling eliminates the need for global synchronization which exploits some potential advantages in comparison with synchronous ones. They have shown potential specifications in low power consumption, design reuse, improved noise immunity and electromagnetic compatibility. Asynchronous circuits are more tolerant to process variations and external voltage fluctuations.
* Increased Complexity
* More Difficult to Design
* the performance analysis of asynchronous circuits is a complicated problem
Asynchronous CPUs are one of several ideas for radically changing CPU design.
Unlike a conventional processor, a clockless processor (asynchronous CPU) has no central clock to coordinate the progress of data through the pipeline. Instead, stages of the CPU are coordinated using logic devices called "pipeline controls" or "FIFO sequencers." Basically, the pipeline controller clocks the next stage of logic when the existing stage is complete. In this way, a central clock is unnecessary. It may actually be even easier to implement high performance devices in asynchronous, as opposed to clocked, logic:
* components can run at different speeds on an asynchronous CPU; all major components of a clocked CPU must remain synchronized with the central clock;
* a traditional CPU cannot "go faster" than the expected worst-case performance of the slowest stage/instruction/component. When an asynchronous CPU completes an operation more quickly than anticipated, the next stage can immediately begin processing the results, rather than waiting for synchronization with a central clock. An operation might finish faster than normal because of attributes of the data being processed (e.g., multiplication can be very fast when multiplying by 0 or 1, even when running code produced by a naive compiler), or because of the presence of a higher voltage or bus speed setting, or a lower ambient temperature, than 'normal' or expected.
Asynchronous logic proponents believe these capabilities would have these benefits:
* lower power dissipation for a given performance level, and
* highest possible execution speeds.
The biggest disadvantage of the clockless CPU is that most CPU design tools assume a clocked CPU (i.e., a synchronous circuit). Many tools "enforce synchronous design practices". Making a clockless CPU (designing an asynchronous circuit) involves modifying the design tools to handle clockless logic and doing extra testing to ensure the design avoids metastable problems. The group that designed the aforementioned AMULET, for example, developed a tool called LARD to cope with the complex design of AMULET3.
Despite the difficulty of doing so, numerous asynchronous CPUs have been built, including:
* the ORDVAC (?) and the (identical) ILLIAC I (1951),
* the ILLIAC II (1962);
* The Caltech Asynchronous Microprocessor, the world-first asynchronous microprocessor (1988);
* the ARM-implementing AMULET (1993 and 2000);
* the asynchronous implementation of MIPS R3000, dubbed MiniMIPS (1998);
* the SEAforth multi-core processor (2008) from Charles H. Moore.
The ILLIAC II was the first completely asynchronous, speed independent processor design ever built; it was the most powerful computing machine known to man at the time.
DEC PDP-16 Register Transfer Modules (ca. 1973) allowed the experimenter to construct asynchronous, 16-bit processing elements. Delays for each module were fixed and based on the module's worst-case timing.
The Caltech Asynchronous Microprocessor (1988) was the first asynchronous microprocessor (1988). Caltech designed and manufactured the world's first fully Quasi Delay Insensitive processor. During demonstrations, the researchers amazed viewers by loading a simple program which ran in a tight loop, pulsing one of the output lines after each instruction. This output line was connected to an oscilloscope. When a cup of hot coffee was placed on the chip, the pulse rate (the effective "clock rate") naturally slowed down to adapt to the worsening performance of the heated transistors. When liquid nitrogen was poured on the chip, the instruction rate shot up with no additional intervention. Additionally, at lower temperatures, the voltage supplied to the chip could be safely increased, which also improved the instruction rate -- again, with no additional configuration.