Pipelining in computer architecture
In computers, a pipeline is the continuous and somewhat overlapped movement of instructions to the processor, or of the arithmetic steps the processor takes to perform an instruction. Pipelining is the use of a pipeline. Without a pipeline, a computer processor gets the first instruction from memory, performs the operation it calls for, and then goes to get the next instruction from memory, and so forth. While the instruction is being fetched, the arithmetic part of the processor is idle; it must wait until it gets the next instruction. With pipelining, the computer architecture allows the next instructions to be fetched while the processor is performing arithmetic operations, holding them in a buffer close to the processor until each instruction's operation can be performed. The staging of instruction fetching is continuous. The result is an increase in the number of instructions that can be performed during a given time period.
Pipelining is sometimes compared to a manufacturing assembly line in which different parts of a product are being assembled at the same time although ultimately there may be some parts that have to be assembled before others are. Even if there is some sequential dependency, the overall process can take advantage of those operations that can proceed concurrently.
Computer processor pipelining is sometimes divided into an instruction pipeline and an arithmetic pipeline. The instruction pipeline represents the stages in which an instruction is moved through the processor, including its being fetched, perhaps buffered, and then executed. The arithmetic pipeline represents the parts of an arithmetic operation that can be broken down and overlapped as they are performed.
Pipelines and pipelining also apply to computer memory controllers and moving data through various memory staging places.
Overview of pipelining computer processors
The purpose of this document is to provide an overview of pipelining computer processors. The topic will be covered in general, with a focus on some special topics of interest. One area of focus is some of the early criticisms of pipelining and why, in retrospect, they were wrong. The emergence and evolution of pipelining in the early IBM line of mainframe computers, including the IBM 7030, better known as the "Stretch" computer, will be discussed. This will be contrasted with the state of pipelining in the current generation of microprocessors from Intel and Motorola.
A pipeline is a series of stages, where some work is done at each stage. The work is not finished until it has passed through all stages. Pipelining is an implementation technique in which multiple instructions are overlapped in execution. Today, pipelining is key to making processors fast. A pipeline is like an assembly line: in both, each step completes one piece of the whole job. Workers on a car assembly line perform small tasks, such as installing seat covers. The power of the assembly line comes from the many cars it can turn out per day. On a well-balanced assembly line, a new car exits the line in the time it takes to perform one of the many steps. Note that the assembly line does not reduce the time it takes to complete an individual car; it increases the number of cars being built simultaneously and thus the rate at which the cars are started and completed. There are two types of pipelines: the instruction pipeline, in which the different stages of instruction fetch and execution are handled in a pipeline, and the arithmetic pipeline, in which the different stages of an arithmetic operation are handled along the stages of a pipeline.
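The assembly-line argument can be put into numbers with a minimal sketch. It assumes an idealized pipeline that the text does not specify: every stage takes exactly one clock cycle, the stages are perfectly balanced, and nothing ever stalls. The function names are illustrative only.

```python
def unpipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return n_instructions * n_stages


def pipelined_cycles(n_instructions: int, n_stages: int) -> int:
    """The first instruction needs n_stages cycles to pass through the pipeline;
    after that, one instruction completes per cycle (idealized, no stalls)."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def speedup(n_instructions: int, n_stages: int) -> float:
    """Ratio of unpipelined to pipelined cycle counts."""
    if n_instructions < 1:
        raise ValueError("need at least one instruction")
    return unpipelined_cycles(n_instructions, n_stages) / pipelined_cycles(
        n_instructions, n_stages
    )


if __name__ == "__main__":
    # Hypothetical workload: 100 instructions on a 5-stage pipeline.
    print(unpipelined_cycles(100, 5))   # 500
    print(pipelined_cycles(100, 5))     # 104
    print(round(speedup(100, 5), 2))    # 4.81
```

Under these assumptions, each instruction still takes five cycles from start to finish, but the speedup approaches the number of stages as the instruction count grows. This is the same point the car example makes: the time per car is unchanged, while the rate of completion rises.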
Pipelining is used to obtain improvements in processing time that would be unobtainable with existing non-pipelined technology. The development goal for the IBM 7030 (the Stretch computer) was an overall performance of 100 times that of the 704 computer, the fastest computer in production at that time, whereas circuit improvements would only give a factor-of-10 improvement. This goal could only be met with overlapping instructions, i.e. pipelining.
Similarly, the goal for the IBM 360/91 was an improvement of one to two orders of magnitude over the 7090. Technology advances could only bring about a four-fold improvement.
In a more recent example, the 6502 microprocessor had a throughput similar to that of an 8080 processor running at four times the clock rate. This was due to the pipelined architecture of the 6502 versus the non-pipelined 8080.
There are two disadvantages of pipeline architecture. The first is complexity. The second is the inability to run the pipeline continuously at full speed, i.e. the pipeline stalls. There are many reasons why a pipeline cannot run at full speed. Phenomena called pipeline hazards disrupt the smooth execution of the pipeline, and the resulting delays in the pipeline flow are called bubbles. These pipeline hazards include structural hazards, which arise from hardware conflicts; data hazards, which arise from data dependencies; and control hazards, which come about from branches, jumps, and other changes in control flow.
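How a data hazard turns into bubbles can be illustrated with a small sketch. The model is an assumption made for illustration, not a description of any machine named in this document: instructions issue in order, at most one per cycle, and a register written by an instruction becomes readable a fixed number of cycles (the hypothetical result_latency parameter) after that instruction issues. Structural and control hazards are not modeled.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class Instr(NamedTuple):
    dest: Optional[str]        # register written, or None if nothing is written
    srcs: Tuple[str, ...]      # registers read


def count_bubbles(program: Sequence[Instr], result_latency: int = 3) -> int:
    """Count stall cycles (bubbles) caused by read-after-write data hazards.

    Assumed model: in-order issue, one instruction per cycle at most, and a
    result becomes readable result_latency cycles after its producer issues.
    """
    ready_at = {}   # register name -> earliest cycle at which a reader may issue
    cycle = 0       # cycle in which the next instruction issues if nothing stalls
    bubbles = 0
    for instr in program:
        # The instruction must wait for the slowest of its source registers.
        earliest = max([cycle] + [ready_at.get(reg, 0) for reg in instr.srcs])
        bubbles += earliest - cycle
        if instr.dest is not None:
            ready_at[instr.dest] = earliest + result_latency
        cycle = earliest + 1
    return bubbles


if __name__ == "__main__":
    add = Instr("r1", ("r2", "r3"))     # r1 = r2 + r3
    use = Instr("r4", ("r1", "r5"))     # needs r1 from the previous instruction
    other = Instr("r6", ("r7", "r8"))   # independent of both

    print(count_bubbles([add, use, other]))                    # 2
    print(count_bubbles([add, other, use]))                    # 1
    print(count_bubbles([add, use, other], result_latency=1))  # 0
```

The second call shows that placing an independent instruction between producer and consumer hides one of the two bubbles. The third corresponds to hardware that makes the result available to the very next instruction. Both the detection and the avoidance require extra control logic, which is the complexity cost discussed here.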
These issues can be, and are, dealt with successfully. But detecting and avoiding the hazards leads to a considerable increase in hardware complexity. The control paths controlling the gating between stages can contain more circuit levels than the data paths being controlled. In 1970, this complexity was one reason that led Foster to call pipelining "still controversial".
The one major idea that is still controversial is "instruction look-ahead" [pipelining].
Why then the controversy? First, there is a considerable increase in hardware complexity. The second problem is that when a branch instruction comes along, it is impossible to know in advance of execution which path the program is going to take and, if the machine guesses wrong, all the partially processed instructions in the pipeline are useless and must be replaced. In the second edition of Foster's book, published in 1976, this passage was gone. Apparently, Foster felt that pipelining was no longer controversial.
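The cost of guessing wrong at a branch, the second problem raised above, can be estimated with a rough back-of-the-envelope sketch. The formula assumes an otherwise ideal pipeline that completes one instruction per cycle and loses a fixed number of cycles each time the partially processed instructions must be discarded. All parameter names and numbers are hypothetical and are not taken from Foster or from any machine discussed here.

```python
def effective_cpi(
    branch_fraction: float,
    wrong_guess_rate: float,
    flush_penalty: int,
    base_cpi: float = 1.0,
) -> float:
    """Average cycles per instruction when wrong branch guesses flush the pipeline.

    branch_fraction   fraction of instructions that are branches (assumed)
    wrong_guess_rate  fraction of branches for which the machine guesses wrong (assumed)
    flush_penalty     cycles lost each time the pipeline is refilled (assumed)
    """
    return base_cpi + branch_fraction * wrong_guess_rate * flush_penalty


if __name__ == "__main__":
    # One instruction in five is a branch; the pipeline loses 4 cycles per wrong guess.
    print(effective_cpi(0.20, 0.50, 4))  # guessing no better than a coin flip
    print(effective_cpi(0.20, 0.05, 4))  # guessing right 95% of the time
```

With these illustrative numbers, coin-flip guessing costs about 0.4 extra cycles per instruction, while a 5% wrong-guess rate costs about 0.04. The penalty is real, but it shrinks as the guesses improve.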
Doran also alludes to the nature of the problem. The model of pipelining is "amazingly simple" while the implementation is "very complex" and has many complications.
Because multiple instructions can be in various stages of execution at any given moment, handling an interrupt is one of the more complex tasks. In the IBM 360, this can lead to several instructions executing after the interrupt is signaled, resulting in an imprecise interrupt. An imprecise interrupt can result from an instruction exception, and the precise address of the instruction causing the exception may not be known! This led Myers to criticize pipelining, referring to the imprecise interrupt as an "architectural nuisance". He stated that it was not an advance in computer architecture but an improvement in implementation that could be viewed as a step backward.
In retrospect, most of Myers' book Advances in Computer Architecture dealt with his concepts for improvements in computer architecture that would be termed CISC today. With the benefit of hindsight, we can see that pipelining is here to stay and that most of the new CPUs are in the RISC class. In fact, Myers is one of the co-architects of Intel's series of 32-bit RISC microprocessors. These processors are fully pipelined [MYE88]. I suspect that Myers no longer considers pipelining a step backward.
The difficulty arising from imprecise interrupts should be viewed as a complexity to be overcome, not as an inherent flaw in pipelining. Doran explains how the B7700 carries the address of the instruction through the pipeline, so that any exception that the instruction may raise can be precisely located and does not generate an imprecise interrupt.
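The idea of carrying the instruction address along with the instruction can be sketched in a few lines. This is a toy model, not a description of how the B7700 hardware actually works. It assumes a simple shift-register pipeline in which an instruction only takes effect, and can only raise an exception, in the final stage. The names InFlight, PreciseFault, and run_pipeline are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class InFlight:
    address: int                # address of the instruction, carried through every stage
    op: Callable[[], object]    # work performed when the instruction reaches the last stage


class PreciseFault(Exception):
    """An exception tagged with the address of the instruction that raised it."""

    def __init__(self, address: int, cause: Exception):
        super().__init__(f"fault at instruction address {address:#x}: {cause!r}")
        self.address = address
        self.cause = cause


def run_pipeline(program: List[Tuple[int, Callable[[], object]]], n_stages: int = 3) -> None:
    """Shift (address, op) pairs through n_stages; ops run only in the last stage."""
    stages: List[Optional[InFlight]] = [None] * n_stages
    pending = list(program)
    while pending or any(slot is not None for slot in stages):
        last = stages[-1]
        if last is not None:
            try:
                last.op()
            except Exception as exc:
                # The address travelled with the instruction, so the fault is precise.
                raise PreciseFault(last.address, exc) from exc
        # Advance every instruction by one stage and fetch the next one, if any.
        fetched = InFlight(*pending.pop(0)) if pending else None
        stages = [fetched] + stages[:-1]


if __name__ == "__main__":
    executed = []
    fault_address = None
    program = [
        (0x100, lambda: executed.append(0x100)),
        (0x104, lambda: 1 / 0),                  # raises ZeroDivisionError
        (0x108, lambda: executed.append(0x108)),
    ]
    try:
        run_pipeline(program)
    except PreciseFault as fault:
        fault_address = fault.address
    print(hex(fault_address))   # 0x104
    print(executed)             # [256], i.e. only the instruction at 0x100 took effect
```

In this sketch the instruction at 0x108 is already in the pipeline when the fault occurs, yet it has had no effect, and the fault names 0x104 exactly. Real hardware pays for this with extra storage in every stage and extra control logic, which is the kind of complexity this document describes.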